In the discrete-SAC paper, the temperature loss in Eq. (11) is written as a direct expectation over the action probabilities rather than as a Monte Carlo estimate, following the same logic as Eq. (10). The implementation, however, still calls calculate_entropy_tuning_loss in SAC.py, which uses .mean().
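To make the difference concrete, here is a minimal NumPy sketch of the two ways the temperature loss could be computed. It is not the code in SAC.py: the function names, argument names, and array shapes are hypothetical, and it assumes the loss term is -alpha * (log pi(a|s) + target_entropy), with alpha passed as a plain number (implementations often optimise log alpha instead).

```python
import numpy as np

def temperature_loss_expectation(alpha, action_probs, log_action_probs, target_entropy):
    """Direct expectation: weight each action's term by pi(a|s).

    action_probs and log_action_probs are assumed to have shape
    (batch_size, num_actions).
    """
    # Per-action term: -alpha * (log pi(a|s) + target entropy)
    per_action = -alpha * (log_action_probs + target_entropy)
    # Exact expectation over actions for each state, then average over the batch
    per_state = np.sum(action_probs * per_action, axis=1)
    return float(np.mean(per_state))

def temperature_loss_sampled(alpha, sampled_log_probs, target_entropy):
    """Monte Carlo estimate: one sampled action per state, averaged over the batch.

    sampled_log_probs is assumed to have shape (batch_size,).
    """
    return float(np.mean(-alpha * (sampled_log_probs + target_entropy)))

# Hypothetical batch of two states with three discrete actions each
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
log_probs = np.log(probs)

# Uses every action's probability, so there is no sampling noise
exact = temperature_loss_expectation(1.0, probs, log_probs, target_entropy=0.5)

# Uses only the log-probability of one sampled action per state
sampled = temperature_loss_sampled(1.0, log_probs[[0, 1], [1, 2]], target_entropy=0.5)
```

The two versions share the same expected value, but the sampled one changes with whichever actions happen to be drawn, whereas the expectation form is deterministic given the policy output.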