p-christ / Deep-Reinforcement-Learning-Algorithms-with-PyTorch

PyTorch implementations of deep reinforcement learning algorithms and environments
MIT License
5.65k stars, 1.2k forks

SAC Discrete needs its own `calculate_entropy_tuning_losses` function? #75

Closed (Harimus closed this issue 3 years ago)

Harimus commented 3 years ago

While checking the SAC_Discrete code, I noticed that it does not define its own `calculate_entropy_tuning_losses` function; it inherits the one from SAC.

But according to the SAC_Discrete paper (equation 11, versus equation 9 for continuous SAC), in the discrete case the expectation is instead computed by weighting `-alpha * (log_pi + target_entropy)` by the agent's probability of each action (`pi`), not by sampling a single `log_pi`.
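
For reference, the two objectives being compared can be written roughly as below. This is a sketch in assumed notation ($\bar{H}$ for the target entropy, $\pi_t(s_t)$ for the vector of action probabilities in state $s_t$), so it is worth checking against the papers before relying on it.

$$J(\alpha) = \mathbb{E}_{a_t \sim \pi_t}\left[-\alpha\left(\log \pi_t(a_t \mid s_t) + \bar{H}\right)\right] \quad \text{(continuous, eq. 9)}$$

$$J(\alpha) = \pi_t(s_t)^{\top}\left[-\alpha\left(\log \pi_t(s_t) + \bar{H}\right)\right] \quad \text{(discrete, eq. 11)}$$

The second form is the same expectation written out as an explicit sum over the finite action set, so no action needs to be sampled to evaluate it.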

Shouldn't SAC_Discrete have its own entropy loss function, then?
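
To make the difference concrete, here is a minimal pure-Python sketch of the two ways of estimating the temperature loss for a single state. It is not the repository's code: the function names `sampled_alpha_loss` and `weighted_alpha_loss`, the example probabilities, and the values of `alpha` and `target_entropy` are all hypothetical, chosen only for illustration.

```python
import math
import random


def sampled_alpha_loss(alpha, probs, target_entropy, rng):
    """Continuous-style estimate: sample one action, use only its log-probability."""
    # Draw a single action index according to the policy's probabilities.
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    log_pi = math.log(probs[action])
    return -alpha * (log_pi + target_entropy)


def weighted_alpha_loss(alpha, probs, target_entropy):
    """Discrete-style estimate: weight every action's term by its probability."""
    # Actions with zero probability contribute nothing, so skip them to avoid log(0).
    return sum(
        p * (-alpha * (math.log(p) + target_entropy))
        for p in probs
        if p > 0.0
    )


# Hypothetical policy over three actions and illustrative hyperparameters.
probs = [0.7, 0.2, 0.1]
alpha = 0.2
target_entropy = 0.5

exact = weighted_alpha_loss(alpha, probs, target_entropy)

# Averaging many single-sample estimates should approach the weighted value.
rng = random.Random(0)
n_draws = 20000
sample_mean = sum(
    sampled_alpha_loss(alpha, probs, target_entropy, rng) for _ in range(n_draws)
) / n_draws

print("probability-weighted loss:", exact)
print("mean of sampled losses:   ", sample_mean)
```

Under these assumptions both functions target the same expectation; the sampled version is a noisier single-draw estimate of it, while the weighted version computes it exactly for the given state.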