ikostrikov / walk_in_the_park

MIT License
241 stars 35 forks source link

[Question] Actor updates Q function mean vs min #7

Closed JakobThumm closed 1 year ago

JakobThumm commented 1 year ago

Hi, in https://github.com/ikostrikov/walk_in_the_park/blob/40321ecb3561f7be98d73a2c12337a878b0c18e1/rl/agents/sac/sac_learner.py#L127 you update the actor based on the mean over all Q functions. In the SB3 implementation of SAC, the minimum over all Q functions is used https://github.com/DLR-RM/stable-baselines3/blob/5ef10c8e69b52e1376e6c2c636737d6dd528dda1/stable_baselines3/sac/sac.py#L265 Was this a design decision or are both methods viable? Thanks, Jakob

ikostrikov commented 1 year ago

Hi Jakob,

It depends on a specific task and setup. In some cases, the mean is better while in the others the min is better.