timoklein / redo

ReDo: The Dormant Neuron Phenomenon in Deep Reinforcement Learning (PyTorch)

SAC and DrQ #9

Closed · zichunxx closed this issue 1 month ago

zichunxx commented 1 month ago

Hi @timoklein! Thanks for generously sharing your work!

For PyTorch users like me, I think your repo is more readable than the official implementation.

I'm trying to implement ReDo with SAC and DrQ. Should ReDo be applied to the actor, critic, and encoder networks at the same time?

Also, it would be great if you could consider adding SAC and DrQ training scripts.

Thanks!

timoklein commented 1 month ago

Hi Zichun,

thanks for the kind words.

> Also, it would be great if you could consider adding SAC and DrQ training scripts.

Sadly, I don't think I'll have the time to do that; I'm pretty busy right now.

> I'm trying to implement ReDo with SAC and DrQ. Should ReDo be applied to the actor, critic, and encoder networks at the same time?

I would assume that ReDo should be applied to the actor, critic, and encoder at the same time. There is some research suggesting that applying it to the critic is the most important. Since the linear layers in the network head have by far the most parameters, that is also where dormant neurons are likely to be concentrated (not in the encoder).
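For concreteness, here's a minimal, self-contained sketch of the recycling step for a plain Linear/ReLU stack. It is not this repo's implementation: the `redo_reset` helper name, the default threshold, and the toy shapes are all illustrative, and a faithful version would score every layer before resetting any of them.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def redo_reset(mlp: nn.Sequential, batch: torch.Tensor, tau: float = 0.1) -> None:
    """Recycle dormant neurons in a Linear/ReLU stack (a sketch of ReDo).

    A neuron counts as dormant when its mean absolute post-ReLU activation,
    normalized by the layer-wide mean, falls below tau. Dormant neurons get
    freshly initialized incoming weights and zeroed outgoing weights.
    Assumes the stack ends in a Linear output layer.
    """
    layers = list(mlp)
    x = batch
    for idx, module in enumerate(layers):
        x = module(x)
        if not isinstance(module, nn.ReLU):
            continue
        # Score the neurons of the Linear layer feeding this ReLU.
        linear = layers[idx - 1]
        scores = x.abs().mean(dim=0)              # per-neuron activity
        scores = scores / (scores.mean() + 1e-8)  # normalize by layer mean
        dormant = scores < tau                    # tau is illustrative here
        if not dormant.any():
            continue
        # Fresh incoming weights and bias for the dormant neurons only.
        fresh = nn.Linear(linear.in_features, linear.out_features)
        linear.weight[dormant] = fresh.weight[dormant]
        linear.bias[dormant] = fresh.bias[dormant]
        # Zero the outgoing weights so the reset doesn't disturb the output.
        next_linear = next(m for m in layers[idx + 1:] if isinstance(m, nn.Linear))
        next_linear.weight[:, dormant] = 0.0


# Toy usage: reset the critic head every N updates; the actor and encoder
# heads would be treated the same way.
critic_head = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)
features = torch.randn(512, 64)  # e.g. encoder outputs for a replay batch
redo_reset(critic_head, features, tau=0.1)
```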

If you want to share an encoder for SAC, make sure to read Appendix B.2 of the DrQ paper or the SAC+AE paper: naively sharing an encoder between the SAC actor and critic doesn't work.
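In case it's useful, here's a minimal sketch of that recipe: the actor detaches the shared encoder's features, so only the critic loss (plus the reconstruction loss in SAC+AE) trains the encoder. The class layout, names, and sizes are illustrative, not taken from any particular codebase.

```python
import torch
import torch.nn as nn


class Actor(nn.Module):
    """SAC actor that reuses the critic's encoder without updating it.

    Following the SAC+AE / DrQ recipe: the shared encoder is trained only
    through the critic loss, so the actor detaches the encoder features.
    """

    def __init__(self, encoder: nn.Module, feature_dim: int, action_dim: int):
        super().__init__()
        self.encoder = encoder  # the very same module the critic uses
        self.policy = nn.Sequential(
            nn.Linear(feature_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * action_dim),  # mean and log-std per action dim
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # detach() stops actor gradients from flowing into the shared
        # encoder; only the critic's loss updates the encoder weights.
        features = self.encoder(obs).detach()
        return self.policy(features)


# The actor optimizer should also exclude the encoder's parameters:
# actor_opt = torch.optim.Adam(actor.policy.parameters(), lr=3e-4)
```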

zichunxx commented 1 month ago

Thanks for your kind reply. I'll apply ReDo to the critic network first and see how it affects training.

timoklein commented 1 month ago

Great, good luck! I will close this issue for now. If you have more questions, feel free to re-open it :)