haosulab / ManiSkill2-Learn

Apache License 2.0
80 stars 16 forks source link

Visual backbone is shared between critic and target critic #29

Closed ErikKrauter closed 3 months ago

ErikKrauter commented 8 months ago

Hello,

I noticed that in the implementation of SAC, when using shared_backbone=True the visual_nn of the actor is used in the critic_cfg. I understand the intention of sharing the visual backbone to learn shared representations and maybe increase sample efficiency by reducing the total number of learned parameters. However, the critic_cfg is then used in this line to build the target_critic. As in python dictionaries are passed by reference, any changes made inside the function to the dictionary will carry on outside the function scope. This means that the target_critic now also shares its backbone with the actor and the critic. This seems odd to me, as I thought the target_critic is meant to be 'decoupled' from the critic. Is it intentional that the target_critic shares the visual backbone with the actor and the critic?

Best regards, Erik

xuanlinli17 commented 8 months ago

Hi Erik,

Thanks for your detailed checking of our code! The target critic is intentionally designed to share the backbone; otherwise the training is unstable.