Open YuanBoXie opened 2 years ago
Sorry, does the function policy_head correspond to the actor, and the function value_head to the critic?
@andompesta If I want to add a penalty to the critic network, how should I change the code?
The entire network structure is defined in the model file. I'm not sure what you mean by "add penalty to the critic network"; I guess you want to modify the loss of the critic network.
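If modifying the critic loss is indeed what is wanted, a rough sketch of the idea might look like the following. It is not taken from this repository: the function names, the coefficients, and the choice of an L2 penalty on the value predictions are all assumptions made for illustration, and plain Python floats stand in for tensors. It only shows where an extra term could enter the usual PPO2-style combined loss (policy loss plus weighted value loss minus weighted entropy).

```python
def value_loss(values, returns):
    """Mean squared error between critic predictions and target returns."""
    return sum((v - r) ** 2 for v, r in zip(values, returns)) / len(values)


def l2_penalty(values):
    """Hypothetical penalty: mean squared magnitude of the critic's predictions."""
    return sum(v * v for v in values) / len(values)


def total_loss(policy_loss, values, returns, entropy,
               vf_coef=0.5, ent_coef=0.01, penalty_coef=0.1):
    """Combine actor and critic terms; all coefficient values are assumed."""
    # The penalty is added to the critic term only, so it is scaled by
    # vf_coef together with the ordinary value loss.
    critic = value_loss(values, returns) + penalty_coef * l2_penalty(values)
    # Entropy is subtracted because a higher entropy bonus lowers the loss.
    return policy_loss + vf_coef * critic - ent_coef * entropy


loss = total_loss(policy_loss=0.3, values=[1.0, 2.0],
                  returns=[1.0, 4.0], entropy=1.0)
```

In the actual code the same change would be made wherever the value loss is computed, with the penalty replaced by whatever term is really intended.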
As the title says: the PPO and PPO2 algorithms both have an actor-critic structure, but I can't find that in this code. Does it really implement the PPO2 algorithm?