andompesta / ppo2

PyTorch implementation of PPO2

Why can't I find the actor-critic structure in this code? #1

Open YuanBoXie opened 2 years ago

YuanBoXie commented 2 years ago

As the title says: PPO and PPO2 both have an actor-critic structure, but I can't find one in this code. Does it really implement the PPO2 algorithm?

YuanBoXie commented 2 years ago

Sorry, does the function policy_head correspond to the actor and the function value_head to the critic?
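For context, a minimal sketch of how a shared-trunk actor-critic network is often laid out in PyTorch, with one head producing action logits (the actor) and one producing a state value (the critic). This is not the repository's model file: the class name `ActorCritic`, the layer sizes, and the discrete action space are assumptions made for illustration, and only the names policy_head and value_head are taken from the question above.

```python
import torch
import torch.nn as nn


class ActorCritic(nn.Module):
    """Hypothetical shared-trunk actor-critic (not the repository's model)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # Feature extractor shared by both heads (assumed layout).
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        # Actor: maps features to unnormalized action logits.
        self.policy_head = nn.Linear(hidden, n_actions)
        # Critic: maps features to a scalar state-value estimate.
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, obs: torch.Tensor):
        features = self.trunk(obs)
        logits = self.policy_head(features)
        value = self.value_head(features).squeeze(-1)
        return logits, value


# Usage: a batch of 3 observations with 4 features each, 2 discrete actions.
model = ActorCritic(obs_dim=4, n_actions=2)
obs = torch.zeros(3, 4)
logits, value = model(obs)
dist = torch.distributions.Categorical(logits=logits)
action = dist.sample()
```

In a layout like this there is no separate "actor" or "critic" class; the two roles are just the two outputs of one network, which may be why the structure is easy to miss.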

YuanBoXie commented 2 years ago

@andompesta If I want to add a penalty to the critic network, how should I change the code?

andompesta commented 2 years ago

The entire network structure is defined in the model file. I'm not sure what you mean by "add penalty to the critic network"; I guess you want to modify the loss of the critic network.
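If "penalty" does mean an extra term in the critic loss, a rough sketch could look like the following. The function name `critic_loss_with_penalty`, the choice of an L2 penalty on the predicted values, and the `penalty_coef` parameter are all hypothetical; the repository's actual loss code may be organized differently.

```python
import torch


def critic_loss_with_penalty(values, returns, penalty_coef=0.01):
    """Squared-error value loss plus a hypothetical L2 penalty on the values."""
    # Standard regression loss between predicted values and empirical returns.
    value_loss = 0.5 * (returns - values).pow(2).mean()
    # Assumed penalty term: discourages large value predictions.
    penalty = penalty_coef * values.pow(2).mean()
    return value_loss + penalty


# Usage with toy numbers; in training, `values` would come from the value head.
values = torch.tensor([1.0, 2.0], requires_grad=True)
returns = torch.tensor([1.0, 4.0])
loss = critic_loss_with_penalty(values, returns, penalty_coef=0.1)
loss.backward()  # gradients flow through both the value loss and the penalty
```

PPO implementations commonly combine the policy loss, a weighted value loss, and an entropy bonus into one objective, so a term like this would typically be added wherever the value loss is computed before that sum.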