VinF / deer

DEEp Reinforcement learning framework

DDPG implementation #65

Open dynamik1703 opened 6 years ago

dynamik1703 commented 6 years ago

Hey VinF,

thanks for your work!

I have questions about the DDPG implementation in deer.

Patrick Emami recommends in http://pemami4911.github.io/blog/2016/08/21/ddpg-rl.html implementing the actor and the critic as two functions in two separate classes.

Additionally, he only adds the action tensor at the 2nd hidden layer of the critic network.
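For illustration, a minimal Keras-style sketch of that pattern could look like this (purely hypothetical code, not taken from deer or from the blog post):

```python
from keras.layers import Input, Dense, Concatenate
from keras.models import Model

def build_critic(state_dim, action_dim):
    state = Input(shape=(state_dim,))
    action = Input(shape=(action_dim,))
    h1 = Dense(400, activation='relu')(state)                        # first layer sees only the state
    h2 = Dense(300, activation='relu')(Concatenate()([h1, action]))  # action tensor enters here
    q = Dense(1, activation='linear')(h2)                            # scalar Q(s, a)
    return Model(inputs=[state, action], outputs=q)
```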

Is my assumption correct that the DDPG implementation in deer is different?

Kind regards, Roman

VinF commented 6 years ago

Hi Roman,

In deer, the actor and the critic are two different objects (two different neural networks): the actor is instantiated here and the critic here. But you are right, they do come from the same class: the implementation uses the class arguments to make the two objects different (e.g., whether or not the action is provided as an input). Of course, it would be possible to build two different classes instead.

The place where the actions are fed into the NN depends on the architecture you choose for your neural network, so you can indeed decide to provide them deeper in the network. It's an architectural design choice, but it does not really change the logic of the algorithm.
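As a hedged sketch of that single-class pattern (illustrative names only, not the actual deer code), it could look roughly like this:

```python
from keras.layers import Input, Dense, Concatenate
from keras.models import Model

class SimpleNN:
    def __init__(self, state_dim, action_dim, action_as_input):
        state = Input(shape=(state_dim,))
        inputs = [state]
        h = Dense(64, activation='relu')(state)
        if action_as_input:
            # critic-like network: the action is an extra input
            action = Input(shape=(action_dim,))
            inputs.append(action)
            h = Concatenate()([h, action])
        h = Dense(64, activation='relu')(h)
        if action_as_input:
            out = Dense(1, activation='linear')(h)          # Q(s, a)
        else:
            out = Dense(action_dim, activation='tanh')(h)   # policy mu(s)
        self.model = Model(inputs=inputs, outputs=out)

# The same class gives two different objects:
# actor  = SimpleNN(state_dim, action_dim, action_as_input=False)
# critic = SimpleNN(state_dim, action_dim, action_as_input=True)
```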

Best, Vincent

dynamik1703 commented 6 years ago

That's true.

In the DDPG paper the authors furthermore used different learning rates for the actor and the critic, and the final layer of the actor is a 'tanh'.
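A small, purely illustrative sketch of those two details (using the hyperparameters from the DDPG paper, nothing specific to deer):

```python
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

def build_actor(state_dim, action_dim):
    state = Input(shape=(state_dim,))
    h = Dense(400, activation='relu')(state)
    h = Dense(300, activation='relu')(h)
    # tanh keeps the output in [-1, 1]; it is then scaled to the real action range
    mu = Dense(action_dim, activation='tanh')(h)
    return Model(inputs=state, outputs=mu)

# Learning rates from the DDPG paper: 1e-4 for the actor, 1e-3 for the critic
actor_optimizer = Adam(lr=1e-4)
critic_optimizer = Adam(lr=1e-3)
```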

Do you think it is worth adding this to deer?

VinF commented 6 years ago

Yes, that may well be easier for the end user. So we would have two classes: an NN_actor and an NN_critic. For the Q-network, we would then use the NN_critic class. If you want to suggest a PR, we can look into it.
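As a rough, hypothetical skeleton of that split (neither class exists in deer yet; the names just follow the comment above):

```python
class NN_critic(object):
    def __init__(self, state_dim, action_dim):
        # builds a network mapping (state, action) -> Q(s, a)
        pass

class NN_actor(object):
    def __init__(self, state_dim, action_dim):
        # builds a network mapping state -> action, with a tanh final layer
        # and its own (smaller) learning rate
        pass

# DDPG would instantiate both; a plain Q-network would only use NN_critic.
```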

dynamik1703 commented 6 years ago

I will work out a proposal.

Best, Roman