siemanko / tensorflow-deepq

A deep Q learning demonstration using Google Tensorflow
MIT License

Links to theory for continuous branch #20

Closed meppe closed 8 years ago

meppe commented 8 years ago

Hi,

A non-technical question; I hope it's OK to ask here on GitHub...

I am working on continuous robot control problems and was wondering which approach you are following for the continuous branch. I guess it is the Asynchronous Advantage Actor-Critic (A3C) approach from the 2016 Mnih paper here. That method, however, is not Q-learning but a variant of the policy gradient method. Many variables in your controller code suggest that deep Q-learning is applied, so I am a bit confused. Could you confirm that the code tries to reproduce the A3C method from that paper?

siemanko commented 8 years ago

Hi Meppe,

No, this code is NOT A3C. It implements the method from http://arxiv.org/abs/1509.02971

I am also working on A3C, but as of today it is work in progress.
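
The linked paper (arXiv:1509.02971) is "Continuous control with deep reinforcement learning" by Lillicrap et al., usually called DDPG. It is an actor-critic method whose critic is trained with a Q-learning-style bootstrapped target, which likely explains the DeepQ-style variable names noticed in the controller code. Below is a minimal sketch of that critic target and the soft target-network update. It assumes linear stand-ins for the networks. All names (`actor`, `critic`, `td_target`, `soft_update`) and the default values are hypothetical and are not taken from this repository.

```python
# Illustrative sketch only: linear functions stand in for the actor and
# critic networks. None of these names come from tensorflow-deepq.

def actor(theta, state):
    """Deterministic policy mu(s): maps a state to one continuous action."""
    return sum(t * s for t, s in zip(theta, state))


def critic(w, state, action):
    """Action-value estimate Q(s, a), linear in the state and the action."""
    return sum(wi * si for wi, si in zip(w[:-1], state)) + w[-1] * action


def td_target(reward, next_state, done, target_theta, target_w, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')).

    Unlike discrete Q-learning there is no max over actions: the target
    actor supplies the next action directly.
    """
    if done:
        return reward
    next_action = actor(target_theta, next_state)
    return reward + gamma * critic(target_w, next_state, next_action)


def soft_update(target, online, tau=0.001):
    """Move the target parameters a small step toward the online parameters."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


# Usage with made-up numbers.
theta = [0.5, -0.25]      # actor weights (hypothetical)
w = [1.0, 2.0, 0.5]       # critic weights: two for the state, one for the action
y = td_target(1.0, [1.0, 2.0], False, theta, w, gamma=0.9)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

The contrast with A3C is that the policy here is deterministic and learning is off-policy from a replay buffer, whereas A3C trains a stochastic policy on-policy across parallel workers.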