udacity / deep-reinforcement-learning

Repo for the Deep Reinforcement Learning Nanodegree program
https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893
MIT License

About Crawler for Continuous Control #16

Closed ZeratuuLL closed 5 years ago

ZeratuuLL commented 5 years ago

I am trying to solve the Crawler environment in the Continuous Control task. From the Unity webpage I learned that there are two versions of the environment: one with a static target and one with a dynamic target. Which one is provided through the download links? Thank you!

chihoxtra commented 5 years ago

I would also like to know the answer, please.

ZeratuuLL commented 5 years ago

After some testing, it appears to be the fixed-goal version.

chihoxtra commented 5 years ago

I think so too. I was training my humble agent and its score is improving, and the visualization shows that it is always facing the same direction. I guess it is a fixed goal. Thanks man!

sam


ZeratuuLL commented 5 years ago

Np! What kind of results are you getting? The dimension info differs from what the Unity page lists, so I am not sure the benchmarks there are still reliable. I can reach an average of around 1800 but cannot get any further.

Lifeng Wei.


chihoxtra commented 5 years ago

I envy you! I have just started and am trying to solve the problem with PPO. So far I can only reach a score of around 100 (the average reward over the past 100 episodes across agents, where an episode ends once any agent reaches ‘done’). May I ask how many layers your network uses?
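
For anyone comparing numbers, the score described above could be tracked roughly like this. It is only a sketch of one reading of that description: `ScoreTracker` and its method names are made up for illustration, and it assumes the environment hands back one reward and one done flag per agent at every step.

```python
from collections import deque
from statistics import mean


class ScoreTracker:
    """Rolling average of episode scores, averaged across parallel agents.

    Hypothetical helper for illustration; not part of the Udacity repo.
    """

    def __init__(self, num_agents, window=100):
        # Reward accumulated by each agent in the current episode.
        self.running = [0.0] * num_agents
        # Most recent episode scores; older ones drop off automatically.
        self.history = deque(maxlen=window)

    def step(self, rewards, dones):
        """Add one step of rewards; close the episode once any agent is done."""
        for i, reward in enumerate(rewards):
            self.running[i] += reward
        if any(dones):
            # Episode score = mean accumulated reward across all agents.
            self.history.append(mean(self.running))
            self.running = [0.0] * len(self.running)

    def average(self):
        """Mean of the stored episode scores, or 0.0 before any episode ends."""
        return mean(self.history) if self.history else 0.0
```

In a training loop you would call `tracker.step(rewards, dones)` after each environment step and read `tracker.average()` whenever you log progress. Whether this matches how the 1800 figure was computed is not stated in the thread, so the two numbers may not be directly comparable.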


ZeratuuLL commented 5 years ago

You can check my repo here. https://github.com/ZeratuuLL/Reinforcement-Learning/tree/master/Continuous%20Control/Crawler

It's hard to describe the network structure directly.