nav_a3c can't output the right rewards without D1 and D2?

zeus7777777 / nav_a3c


Issue #7

Closed fanyuzeng closed 4 years ago

fanyuzeng commented 5 years ago

I modified your code and deleted D1 and D2, which gives the third architecture in Figure 2 of the original paper. However, the rewards that nav_a3c produces without D1 and D2 are very low, and the reward curve is abnormal. In Figure 3 of "Learning to Navigate in Complex Environments", though, the Nav A3C variant without D1 and D2 shows a proper reward curve.
How can I solve this issue? Best regards.

zeus7777777 commented 5 years ago

Maybe you can try searching over the hyperparameters in config.py, as the original paper did. In my experience, these settings can affect the reward curve.
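A minimal sketch of what such a search could look like, assuming a simple random search with log-uniform sampling. The parameter names (`learning_rate`, `entropy_beta`) and the ranges are hypothetical placeholders, not the actual fields of config.py or the values used in the paper; substitute whatever config.py really exposes.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_configs(n, seed=0):
    """Return n candidate hyperparameter settings for separate training runs.

    The keys and ranges below are illustrative assumptions only.
    """
    rng = random.Random(seed)  # fixed seed so the candidates are reproducible
    configs = []
    for _ in range(n):
        configs.append({
            "learning_rate": sample_log_uniform(rng, 1e-4, 1e-3),  # assumed range
            "entropy_beta": sample_log_uniform(rng, 1e-4, 1e-2),   # assumed range
        })
    return configs


if __name__ == "__main__":
    # Print a few candidates; each one would be copied into config.py
    # (or passed to the training script) for its own run.
    for cfg in sample_configs(3):
        print(cfg)
```

Each sampled setting would be trained separately and the reward curves compared, since a single setting may not be representative of what the architecture can reach.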

fanyuzeng commented 5 years ago

OK, I'll try it now.