Question about number of episodes and fix one mistake in code!

mit-acl / rl_collision_avoidance

Training code for GA3C-CADRL algorithm (collision avoidance with deep RL)

Question about number of episodes and fix one mistake in code! #20

Closed: kmzdaniel closed this issue 1 year ago

kmzdaniel commented 1 year ago

Hello, Mr. Everett. In line 356, the negative sign (after "dist_btwn_nearest_agent[i]/2.") should be changed to a positive. My question is: should I use more episodes if I train GA3C in a "single agent" way? (For example, for phase one, over 1,500,000 episodes with up to four agents.)

mfe7 commented 1 year ago

Thanks for pointing this out. In terms of the number of episodes, the best advice I can give is to experiment with different numbers and see at what point the reward curve seems to converge to a reasonable value. I don't know of a more principled way to specify the number of episodes a priori.
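
As a rough illustration of that advice (a sketch only, not code from this repository): one way to check where a reward curve levels off is to smooth the per-episode rewards with a moving average and look for the first point where the smoothed value stops changing much. The function names, the `window` size, and the `tol` threshold below are all hypothetical choices and would need tuning for a real training run.

```python
import numpy as np


def moving_average(rewards, window):
    """Mean reward over a sliding window of `window` consecutive episodes."""
    kernel = np.ones(window) / window
    return np.convolve(rewards, kernel, mode="valid")


def first_plateau_episode(rewards, window=100, tol=0.01):
    """Return the episode index at which the smoothed reward first changes by
    less than `tol` between two consecutive windows, or None if it never does.

    `window` and `tol` are illustrative defaults, not recommended values.
    """
    rewards = np.asarray(rewards, dtype=float)
    if len(rewards) < 2 * window:
        return None  # not enough episodes to compare two full windows

    smoothed = moving_average(rewards, window)
    # Compare each window's mean with the mean of the window right after it.
    change = np.abs(smoothed[window:] - smoothed[:-window])
    flat = np.nonzero(change < tol)[0]
    if flat.size == 0:
        return None
    # Index of the last episode in the later of the two compared windows.
    return int(flat[0]) + 2 * window - 1


if __name__ == "__main__":
    # Synthetic reward curve that saturates over time (made-up data).
    episodes = np.arange(5000)
    synthetic_rewards = 1.0 - np.exp(-episodes / 200.0)
    print(first_plateau_episode(synthetic_rewards))
```

A single flat stretch can be misleading on noisy curves, so it is probably still worth plotting the reward and comparing a few runs with different episode budgets before settling on a number.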

kmzdaniel commented 1 year ago

Many thanks for your guidance.