duckietown / gym-duckietown

Self-driving car simulator for the Duckietown universe
http://duckietown.org

Frame Skipping #56

Closed bhairavmehta95 closed 6 years ago

bhairavmehta95 commented 6 years ago

A lot of simulators have implemented frame skipping, because a single frame is often not enough time for an action's effect to be noticeable. It's a key parameter in getting RL algorithms to work:

RL with very frequent actions. RL algorithms are very sensitive to the frequency of taking actions which is why frame skip technique is usually used on Atari (Mnih et al., 2015). In continuous control domains, the performance goes to zero as the frequency of taking actions goes to infinity, which is caused by two factors: inconsistent exploration and the necessity to bootstrap more times to propagate information about returns backward in time. How to design a sample-efficient RL algorithm which can retain its performance even when the frequency of taking actions goes to infinity? The problem of exploration can be addressed by using parameters noise for exploration (Plappert et al., 2017) and faster information propagation could be achieved by employing multi-step returns. Other approach could be an adaptive and learnable frame skip.

I think we should implement frame skipping, then run a vanilla continuous-control algorithm (maybe DDPG, from what @nithin127 is working on?) to find a "best" frame-skip value that we can ship as the default with the simulator. We should also expose it as a parameter, so people can test what works best for their method.
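For reference, a minimal sketch of what I mean by frame skipping: the same action is repeated for several simulator frames and the rewards are summed. This is not the simulator's actual code. `FrameSkip`, `CountingEnv`, and the `frame_skip` parameter name are made up for illustration, and the sketch assumes an environment whose `step` returns `(obs, reward, done, info)`.

```python
class FrameSkip:
    """Hypothetical wrapper: hold each action for `frame_skip` frames."""

    def __init__(self, env, frame_skip=3):
        if frame_skip < 1:
            raise ValueError("frame_skip must be >= 1")
        self.env = env
        self.frame_skip = frame_skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.frame_skip):
            # Assumed step signature: (obs, reward, done, info).
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                # Stop early so we never step a finished episode.
                break
        # Return the last observation and the reward summed over the frames.
        return obs, total_reward, done, info


class CountingEnv:
    """Toy stand-in for the simulator: reward 1.0 per frame, fixed length."""

    def __init__(self, max_frames=10):
        self.max_frames = max_frames
        self.frames = 0

    def reset(self):
        self.frames = 0
        return self.frames

    def step(self, action):
        self.frames += 1
        done = self.frames >= self.max_frames
        return self.frames, 1.0, done, {}


if __name__ == "__main__":
    env = FrameSkip(CountingEnv(max_frames=10), frame_skip=3)
    env.reset()
    done = False
    while not done:
        obs, reward, done, _ = env.step(action=0)
        print(obs, reward, done)
```

Summing the rewards keeps the episode return the same regardless of the skip value, which makes it easier to compare different settings.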

maximecb commented 6 years ago

Probably a good idea, especially since we're considering running the simulator at 30 FPS instead of 10 FPS.
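To make the 30 FPS point concrete: with frame skipping, the agent picks an action once every `frame_skip` frames, so the decision rate is the simulator frame rate divided by the skip. A quick sketch (the function name is made up for illustration and is not part of gym-duckietown):

```python
def action_rate_hz(sim_fps, frame_skip):
    """Decisions per second when each action is held for `frame_skip` frames."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be >= 1")
    return sim_fps / frame_skip


if __name__ == "__main__":
    for skip in (1, 2, 3, 6):
        rate = action_rate_hz(30, skip)
        print(f"30 FPS, frame_skip={skip}: {rate:.1f} actions/s")
```

So a frame skip of 3 at 30 FPS would give the same action frequency as 10 FPS with no skipping.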