woven-planet / l5kit

L5Kit - https://woven.toyota
https://woven-planet.github.io/l5kit

performance of policy trained by UrbanDriver notebook #371

Open tqin opened 2 years ago

tqin commented 2 years ago

Hi, I was experimenting with the UrbanDriver training notebook in the examples folder. I set the number of training iterations to 50K and used the full training set, leaving all other parameters unchanged. The training loss appeared to have converged at around 0.08. However, when I ran the resulting model in the corresponding test notebook with visualization, the policy appeared to have converged to a degenerate solution of staying still. This is a bit surprising, since the notebook says that "the sheer size of our dataset ensures that a reasonable performance can be obtained even with this simple loop". Am I missing something? What is the right expectation for the sample training notebook?
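
To put a number on "staying still" rather than judging it from the visualization alone, something like the following could be used. It is only a minimal sketch in plain NumPy and does not use any L5Kit API: the function name `stationary_fraction`, the `(N, T, 2)` array layout (future x/y positions relative to the current pose), and both thresholds are assumptions made for illustration.

```python
import numpy as np


def stationary_fraction(pred_xy, target_xy, min_target_dist=1.0, ratio=0.1):
    """Fraction of 'moving' samples whose predicted path barely moves.

    pred_xy, target_xy: arrays of shape (N, T, 2) holding future x/y
    positions relative to the current pose (assumed layout, not an
    L5Kit data structure).
    min_target_dist: ground-truth end distance above which a sample
    counts as moving (hypothetical threshold, same unit as the inputs).
    ratio: a prediction is called collapsed if its end distance is
    below ratio * ground-truth end distance (hypothetical threshold).
    """
    pred_xy = np.asarray(pred_xy, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)

    # Distance from the origin (current pose) to the last waypoint.
    pred_dist = np.linalg.norm(pred_xy[:, -1], axis=-1)
    target_dist = np.linalg.norm(target_xy[:, -1], axis=-1)

    # Only look at samples where the ground truth actually moves.
    moving = target_dist > min_target_dist
    if not moving.any():
        return 0.0

    collapsed = pred_dist[moving] < ratio * target_dist[moving]
    return float(collapsed.mean())
```

A value close to 1.0 on a batch of validation samples would support the impression that the policy has collapsed to standing still; a value close to 0.0 would suggest the problem lies elsewhere, for example in how the test notebook unrolls the policy.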