Closed: @jchengai closed this issue 1 year ago.
Hi @jchengai,
We will share that in our documentation in the next release. In the meantime, you can find the implementation here.
Hi @patk-motional,
Is the provided model code and configuration the same ones used in the reported UrbanDriver baseline, or are there other changes? I tried training UrbanDriver with ~250K samples and the performance was lower than the IDM policy for closed-loop reactive planning, but I'm not sure if the performance disparity is solely from the dataset size.
If the details aren't available until the next release, is there an ETA for when that might be ready?
Thanks!
Hi @bhyang,
Let me connect you with @christopher-motional, who implemented and trained the baseline model. He is on leave at the moment; I'll get him to reply as soon as he is back next week.
Hi @bhyang, sorry for the delayed response. Yes, the reported baseline was trained using the available model code with close to the same configuration you will find in the available config files. I believe the only deviations were using the AdamW optimizer with a slightly different learning rate from the default (I believe 1.25e-5 vs 5e-5), along with the OneCycleLR learning rate scheduler. Data augmentation was an important part of this, but that should be the same as what you see in the training config.
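For reference, the deviations described above could be sketched in PyTorch roughly as follows. This is a minimal illustration, not the devkit's actual training loop: the model, `total_steps`, and the use of `max_lr=1.25e-5` for the scheduler are all placeholder assumptions.

```python
import torch
from torch import nn

# Placeholder model standing in for the UrbanDriver planner network.
model = nn.Linear(10, 3)

# AdamW with the reported learning rate (1.25e-5 instead of the default 5e-5).
optimizer = torch.optim.AdamW(model.parameters(), lr=1.25e-5)

# OneCycleLR scheduler; max_lr and total_steps here are illustrative only.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1.25e-5, total_steps=1000
)

# In a training loop, scheduler.step() would be called once per optimizer step.
optimizer.step()
scheduler.step()
```

In the devkit these choices would be made through the Hydra training config rather than hand-written code, but the optimizer/scheduler pairing is the same idea.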
The baseline was trained on the full trainval dataset, subsampled at a rate of 0.1 (around 300K samples, I believe). For this baseline, the IDM policy did generally slightly outperform the ML model when evaluated in closed loop with reactive agents -- depending on how much of a disparity you're seeing, that is somewhat expected.
Hi @christopher-motional, thanks for the clarification! I have a few follow-up questions:
1. For evaluation, did you use `scenario_filter=all_scenarios` and `scenario_filter.num_scenarios_per_type=2` (and `+simulation=closed_loop_reactive_agents`)?
2. Did you change any other filter settings (e.g. `timestamp_threshold_s`) when dumping the dataset? For me, setting `scenario_filter.limit_total_scenarios=0.1` still results in around 2 million samples.

Appreciate the help, thanks!
For evaluation, we used `scenario_builder=nuplan_challenge` and `scenario_filter=nuplan_challenge_scenarios`. Only 2 scenarios per type is quite small and I think would involve too much variance for effective evaluation; I would suggest bumping this up depending on how long you're willing to wait for simulation.

As for the dataset size: `scenario_filter.limit_total_scenarios=0.1` was set. Sorry, I was thinking that referred to the roughly 3 million scenarios being culled down to 300K, but that count might include UNKNOWN types as well, so ~2 million might actually be the accurate number. In any case, the model was trained on the full trainval set with `scenario_filter.limit_total_scenarios=0.1`.
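The confusion above comes from `scenario_filter.limit_total_scenarios` accepting either a fraction or an absolute count. A hypothetical helper illustrating the two interpretations discussed in this thread (this function is illustrative only, not the devkit's implementation; the 3M scenario figure comes from the discussion above):

```python
def limit_scenarios(total_scenarios: int, limit: float) -> int:
    """Illustrative: a fractional limit keeps that share of scenarios,
    while an integer-like limit keeps an absolute count."""
    if 0 < limit < 1:
        # Fractional subsampling, e.g. 0.1 keeps 10% of scenarios.
        return int(total_scenarios * limit)
    # Absolute cap, e.g. the tutorial's value of 500.
    return min(total_scenarios, int(limit))

# ~3M scenarios subsampled at 0.1 gives ~300K.
subsampled = limit_scenarios(3_000_000, 0.1)

# The tutorial's setting of 500 keeps at most 500 scenarios.
tutorial = limit_scenarios(3_000_000, 500)
```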
Just as a quick follow-up: as I was saying, the values you see reported for the warm-up phase reflect the fact that our evaluation for this phase was done on a smaller subset of data with a reduced number of scenario types. Evaluation for the test phase will be on a larger amount of data and will not be skewed in this manner.
@christopher-motional What was the effective batch size used? Also how long did training take approximately (both number of epochs and wall clock time)? Thanks!
The effective batch size was 256, and we trained for around 50 epochs, taking around 2 days from what I remember. For what it's worth, the baseline really is more of a reference point to get people started and serves as a base comparison point. If you look at how feature extraction/data augmentation is done for this model in the devkit, you should see a number of things that could be done more efficiently, which we encourage competitors to improve in order to effectively train their models.
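As a sanity check on the numbers in this thread, the effective batch size, dataset size, and epoch count determine the total optimizer steps. The decomposition of 256 into per-device batch and device count below is an assumption for illustration; only the effective value of 256 is stated above.

```python
# Hypothetical decomposition: the thread only reports the effective value.
per_device_batch = 32   # assumption
num_devices = 8         # assumption
grad_accum_steps = 1    # assumption
effective_batch = per_device_batch * num_devices * grad_accum_steps  # 256

# ~300K samples (full trainval subsampled at 0.1, per the thread) for 50 epochs.
num_samples = 300_000
epochs = 50
steps_per_epoch = num_samples // effective_batch  # 1171
total_steps = steps_per_epoch * epochs            # 58550
```

At roughly 58K optimizer steps over 2 days of wall-clock time, that works out to a few steps per second, which is plausible for a model of this size once data loading dominates.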
I see that in the tutorial, `scenario_filter.limit_total_scenarios=500`.
Thanks in advance, reading through the Issues discussion has been very helpful!
Hi @rossgreer,
Answering your questions in the same order:
Answering your question from Stack Overflow: `training_vector_model.yaml` does exist. https://github.com/motional/nuplan-devkit/blob/ce3c323af01c0d7ec5672f7832ef53f9c679aab0/nuplan/planning/script/experiments/training/training_vector_model.yaml
Hi Motional team, I wonder if you could reveal the implementation details of the baseline (UrbanDriver) on the leaderboard, e.g., model architecture, training config, and dataset split/augmentation.