autonomousvision / transfuser

[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving; [CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
MIT License

Question Recreating Dataset #188

Closed st3lzer closed 10 months ago

st3lzer commented 10 months ago

Hello,

I am currently trying to recreate your dataset and have a few questions:

  1. Routes and scenarios are provided for everything except "right_dataset" and "left_dataset". Is there a reason for this, and how are these two folders structured?
  2. Some folder names contain "clipped", "10m", "30m", etc. Do the provided scenarios and routes produce this data directly, or do I have to clip or cut the routes manually? In other words: if I run the generation on every route and scenario you provided, will I get the same dataset you released, apart from the folders in 1.? (By "the same" I mean with respect to scenario/route and length. Of course the data will vary slightly because of non-determinism.)
  3. (Not about the dataset) Is there an overlap between the routes of the Longest6 Benchmark and the training data?

Thank you!

kashyap7x commented 10 months ago
  1. right_dataset and left_dataset are generated from the route xml files for Scenarios 8 and 9. However, we combine these routes with an empty scenario json file. This gives data at left and right turns, but without the added complexity of background traffic running red lights and entering the intersection. In our follow-up projects, we stopped generating these additional folders, and this did not significantly impact driving performance.
  2. The provided routes will result in the same data distribution as our released dataset. The folder names are a result of some naming inconsistencies in preliminary versions of the route and scenario files, and the names shouldn't be relevant for reproducing the dataset.
  3. The training routes are sampled densely (e.g. from all intersections) in a town, so the static environment encountered in the Longest6 evaluation routes does overlap with the training set at several locations. That said, the model does not see the exact same weather, traffic patterns, etc. during training and evaluation on Longest6. For testing generalization to completely unseen routes, we would recommend the benchmark proposed in LAV, which holds out Town02 and Town05 during training.
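
As a rough illustration of the hold-out idea in point 3, the sketch below splits routes into a training set and a held-out set by town. It is not taken from the TransFuser or LAV code. The function name `split_routes_by_town`, the sample XML, and the assumption that each `<route>` element carries a `town` attribute (modeled on CARLA leaderboard route files) are all hypothetical and should be checked against the actual route files.

```python
import xml.etree.ElementTree as ET

# Towns held out during training, as described for the LAV benchmark above.
HELD_OUT_TOWNS = {"Town02", "Town05"}

# Hypothetical route file content. The <route id=... town=...> layout is an
# assumption modeled on CARLA leaderboard route XML files.
SAMPLE_ROUTES_XML = """
<routes>
  <route id="0" town="Town01"><waypoint x="1.0" y="2.0" z="0.0"/></route>
  <route id="1" town="Town02"><waypoint x="3.0" y="4.0" z="0.0"/></route>
  <route id="2" town="Town05"><waypoint x="5.0" y="6.0" z="0.0"/></route>
  <route id="3" town="Town04"><waypoint x="7.0" y="8.0" z="0.0"/></route>
</routes>
"""


def split_routes_by_town(xml_text, held_out_towns):
    """Return (train_ids, held_out_ids) based on each route's town attribute."""
    root = ET.fromstring(xml_text.strip())
    train_ids, held_out_ids = [], []
    for route in root.iter("route"):
        # Routes in a held-out town are kept out of the training split.
        if route.get("town") in held_out_towns:
            held_out_ids.append(route.get("id"))
        else:
            train_ids.append(route.get("id"))
    return train_ids, held_out_ids


train_ids, held_out_ids = split_routes_by_town(SAMPLE_ROUTES_XML, HELD_OUT_TOWNS)
print("train routes:", train_ids)
print("held-out routes:", held_out_ids)
```

Filtering by town like this only removes static-environment overlap. Weather and traffic still vary between runs regardless of the split.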