Closed ziqipang closed 11 months ago
Hi @ziqipang,
I appreciate your interest in our project!
The regular nuPlan data contains the trajectories and positions of all agents in the scene, together with the expert route and traffic-light states. However, the data size is enormous and even expands after unzipping: for the complete training, validation, and test sets, you need about 2.5 TB.
To start, you can use the nuPlan mini split or only the validation and test splits (which might require some adaptations in the data loading). If you want to train ML planners extensively, I am unsure whether you can avoid the large training split, since there is currently no condensed trajectory dataset publicly available.
However, nuPlan has an internal method to “condense” a training set. Before training, the logs are converted into a feature cache, where all input features and targets are precomputed into a custom tensor representation. The system you’re training on (e.g., a compute cluster) doesn’t need to access the complete dataset, only the feature cache, which is usually just a few GB in size. Therefore, you only need one local system (or one hard drive) that contains the complete dataset, and you can use your feature cache everywhere else.
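To make the caching idea concrete, here is a minimal sketch of the precompute-then-train pattern. This is not the actual nuPlan devkit API; the scenario format, feature extraction, and file layout are all simplified stand-ins, and real caches use custom tensor serialization rather than pickle.

```python
# Hypothetical sketch of a feature cache: precompute features/targets once
# from the raw logs, write small per-scenario files, then train from those
# files alone (no access to the full dataset needed).
import pickle
import tempfile
from pathlib import Path

def build_feature_cache(scenarios, cache_dir: Path) -> list[Path]:
    """Precompute per-scenario features and write them to cache_dir.
    Runs once, on the one machine that holds the complete dataset."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, scenario in enumerate(scenarios):
        # Stand-in for real feature extraction (agent tracks, route,
        # traffic-light states, ...).
        features = {
            "ego_trajectory": scenario["ego_trajectory"],
            "target": scenario["ego_trajectory"][-1],
        }
        path = cache_dir / f"scenario_{i:06d}.pkl"
        with path.open("wb") as f:
            pickle.dump(features, f)
        paths.append(path)
    return paths

def load_cached_features(path: Path) -> dict:
    """The training system only needs this loader plus the cache files."""
    with path.open("rb") as f:
        return pickle.load(f)

# Usage: two toy "scenarios" standing in for nuPlan logs.
scenarios = [
    {"ego_trajectory": [(0.0, 0.0), (1.0, 0.1)]},
    {"ego_trajectory": [(0.0, 0.0), (0.5, -0.2)]},
]
with tempfile.TemporaryDirectory() as tmp:
    paths = build_feature_cache(scenarios, Path(tmp))
    batch = [load_cached_features(p) for p in paths]
print(len(batch), batch[0]["target"])  # -> 2 (1.0, 0.1)
```

The point of the split is that `build_feature_cache` touches the multi-terabyte dataset once, while the training loop only ever reads the small cache files, which you can copy to a cluster or external drive.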
I hope that helps. Feel free to ask further questions!
Hi @DanielDauner
I am really thrilled to receive your detailed reply! What you mentioned is very valuable to me, because I could not get this first-hand experience just by reading the documentation. Thank you so much for your help!
Best,
Ziqi
Hi Dear Authors,
Thank you for the excellent work! I am new to the field of planning and would really like to follow the path of your paper. However, the resources in my lab are quite limited, and the >1 TB of nuPlan data looks really large.
Therefore, I am curious whether the data only contains trajectories, or whether you have other suggestions for running your algorithms on some condensed trajectory data.
Best,
Ziqi