waymo-research / waymo-open-dataset

Waymo Open Dataset
https://www.waymo.com/open

Folder structure of scenarios in Waymo Open Dataset #720

Open ShounakRay opened 12 months ago

ShounakRay commented 12 months ago

The scenario folder for the Waymo Motion Dataset looks like this:

[image: scenario folder listing]

For the training and validation sets, the website (https://waymo.com/open/data/motion/) says "These segments are further broken into 9 second windows (1 second of history and 8 seconds of future data) with varying overlap." What is meant by history and future data – and does this distinction matter for training?

Furthermore, what are the testing_interactive and validation_interactive folders? How are they different from the testing and validation folders?

Lastly, I notice there's a training_20s folder. I assume each TFRecord file there corresponds to a 20-second segment, as opposed to a 9-second segment for the TFRecords in training and validation. So why do training_20s, training, and validation each have 1000 TFRecords? I would expect training and validation to have a little more than double (about 20/9 times) the number of TFRecords, no?

Thanks for the help!

scott-ettinger commented 12 months ago

Hi, as for history and future data: models are intended to take 1 second of history as input and output 8 seconds of future predictions. Accordingly, each training example is broken into 1 second of history, 1 current time step, and 8 seconds of future data.
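To make the split concrete, here is a minimal sketch of slicing one agent's track into those three parts. It assumes a 10 Hz sampling rate (so a window is 10 + 1 + 80 = 91 steps); the rate and the names `split_window` and `track` are illustrative assumptions, not part of the dataset API.

```python
# Assumption: 10 Hz sampling, so 1 s of history = 10 steps and 8 s of future = 80 steps.
SAMPLE_RATE_HZ = 10
HISTORY_STEPS = 1 * SAMPLE_RATE_HZ
CURRENT_STEPS = 1
FUTURE_STEPS = 8 * SAMPLE_RATE_HZ
WINDOW_STEPS = HISTORY_STEPS + CURRENT_STEPS + FUTURE_STEPS  # 91 under this assumption


def split_window(track):
    """Split one agent's per-step states into (history, current, future).

    `track` is any sequence with one entry per time step (hypothetical layout).
    """
    if len(track) != WINDOW_STEPS:
        raise ValueError(f"expected {WINDOW_STEPS} steps, got {len(track)}")
    history = track[:HISTORY_STEPS]                 # model input
    current = track[HISTORY_STEPS]                  # the "now" step, also model input
    future = track[HISTORY_STEPS + CURRENT_STEPS:]  # prediction target
    return history, current, future
```

During training the model would see `history` and `current` and be scored against `future`, which is why the distinction matters.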

The interactive dataset splits are for use with the interaction challenge described here.

As for the number of files, a TFRecord file is not a single example: each file contains many examples (the TFRecord format provides for serial reading of examples from a single file). Each dataset split is broken into smaller file shards so that it can be processed in parallel. The training sets consist of 1000 file shards each, while the validation and test sets consist of 150 file shards each. Again, each file shard contains many examples; there are hundreds of thousands of examples in total.
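If you want to check how many examples a shard holds, one rough way is to walk the TFRecord framing directly. The sketch below relies on the documented on-disk layout (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC) and skips CRC verification, so it only counts records and does not validate them. The function names are hypothetical, and the shard paths are whatever you downloaded.

```python
import struct


def count_tfrecord_examples(path):
    """Count records in one TFRecord shard without parsing the payloads.

    Sketch only: CRCs are skipped, so corruption or a truncated final
    payload is not detected.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)  # uint64 payload length, little-endian
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError(f"truncated record header in {path}")
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC (4 bytes), the payload, and the payload CRC (4 bytes).
            f.seek(4 + length + 4, 1)
            count += 1
    return count


def count_total_examples(paths):
    """Sum record counts over a list of shard paths."""
    return sum(count_tfrecord_examples(p) for p in paths)
```

Running `count_total_examples` over all shards of a split gives the number of examples in that split, which is the figure to compare across training_20s, training, and validation rather than the file count.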

Please let me know if you have further questions.