WOD version of 3-frame waymo config

shawnding commented 1 year ago

Hi @Abyssaledge ,

I noticed some differences between the 3-frame and single-frame FSD configs on Waymo. It seems that different versions of the WOD dataset are used (because in_channels is different). I have the following questions:

I have re-generated waymo_dbinfo_train.pkl using your code. Should I use the newer version of the dbinfo to train the 3f model, or should I just change in_channels back to 5? Does it have a big impact on the final result?
What do tanh_dims and voxel_downsampling_size mean in the 3f config? I wonder why they are used in the 3f config but not in the single-frame config.
feat_channels changed from 64 to 32. Could you also explain the reason?

Thanks!

Abyssaledge commented 1 year ago

The single-frame model and the multi-frame model share the same waymo_dbinfos_train.pkl.
tanh_dims normalizes intensity and elongation (the 3rd and 4th channels). In the 1f config, they are also normalized by default.
The 3f model has many more points, so we reduce the number of channels to 32.
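
For readers following along, here is a minimal sketch of what tanh normalization on selected channels could look like. It is not the code from the SST repo. The function name normalize_with_tanh is hypothetical, and the sketch assumes a per-point layout of x, y, z, intensity, elongation with zero-based indices, so that "3rd and 4th" maps to indices 3 and 4.

```python
import numpy as np


def normalize_with_tanh(points, tanh_dims=(3, 4)):
    """Squash the selected point channels into (-1, 1) with tanh.

    Assumption: points is an (N, C) array laid out as
    x, y, z, intensity, elongation, and tanh_dims holds zero-based
    channel indices. This is an illustration, not the SST code.
    """
    # Work on a float32 copy so the caller's array is left untouched.
    out = points.astype(np.float32, copy=True)
    dims = list(tanh_dims)
    # Only the selected channels are normalized; x, y, z stay as they are.
    out[:, dims] = np.tanh(out[:, dims])
    return out


if __name__ == "__main__":
    pts = np.array([[1.0, 2.0, 0.5, 10.0, 0.3]])
    print(normalize_with_tanh(pts))
```

Large raw intensity values end up close to 1 after tanh, which keeps these two channels on a scale comparable to the others.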

shawnding commented 1 year ago

Thanks! A follow-up question: does the 6th channel represent the range frame offset (according to this)? Does it make a difference to the detection performance?

Abyssaledge commented 1 year ago

The 6th channel is the relative timestamp, implemented in LoadPointsFromMultiSweepsWaymo.
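
A rough sketch of how a relative timestamp can be appended as a 6th channel when stacking sweeps is shown here. It is not the actual LoadPointsFromMultiSweepsWaymo implementation. The function name stack_sweeps_with_time, the five-channel input layout, the sign convention (reference time minus sweep time), and the use of seconds are all assumptions made for illustration.

```python
import numpy as np


def stack_sweeps_with_time(sweeps, timestamps, ref_timestamp):
    """Concatenate sweeps and append a relative-timestamp channel.

    Assumptions (illustrative only): each sweep is an (N_i, 5) array of
    x, y, z, intensity, elongation; timestamps are in seconds; the
    relative time is ref_timestamp - sweep_timestamp.
    """
    stacked = []
    for pts, ts in zip(sweeps, timestamps):
        # One relative-time value per sweep, repeated for each point in it.
        dt = np.full((pts.shape[0], 1), ref_timestamp - ts, dtype=np.float32)
        stacked.append(np.hstack([pts[:, :5].astype(np.float32), dt]))
    # Result is (sum of N_i, 6): five original channels plus the time channel.
    return np.concatenate(stacked, axis=0)


if __name__ == "__main__":
    current = np.zeros((2, 5))
    previous = np.ones((3, 5))
    merged = stack_sweeps_with_time([current, previous], [10.0, 9.9], 10.0)
    print(merged.shape)
```

Under these assumptions each point carries six values instead of five, which would be consistent with the in_channels difference raised at the start of this thread.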

tusen-ai / SST

WOD version of 3-frame waymo config #85