OpenDriveLab / ViDAR

[CVPR 2024 Highlight] Visual Point Cloud Forecasting
https://arxiv.org/abs/2312.17655
Apache License 2.0

How to reproduce Figure 3 in the paper? #34

Closed odroidodroid closed 2 weeks ago

odroidodroid commented 2 weeks ago

Hi, I'm trying to reproduce Figure 3 in the paper.

In the paper, did you mean that latent rendering is the key to turning ray-wise features into geometric features?

When I look into the code, only the third BEVFormer encoder layer includes latent rendering.

Why doesn't the last layer include latent rendering?

Also, after latent rendering, the channels still look like ray-wise features.

I've attached pictures of the channels after the latent rendering layer and at the end of the encoder layers.

Thank you for your great work.

[Images: latent_rendering, channel_0, pretrain_channel_0]

tomztyang commented 2 weeks ago

Hi, thanks for your attention!

Which feature maps are you visualizing, and which checkpoint are you using?

Please try visualizing occ_path_prob_grids here, or occ_path_embed here.

I've forgotten which of the two variables was used for the visualization, so please try both. You should then be able to reproduce Fig. 3 with the provided pre-trained weights.

Best, Zetong
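
For readers following along, here is a minimal sketch of turning one channel of a BEV feature map into an image, which is the kind of per-channel picture discussed in this thread. It is not code from the ViDAR repository: the function name `bev_channel_to_image` is hypothetical, the `(bev_h * bev_w, C)` layout is an assumption that should be checked against the real tensor, and the random array only stands in for a tensor such as occ_path_embed.

```python
import numpy as np


def bev_channel_to_image(feat, bev_h, bev_w, channel=0):
    """Return one channel of a flattened BEV feature map as a uint8 image.

    feat is assumed to have shape (bev_h * bev_w, C): a single sample with
    the spatial grid flattened. This layout is an assumption, not something
    confirmed by the ViDAR code.
    """
    feat = np.asarray(feat, dtype=np.float32)
    if feat.ndim != 2 or feat.shape[0] != bev_h * bev_w:
        raise ValueError(
            f"expected shape ({bev_h * bev_w}, C), got {feat.shape}"
        )

    # Pick one channel and restore the 2D BEV grid.
    grid = feat[:, channel].reshape(bev_h, bev_w)

    # Min-max normalize to [0, 255]; a constant channel maps to all zeros.
    lo, hi = grid.min(), grid.max()
    if hi == lo:
        return np.zeros((bev_h, bev_w), dtype=np.uint8)
    return ((grid - lo) / (hi - lo) * 255).round().astype(np.uint8)


# Random stand-in for a real feature tensor (hypothetical sizes).
rng = np.random.default_rng(0)
dummy = rng.standard_normal((4 * 5, 8))
image = bev_channel_to_image(dummy, bev_h=4, bev_w=5, channel=0)
print(image.shape, image.dtype)
```

With a PyTorch tensor, convert it first with `.detach().cpu().numpy()` and drop the batch dimension; the resulting array can then be displayed with, for example, matplotlib's `imshow`.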

odroidodroid commented 2 weeks ago

Thank you for your quick response.

The feature maps I visualized are the BEV query from the last encoder layer and the final embedding from the latent rendering layer, using r101_dcn_fcos3d_pretrain.pth and finetune-ViDAR-RN101-nus-full-1future.pth.

However, I couldn't reproduce the geometric features from occ_path_prob_grids and occ_path_embed, and I still have a question about the structure.

Why did you use latent rendering in the third layer rather than in the last layer? Also, why should I visualize an intermediate embedding rather than the final embedding of the latent rendering layer?

[Images: occ_path_embed_0, occ_path_prob_grids_0]

These show one channel each of occ_path_prob_grids and occ_path_embed, taken at the lines you mentioned.

tomztyang commented 2 weeks ago

Ohhh,

please visualize occ_path_embed here, or occ_path_prob here.

Use the checkpoint finetune-ViDAR-RN101-nus-full-1future.pth.

I made some mistakes in yesterday's reply, since I haven't touched this code for a long time. Sorry about that.

BTW, the goal is not to visualize an intermediate embedding but to visualize the embedding produced by LatentRendering.

Latent rendering sits in the third layer to stay consistent with our CVPR submission. That is a legacy choice, not a deliberate one: at that time we ran the ablations on BEVFormer-small and forgot to change the setting when moving to BEVFormer-base :-( You can check whether it works well in the 6th layer; I haven't tried that yet.

Best, Zetong

tomztyang commented 2 weeks ago

I'll close this for now. Feel free to re-open it if you have further questions!

Best, Zetong