pmj110119 / RenderOcc

[ICRA 2024] RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision. (Early version: UniOcc)

Could you provide the dataset pipeline, i.e. how to get the rays (gt_depth, gt_semantics, ray_o, ray_d) from the input multi-frame images? #3

Closed synsin0 closed 1 year ago

synsin0 commented 1 year ago

Thanks for your great work. Could you share an early version of the dataset pipeline? I can't wait to see RenderOcc working.

pmj110119 commented 1 year ago

Thank you for the suggestion! I have pushed a version that includes the dataset code and an example config to the 'tmp' branch.

Please note that this version is still being refactored: it may contain some minor bugs, and I haven't had a chance to write the documentation yet.

I will also need a few days after the Chinese National Day holiday to validate it. I apologize for any inconvenience.

pmj110119 commented 1 year ago

You can also refer to the code here to see how rays are generated from multiple frames: https://github.com/pmj110119/RenderOcc/blob/bd8bb8572fa1a2480f44f98183708a5b6d84664e/mmdet3d/datasets/nuscenes_dataset_occ.py#L162
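For a rough picture of what turning pixels into rays involves, here is a minimal NumPy sketch. It is not the code from the linked file. It assumes a pinhole camera, and the names `pixels_to_rays`, `K`, `cam2adj_ego` and `adj_ego2cur_ego` are hypothetical, chosen only for illustration. How `gt_depth` and `gt_semantics` are attached to each ray is not covered here.

```python
import numpy as np

def pixels_to_rays(uv, K, cam2ego):
    """Back-project pixels to rays (hypothetical helper, pinhole model assumed).

    uv:      (N, 2) pixel coordinates (u, v)
    K:       (3, 3) camera intrinsics
    cam2ego: (4, 4) camera-to-ego transform of the target frame
    Returns ray_o (N, 3) and unit-length ray_d (N, 3) in the target frame.
    """
    uv = np.asarray(uv, dtype=np.float64)
    # Homogeneous pixel coordinates [u, v, 1]
    pix_h = np.concatenate([uv, np.ones((uv.shape[0], 1))], axis=1)
    # Directions in the camera frame: K^-1 @ [u, v, 1]
    dirs_cam = pix_h @ np.linalg.inv(K).T
    R, t = cam2ego[:3, :3], cam2ego[:3, 3]
    # Rotate into the target frame and normalize
    ray_d = dirs_cam @ R.T
    ray_d /= np.linalg.norm(ray_d, axis=1, keepdims=True)
    # Every ray of this camera starts at the camera center
    ray_o = np.broadcast_to(t, ray_d.shape).copy()
    return ray_o, ray_d

# Toy values (assumptions, not real calibration data)
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
cam2adj_ego = np.eye(4)            # camera pose in the adjacent frame's ego
adj_ego2cur_ego = np.eye(4)        # adjacent ego -> current ego
adj_ego2cur_ego[:3, 3] = [2.0, 0.0, 0.0]

# Rays from an adjacent frame, expressed in the current ego frame
ray_o, ray_d = pixels_to_rays([[50, 50], [150, 50]], K,
                              adj_ego2cur_ego @ cam2adj_ego)
```

Composing the two transforms before back-projecting is one way to bring rays from several frames into a single coordinate system; the repository may organize this differently.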

synsin0 commented 1 year ago

Thanks again! After a little effort I got training running with the given config; it takes about 2 days for 12 epochs on 8xA6000. The data_time is around 0.2-0.3. Training takes longer than the original bevstereo-occ.

pmj110119 commented 1 year ago

I've released an update that reduces training time to just 25% of the original. Please give the updated version a try ~

synsin0 commented 1 year ago

I replaced all modules with the new version, but at the beginning the training time is almost the same as the original (maybe it gets faster in later epochs?). Could you take a look at my training log to see what is going wrong? 20231005_224106.log

pmj110119 commented 1 year ago

RenderOcc takes longer per iteration than bevstereo-occ because of the additional rendering. It's about 40% slower (on my A100 machine, one iteration of RenderOcc takes 3s with batch size 1, while bevstereo-occ takes 2.2s).
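As a quick check of the per-iteration figures quoted above, the relative overhead can be computed as follows. The helper name `relative_overhead` is hypothetical, and the two timings are simply the rounded values from this comment.

```python
def relative_overhead(t_new, t_base):
    """Fractional slowdown of t_new relative to t_base."""
    return t_new / t_base - 1.0

# Timings quoted in this comment (seconds per iteration, batch size 1, A100)
renderocc_s = 3.0
bevstereo_occ_s = 2.2

overhead = relative_overhead(renderocc_s, bevstereo_occ_s)
print(f"RenderOcc overhead per iteration: {overhead:.0%}")  # prints 36%
```

With these rounded timings the overhead comes out at roughly 36%, which is in line with the "about 40%" estimate.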

On my own machine it used to take 15s per iteration, which may have been caused by uneven CPU load on my server. It seems you don't have this issue on your machine.

Fortunately, training RenderOcc requires significantly fewer epochs. After fixing some bugs, the example config can achieve 24+ mIoU with only 6 epochs of training (verified).