TRAILab / CaDDN

Categorical Depth Distribution Network for Monocular 3D Object Detection (CVPR 2021 Oral)
Apache License 2.0

Cuda Out of Memory with Cam150 image (above 2M image size) #49

Closed namnv78 closed 3 years ago

namnv78 commented 3 years ago

I trained CaDDN on an A100 with the KITTI dataset and it works fine. But with our private dataset (captured with a LiDAR and a 150-degree camera, converted to KITTI format), it produces this error:

```
File "/home/ubuntu/CaDDN/CaDDN/pcdet/models/backbones_3d/ffe/depth_ffe.py", line 93, in create_frustum_features
    frustum_features = depth_probs * image_features
RuntimeError: CUDA out of memory. Tried to allocate 5.53 GiB (GPU 0; 39.59 GiB total capacity; 32.48 GiB already allocated; 4.92 GiB free; 32.99 GiB reserved in total by PyTorch).
```

depth probs shape: `torch.Size([2, 1, 80, 302, 480])`
image feature shape: `torch.Size([2, 64, 1, 302, 480])`
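For reference, a quick back-of-the-envelope check (a sketch, not code from CaDDN) shows the failed allocation is exactly the broadcast product of those two tensors:

```python
# depth_probs [2, 1, 80, 302, 480] * image_features [2, 64, 1, 302, 480]
# broadcasts to a frustum tensor of shape [2, 64, 80, 302, 480]; its
# float32 size matches the 5.53 GiB reported in the error message.
import math

shape = (2, 64, 80, 302, 480)      # broadcast result of the two operands
n_elements = math.prod(shape)
gib = n_elements * 4 / 2**30       # 4 bytes per float32 element
print(f"{gib:.2f} GiB")            # prints "5.53 GiB"
```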

Environment:

- PyTorch 1.9.0
- cudatoolkit 11.1
- torchaudio 0.9.0
- torchvision 0.5.0

It is the same error with cudatoolkit 11.0 and PyTorch 1.7.1. Thank you very much!

codyreading commented 3 years ago

Hi and thanks for the interest!

This is likely because your images are too large. If your image feature size is [302, 480], your full image is [1208, 1920], which is much bigger than KITTI images [375, 1242]. I would recommend downsampling the images and depth maps to a lower resolution, and making sure you account for this in your 2D bounding box labels as well. You can also reduce your batch size.
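A minimal sketch of that preprocessing (an illustration, not code from the CaDDN repo; the `[x1, y1, x2, y2]` pixel-coordinate box format is an assumption):

```python
# Resize the image and depth map by the same factor and scale the 2D
# bounding boxes to match, so labels stay aligned with the smaller image.
import numpy as np

def downsample_sample(image, depth_map, boxes_2d, scale=0.5):
    """Resize image/depth by `scale` and rescale pixel-coordinate boxes."""
    new_h = int(image.shape[0] * scale)
    new_w = int(image.shape[1] * scale)
    # Nearest-neighbour resize via index sampling (cv2.resize would also work).
    rows = (np.arange(new_h) / scale).astype(int)
    cols = (np.arange(new_w) / scale).astype(int)
    image_small = image[rows][:, cols]
    depth_small = depth_map[rows][:, cols]
    boxes_small = np.asarray(boxes_2d, dtype=np.float32) * scale
    return image_small, depth_small, boxes_small
```

With `scale=0.5`, a [1208, 1920] image becomes [604, 960], bringing the frustum tensor back toward KITTI-scale memory use.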

namnv78 commented 3 years ago

Thanks for your fast reply! So, besides the images, depth maps, and 2D bounding box labels, do we need to scale the calibration matrices and/or the PCD as well?

codyreading commented 3 years ago

You don't need to scale the calibration, as all the projection/transformation functionality is based on normalized coordinates (-1, 1), which works regardless of image scale. By PCD do you mean the LiDAR point clouds? Those are not used directly in CaDDN (only the depth maps computed from LiDAR are), so they don't need to be scaled either.
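A small sketch of why the calibration survives resizing (an illustration of the grid-sample-style (-1, 1) convention, not CaDDN code): a point at the same relative image position maps to the same normalized coordinate at any resolution.

```python
# Normalize pixel coordinates to (-1, 1), as used by grid sampling, and
# show the image centre lands on (0, 0) at both full and half resolution.
import numpy as np

def normalize_coords(uv, width, height):
    """Map pixel coords (u, v) to the (-1, 1) grid-sample convention."""
    u, v = uv
    return np.array([2.0 * u / (width - 1) - 1.0,
                     2.0 * v / (height - 1) - 1.0])

full = normalize_coords((959.5, 603.5), width=1920, height=1208)
half = normalize_coords((479.5, 301.5), width=960, height=604)
print(full, half)  # both are [0. 0.]
```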

namnv78 commented 3 years ago

Thanks so much for your reply. I still wonder whether the dimension and location labels need to be changed as well?

codyreading commented 3 years ago

Nope, those should be fine; all representations in 3D are unchanged. I had to do this for the Waymo dataset, and all that changed were the items mentioned above.

namnv78 commented 3 years ago

I don't know why, but when I tried testing with only 1000 samples for 3 classes (Car, Motorbike, and Pedestrian), all APs are 0. Could you please share some more of your experience?

codyreading commented 3 years ago

This could be many things. I would first make sure the images/labels for your custom dataset are accurate, then try visualizing the 3D bounding boxes, and then inspect the intermediate representations to see which component of the network is not working correctly.