mit-han-lab / bevfusion

[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
https://bevfusion.mit.edu
Apache License 2.0
2.38k stars, 428 forks

changing point_cloud_range causes an error #573

Closed: atto-js closed this issue 4 months ago

atto-js commented 11 months ago

Hello.

I use the secfpn/camera+lidar config. I'm trying to change point_cloud_range to [-72.0, -72.0, -11.0, 72.0, 72.0, -3.0] (the original was [-54.0, -54.0, -5.0, 54.0, 54.0, 3.0]).

I also modified xbound and ybound accordingly, as follows:

xbound: [-72.0, 72.0, 0.3]
ybound: [-72.0, 72.0, 0.3]

And finally, since the point cloud range has changed from (54+54) × (54+54) to (72+72) × (72+72), I increased grid_size as follows:

heads:
  object:
    train_cfg:
      grid_size: [1920, 1920, 41]
    test_cfg:
      grid_size: [1920, 1920, 41]

(original was [1440, 1440, 41])

But when I ran the training, I got RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 240 but got size 180 for tensor number 1 in the list.
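As a sanity check on the numbers above, here is a minimal sketch of the grid arithmetic. The voxel size is not quoted in this thread; it is derived here from the original values (108 m across 1440 cells). The downsampling factor of 8 is an assumption inferred only from 1920/240 and 1440/180, and `bev_cells` is a hypothetical helper, not part of the BEVFusion code base.

```python
def bev_cells(lo, hi, cell):
    """Number of cells along one axis for the extent [lo, hi] at a given cell size."""
    return round((hi - lo) / cell)

# Voxel size implied by the original settings: 108 m covered by 1440 cells.
voxel_xy = (54.0 - (-54.0)) / 1440

old_cells = bev_cells(-54.0, 54.0, voxel_xy)  # original range
new_cells = bev_cells(-72.0, 72.0, voxel_xy)  # enlarged range

# Assumed overall downsampling factor (inferred from the error message only).
STRIDE = 8

print("old grid:", old_cells, "-> downsampled:", old_cells // STRIDE)
print("new grid:", new_cells, "-> downsampled:", new_cells // STRIDE)
```

Under these assumptions the two sizes in the error message correspond to the new grid (240) and the old grid (180), which suggests that some part of the pipeline is still sized for the old 1440 grid. That is an inference from the arithmetic, not a confirmed diagnosis; it may be worth searching the config for any remaining hard-coded 1440.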

My guess is that this is caused by the tensor sizes in the trained model not matching the configs, but how can I fix it?

The only way I found was to adjust the voxel_size in config, but the paper says there is a performance penalty when increasing voxel_size, so I'm looking for other ways.
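For reference, the voxel_size workaround amounts to keeping the original 1440-cell grid and stretching each cell to cover the larger range. A minimal sketch of that arithmetic, where `voxel_size_for_grid` is a hypothetical helper and only the x/y dimensions are considered:

```python
def voxel_size_for_grid(lo, hi, cells):
    """Cell size needed to cover [lo, hi] with a fixed number of cells."""
    return (hi - lo) / cells

CELLS = 1440  # original x/y grid size from this thread

original = voxel_size_for_grid(-54.0, 54.0, CELLS)
enlarged = voxel_size_for_grid(-72.0, 72.0, CELLS)

print("original x/y voxel size:", original)
print("x/y voxel size for the 144 m range:", enlarged)
```

With these numbers the x/y voxel size grows from 0.075 m to 0.1 m, so every tensor shape stays as the pretrained model expects, at the cost of coarser voxels.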

Thanks.

EpicGilgamesh commented 11 months ago

Hi! I see that your issue is fairly new. When training on the nuScenes dataset, did you encounter a "no specified protocol" error?

gerardmartin2 commented 7 months ago

@atto-js Have you been able to figure out the problem? I am currently trying something similar

junsiknss commented 7 months ago

@gerardmartin2 I just adjusted the voxel_size. Increasing voxel_size was mentioned in the paper as having a slight performance penalty, but I didn't find anything else to do.

gerardmartin2 commented 7 months ago

@junsiknss Okay, I see. Thanks for your reply! I will proceed this way then. By the way, was there a big drop in performance in your case? Just so I know what to expect.

junsiknss commented 7 months ago

@gerardmartin2 For short distances (0~30m), I didn't notice any performance degradation. For longer distances, it was difficult to compare because detection rates were low regardless of voxel_size.

In my opinion, due to the nature of LiDAR, the spacing between adjacent points reflected from a distant object grows significantly with range, so a small increase in voxel_size shouldn't have much impact there, but I had no way to verify it. My lidar produced a very sparse point cloud at around 50 m, so I couldn't determine whether the low detection rate was an effect of the larger voxel_size or a physical limitation of my lidar.
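The geometric argument above can be made concrete with a rough estimate: for a fixed angular resolution, the gap between adjacent returns on a surface facing the sensor grows linearly with distance. The sketch below assumes a hypothetical horizontal angular resolution of 0.2 degrees; substitute the value from your sensor's datasheet. The two voxel sizes are the 0.075 m and 0.1 m values implied by the grid numbers in this thread.

```python
import math

def point_spacing(distance_m, angular_res_deg):
    """Approximate gap between adjacent returns on a surface facing the sensor."""
    return 2.0 * distance_m * math.tan(math.radians(angular_res_deg) / 2.0)

ANGULAR_RES_DEG = 0.2        # hypothetical sensor resolution, not from the thread
VOXEL_SIZES = (0.075, 0.1)   # x/y voxel sizes implied by the thread's grid numbers

for distance in (10, 30, 50, 70):
    gap = point_spacing(distance, ANGULAR_RES_DEG)
    # For each voxel size, report whether the gap between points already exceeds it.
    exceeds = [gap > v for v in VOXEL_SIZES]
    print(f"{distance:>3} m: spacing {gap:.3f} m, exceeds voxel sizes {VOXEL_SIZES}: {exceeds}")
```

Where the point spacing already exceeds both voxel sizes, most occupied voxels hold a single point either way, which is consistent with the intuition that the coarser voxel matters less at long range. This is only a back-of-the-envelope estimate and does not replace an actual accuracy comparison.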

If you own a high-resolution lidar or use a solid state lidar, this would be an interesting experiment to try.

gerardmartin2 commented 7 months ago

@junsiknss Thank you! I have a couple of questions just in case you have already done it or you know something about it.

  1. Have you tried to reduce the FOV? I would like to consider only the LiDAR points within a 60-70 degree frontal FOV. By setting point_cloud_range to [0.0, -54.0, -5.0, 54.0, 54.0, 3.0] (the original is [-54.0, -54.0, -5.0, 54.0, 54.0, 3.0]) I guess I can pick the points within the frontal 180 degrees, but I assume I cannot narrow it further just by modifying the configs. Any ideas on how to reduce the FOV even more?
  2. The first two values of grid_size need to be the same, right? I tried changing grid_size to [720, 1440, 41] (to match the 180 degree FOV), but errors about tensor shapes appear. I guess the grid needs to be square, since changing point_cloud_range to [0.0, -54.0, -5.0, 54.0, 54.0, 3.0] and leaving the original grid_size runs successfully. I assume, then, that grid_size has to be computed from the larger of the two extents in point_cloud_range.

junsiknss commented 7 months ago

@gerardmartin2

  1. I don't know how to reduce the horizontal FOV below 180 degrees via the config. If the purpose of reducing the FOV is simply to reduce the computation cost, you could do what I did: I added a filter on the point cloud so that only points within the ROI (region of interest) are passed to the model for detection.

  2. I could not get it to run successfully after adjusting grid_size either. As I said at the beginning of this thread, I adjusted voxel_size instead.
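A point cloud filter of the kind mentioned in item 1 could look roughly like the sketch below. It is not the filter used in this thread, whose details were not posted. It assumes each point is a sequence starting with (x, y, z), that +x points forward in the sensor frame, and that the frontal FOV is symmetric about that axis; `filter_frontal_fov` and its parameters are hypothetical names. Swap the axes if your sensor frame differs.

```python
import math

def filter_frontal_fov(points, fov_deg=70.0, max_range=54.0):
    """Keep points within +/- fov_deg/2 of the +x axis and within max_range (in the x/y plane)."""
    half_fov = math.radians(fov_deg) / 2.0
    kept = []
    for point in points:
        x, y = point[0], point[1]
        if math.hypot(x, y) > max_range:
            continue  # beyond the region of interest
        if abs(math.atan2(y, x)) <= half_fov:
            kept.append(point)
    return kept

# Tiny usage example with hand-made points (x, y, z).
cloud = [
    (10.0, 0.0, 0.0),    # straight ahead
    (10.0, 5.0, 0.0),    # about 27 degrees off-axis
    (10.0, 10.0, 0.0),   # 45 degrees off-axis
    (-10.0, 0.0, 0.0),   # behind the sensor
    (100.0, 0.0, 0.0),   # ahead but too far
]
print(filter_frontal_fov(cloud, fov_deg=70.0, max_range=54.0))
```

For real point clouds a vectorized version (for example with NumPy boolean masks) would be much faster; the loop form is used here only to keep the logic explicit. Because the filter runs before voxelization, the model's point_cloud_range and grid_size can stay unchanged.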