hailanyi / VirConv

Virtual Sparse Convolution for Multimodal 3D Object Detection
https://arxiv.org/abs/2303.02314
Apache License 2.0
266 stars 39 forks

Faster depth completion? #23

Closed DKuhse closed 1 year ago

DKuhse commented 1 year ago

Hi, great work. I'd like to run a large scale test, but at the moment the PEnet depth completion script is a huge bottleneck, being much slower (like an order of magnitude). Any recommendation for making this step faster while not being too painful?

Edit: I've tried using Point2VoxelGPU3d instead of Point2VoxelCPU3d (changing point_to_voxel to point_to_voxel_hash, as below). While this is a good deal faster (>2x), ~~it gives quite different results (a mean absolute error of about 10), so I'm not sure if I should use it. I've listed it below; not sure if I'm using it wrong?~~

I forgot that the point order would be different, sorting shows that they're very similar and performance is comparable too. Combined with vectorizing the for loop in la_sampling2, I get a pretty good speed up. Not fully real-time, but good enough for my purposes. Unless you have a recommendation for something even better, I'd consider this solved.

```python
points = tv.from_numpy(points.astype('float32')).cuda()
voxel_output = self._voxel_generator.point_to_voxel_hash(points)
tv_voxels, tv_coordinates, tv_num_points = voxel_output
voxels = tv_voxels.cpu().numpy()
coordinates = tv_coordinates.cpu().numpy()
num_points = tv_num_points.cpu().numpy()
```
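For anyone comparing the two voxelizers: the CPU and GPU generators emit the same voxels in a different order, so a naive elementwise comparison reports a large error. Sorting both outputs by their grid coordinates before comparing fixes this. Below is a minimal numpy-only sketch of that comparison; the synthetic arrays stand in for the real `point_to_voxel` / `point_to_voxel_hash` outputs (which come from spconv and need a GPU), and `sort_by_coords` is a hypothetical helper name.

```python
import numpy as np

def sort_by_coords(coords, voxels):
    # Lexicographically sort voxels by their integer grid coordinates so that
    # outputs from two voxelizers can be compared position-by-position.
    order = np.lexsort((coords[:, 2], coords[:, 1], coords[:, 0]))
    return coords[order], voxels[order]

# Synthetic stand-in for the CPU voxelizer output (unique coordinates).
rng = np.random.default_rng(0)
coords_cpu = rng.permutation(np.arange(24)).reshape(8, 3)
voxels_cpu = rng.standard_normal((8, 5, 4)).astype('float32')

# Stand-in for the GPU output: identical voxels, permuted order,
# mimicking what point_to_voxel_hash returns.
perm = rng.permutation(8)
coords_gpu, voxels_gpu = coords_cpu[perm], voxels_cpu[perm]

c1, v1 = sort_by_coords(coords_cpu, voxels_cpu)
c2, v2 = sort_by_coords(coords_gpu, voxels_gpu)
mae = np.abs(v1 - v2).mean()
print(mae)  # 0.0 once the ordering difference is removed
```

In the real pipeline, any residual error after sorting reflects genuine differences between the two voxelizers (e.g. which points land in a full voxel), not an accuracy problem with the hash-based GPU path.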
hailanyi commented 1 year ago

OK, I see. The sampling is just to save some storage space; you can keep it or remove it. PENet can also be replaced by the faster ENet or FusionNet.

Raiden-cn commented 1 year ago

@DKuhse Hi, I am in a similar situation. The voxelization in la_sampling2 takes nearly two seconds, which is too painful. Could you share the code for the for loop you modified?