HuguesTHOMAS / KPConv-PyTorch

Kernel Point Convolution implemented in PyTorch
MIT License
738 stars 149 forks

Efficiency of a large-scale point cloud prediction #151

Open maosuli opened 2 years ago

maosuli commented 2 years ago

Hi, Thomas!

Thank you for your very complete source code.

Although I am a layman in 3D semantic segmentation, I tried your source code and successfully applied KPConv to different datasets, such as S3DIS.

Unlike the 2D deep learning prediction process, I found that the workstation had to spend a lot of time predicting a single frame of the point cloud, e.g., Area 5.ply.

For example, predicting one point cloud may take 2 hours. It is not like a batch prediction process for 2D images, which takes, say, around one second per image. In this case, if I want to predict an urban-scale scene, the 3D prediction process may become very costly and time-consuming.

So is the multi-epoch prediction process a common characteristic of 3D scene prediction? Or am I misunderstanding your test_models.py?

How can I improve prediction efficiency so that I can produce multiple frames of segmented point clouds more quickly?

I would appreciate it if you could give me some guidance. Thanks

Best regards,

Eric.

HuguesTHOMAS commented 2 years ago

Hi Eric,

You can view the process of predicting Area4 as equivalent to segmenting an image that would have 1000000 x 1000000 pixels. As you can understand, that would take quite some time for an image too.

The basic strategy would be a tiling process, where you split this big image (or point cloud, in our case) into smaller tiles and predict each tile independently. Such a tiling strategy has been used for point cloud segmentation and would be faster than my method, but less accurate because of border effects: at the border between two tiles, if an object is split in half, the prediction might not be as good.
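The tiling idea above can be sketched in a few lines. The function below is a hypothetical illustration (not from the repo), using only NumPy to assign points to fixed-size square ground tiles by their XY coordinates:

```python
import numpy as np

def split_into_tiles(points, tile_size=10.0):
    """Assign each 3D point to a square ground tile.

    A minimal sketch of the tiling strategy; `tile_size` (in meters)
    is a hypothetical parameter, not a setting from KPConv-PyTorch.
    """
    # Tile index from the XY coordinates only (the vertical extent stays whole)
    tile_ids = np.floor(points[:, :2] / tile_size).astype(int)
    tiles = {}
    for tid in np.unique(tile_ids, axis=0):
        mask = np.all(tile_ids == tid, axis=1)
        tiles[tuple(tid)] = points[mask]
    return tiles

# Toy cloud: 4 points falling into two 10 m tiles
pts = np.array([[1.0, 1.0, 0.0],
                [2.0, 3.0, 1.0],
                [12.0, 1.0, 0.0],
                [15.0, 4.0, 2.0]])
tiles = split_into_tiles(pts)
print(len(tiles))  # prints 2
```

Each tile would then be segmented on its own, which is where the border effect comes from: points near a tile edge never see the geometry on the other side.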

My strategy also consists of predicting small parts (spheres) of the point cloud, but the spheres are chosen with some overlap, and the process is repeated multiple times to get a vote for the best predictions. This is why it takes longer.
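The voting scheme can be sketched as follows. This is a simplified illustration of the idea, not the repo's actual accumulation code (the function name and arguments are hypothetical): per-point class scores from overlapping spheres are summed and averaged, and each point takes the class with the highest averaged score.

```python
import numpy as np

def vote_predictions(num_points, num_classes, sphere_indices, sphere_logits):
    """Average per-point class scores over overlapping sphere predictions.

    A minimal sketch of the voting idea: `sphere_indices[i]` holds the
    cloud indices of the points in sphere i, `sphere_logits[i]` the
    network scores for those points (shape [n_i, num_classes]).
    """
    scores = np.zeros((num_points, num_classes))
    counts = np.zeros(num_points)
    for idx, logits in zip(sphere_indices, sphere_logits):
        scores[idx] += logits   # accumulate votes
        counts[idx] += 1        # track how often each point was seen
    seen = counts > 0
    scores[seen] /= counts[seen, None]  # average over overlapping spheres
    return scores.argmax(axis=1), seen

# Toy example: 3 points, 2 classes, two overlapping spheres
labels, seen = vote_predictions(
    3, 2,
    [np.array([0, 1]), np.array([1, 2])],
    [np.array([[1.0, 0.0], [0.0, 1.0]]),
     np.array([[0.0, 1.0], [0.0, 1.0]])])
print(labels)  # [0 1 1]
```

The `seen` mask is also why the "potential" threshold discussed below matters: testing can stop once every point has been covered at least once, or keep going to accumulate more votes.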

You can try two easy things to speed up the process without changing my code too much:

  1. Stop the testing when the minimum potential value of 0.1 is reached (meaning every point in the cloud has been seen at least once). Here, change the if statement to `if new_min > 0.1:`: https://github.com/HuguesTHOMAS/KPConv-PyTorch/blob/e600c1667d085aeb5cf89d8dbe5a97aad4270d88/utils/tester.py#L302-L311 And here, just make this if statement always true with `if int(np.ceil(new_min)) % 1 == 0:`: https://github.com/HuguesTHOMAS/KPConv-PyTorch/blob/e600c1667d085aeb5cf89d8dbe5a97aad4270d88/utils/tester.py#L354-L355

  2. Use a larger subsampling size and the corresponding input sphere radius (keep the same ratio so that the network still uses a similar amount of GPU memory): https://github.com/HuguesTHOMAS/KPConv-PyTorch/blob/e600c1667d085aeb5cf89d8dbe5a97aad4270d88/train_S3DIS.py#L123-L127 You can try different values, but the higher you go, the fewer details the network has to make predictions with, so you will lose performance, especially on smaller objects.
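The two tweaks from point 1 can be illustrated in isolation. The helper names below are hypothetical; the real conditions live in `utils/tester.py` at the linked lines, where the surrounding context is more involved:

```python
import numpy as np

def should_stop(new_min, threshold=0.1):
    # Tweak 1: stop testing once every point has been seen at least once,
    # i.e. once the minimum potential exceeds the threshold.
    return new_min > threshold

def should_save(new_min):
    # Tweak 2: make the save condition always true, since
    # int(np.ceil(x)) % 1 == 0 holds for any value of x.
    return int(np.ceil(new_min)) % 1 == 0
```

With these two changes, the test loop writes results on every pass and exits as soon as full coverage is reached, instead of accumulating many extra voting passes.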
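The ratio logic from point 2 can be sketched like this. The default values are illustrative (check the linked config lines for the actual ones), and `scale_config` is a hypothetical helper, not part of the repo:

```python
def scale_config(first_subsampling_dl=0.03, in_radius=1.5, factor=2.0):
    """Scale the subsampling grid size while keeping the radius/dl ratio.

    A minimal sketch; the default values here are illustrative, not
    guaranteed to match train_S3DIS.py. Keeping the ratio constant keeps
    the number of points per input sphere (and thus GPU memory) similar.
    """
    ratio = in_radius / first_subsampling_dl
    new_dl = first_subsampling_dl * factor
    return new_dl, new_dl * ratio  # radius grows with dl, ratio preserved

# Doubling the grid size doubles the sphere radius as well
new_dl, new_radius = scale_config(0.03, 1.5, 2.0)
print(new_dl, new_radius)  # 0.06 3.0
```

Each sphere then covers more of the scene, so fewer passes are needed, at the cost of fine detail on small objects.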

maosuli commented 2 years ago

Many thanks for your quick reply.

Yeah, I have tried setting new_min = 0.1 as the stopping point for outputting the segmented point cloud. All the parts were predicted, but there were still some obvious errors. The prediction did not seem that good; maybe the model lacked enough voting passes to refine the prediction. As you said, downscaling the point cloud could also increase efficiency.

I will test the sensitivity for different experimental settings.

Thanks.