HuguesTHOMAS / KPConv-PyTorch

Kernel Point Convolution implemented in PyTorch
MIT License

Should calibration be re-computed for inference on new data? #253

Open floriandeboissieu opened 2 months ago

floriandeboissieu commented 2 months ago

Hi Hugues Thomas, many thanks for this PyTorch version of KPConv.

Applying a trained model to new data with a different point density using test_model.py recomputes the calibration if batch_limits.pkl and neighbor_limits.pkl are not found in the data directory. I am not sure whether that is the expected behavior, especially for neighbor_limits.

Would a different neighbor limit on the first layer have an influence on the result?

Or should I keep them (at least the neighbor limits, i.e. neighbor_limits.pkl) the same as for training?

After carefully reading the code and the issues, I was not able to find a clear answer to this.

HuguesTHOMAS commented 2 months ago

Hi @floriandeboissieu,

Good question. First, batch_limits has no impact on the results; I actually usually set the batch limit manually to 1 to force single-point-cloud inference. Neighbor limits, on the other hand, technically do have an impact, as they change the operations performed during network inference. However, the impact should be minimal, since the limits only come into play for a few edge-case neighborhoods. You can try comparing keeping the training limits versus updating them; it should not make much of a difference. Note, however, that point density is important information the network probably uses to understand the geometry, so training and testing on datasets with very different point densities is probably not a good idea and will likely lead to poor performance.
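For intuition, here is a minimal sketch of what a neighbor limit does, assuming neighborhoods come from a radius search as an (N, max_count) index array padded with a shadow index. The function name, array layout, and shadow-index convention are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

def truncate_neighbors(neighbor_inds, neighbor_limit, shadow_ind):
    """Cap each neighborhood at `neighbor_limit` indices.

    `neighbor_inds` is assumed to be an (N, max_count) array of neighbor
    indices, padded at the end of each row with `shadow_ind` for points
    that have fewer than max_count neighbors.
    """
    # Count true (non-padding) neighbors per point.
    counts = (neighbor_inds != shadow_ind).sum(axis=1)
    # Only neighborhoods denser than the limit lose anything, which is
    # why changing the limit only affects a few edge cases.
    n_affected = int((counts > neighbor_limit).sum())
    print(f"{n_affected} / {len(counts)} neighborhoods exceed the limit")
    return neighbor_inds[:, :neighbor_limit]
```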

floriandeboissieu commented 2 months ago

Thanks for your answer. The test case mentioned in my question actually had a lower density (10x lower) and a different point distribution than the training data. I tested both with newly computed .pkl files and with the training ones, and obtained similar results in both cases, with poor performance as you mentioned.

Since the density depends strongly on the distance to the scanner, this is a situation often found in real data. That leads me to another question (which may actually be linked to the fixed neighbor_limit, if I am right that a fixed limit covers a smaller distance in dense areas than in sparse ones): have you thought about changing (actually reducing) the density in the augmentation process, and would that make sense, in your experience, to make the segmentation more robust to density changes? (Even though KPConv already has mechanisms for that, e.g. subsampling with the barycenter, etc.)

Thanks again for your prompt reply.

HuguesTHOMAS commented 2 months ago

The test case mentioned in my question actually had a lower density (10x lower) and a different point distribution than the training data. I tested both with newly computed .pkl files and with the training ones, and obtained similar results in both cases, with poor performance as you mentioned.

This sounds logical.

Have you thought about changing (actually reducing) the density in the augmentation process, and would that make sense?

This would totally make sense. I did not really test it, as I did not have datasets with different training and test densities, but I think it should work. If you can simulate the density as a function of distance to a chosen center point, and use that as an augmentation, that should work very well. You can imagine each point having a probability of being dropped, with the probability chosen to match the desired density at that distance.
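A minimal sketch of such an augmentation, assuming a power-law density falloff with distance; the function name, the `alpha` exponent, and the `r0` reference radius are illustrative choices, not something from this repository:

```python
import numpy as np

def scanner_density_dropout(points, center, alpha=2.0, r0=2.0, rng=None):
    """Randomly drop points with a keep probability that decays with
    distance to `center`, mimicking a scanner's density falloff.

    Keep probability is 1 inside radius `r0` and decays as
    (r0 / d) ** alpha beyond it; both knobs should be tuned to match
    the density profile of the target scanner.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = np.linalg.norm(points - center, axis=1)
    keep_prob = np.minimum(1.0, (r0 / np.maximum(d, 1e-9)) ** alpha)
    mask = rng.random(len(points)) < keep_prob
    return points[mask], mask
```

Applied to each training sphere with a randomly drawn center, this should expose the network to densities closer to those of the sparser test scans.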

floriandeboissieu commented 2 months ago

Thanks again for your time and guidance. I'll give it a try and let you know whether it has an impact.

By the way, I read your latest publication on KPConvX, but the link https://github.com/apple/ml-kpconvx is no longer valid. Is there a place where I could find this repo to have a look at that new version of KPConv?

HuguesTHOMAS commented 1 week ago

Hi @floriandeboissieu, the code of KPConvX is now out if you want to try it.