torch-points3d / torch-points3d

PyTorch framework for doing deep learning on point clouds.
https://torch-points3d.readthedocs.io/en/latest/

Cannot Reproduce MinkowskiEngine Benchmark Results #750

Open dihuangdh opened 2 years ago

dihuangdh commented 2 years ago

Hi, @CCInc

I am trying to reproduce the SpatioTemporalSegmentation benchmark results on ScanNet. The mIoU on the validation set has a large gap from the original repo (<70 vs. 72).

Could you please give some instructions? My running command is

train.py task=segmentation models=segmentation/minkowski_baseline model_name=Res16UNet34C data=segmentation/scannet-sparse training=scannet_benchmark/minkowski

Thanks.

CCInc commented 2 years ago

Hi,

I would verify that the hyperparameters in the training (scannet_benchmark/minkowski) config are the same as those used in the original paper, especially the optimizer and learning rate. Also, you may have better results using a SparseConv3D model instead:

models=segmentation/SparseConv3d

Let me know if you find any difference in parameters!
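A quick way to carry out that check is to diff the two sets of hyperparameters programmatically once you've copied them out of the config and the paper. A minimal sketch (the keys and values below are illustrative placeholders, not the actual contents of either config):

```python
# Minimal sketch: diff two flat config dicts to spot hyperparameter mismatches.
# The keys and values here are hypothetical placeholders, not the real configs.

def diff_configs(ours, theirs):
    """Return {key: (our_value, their_value)} for every differing key."""
    keys = set(ours) | set(theirs)
    return {
        k: (ours.get(k), theirs.get(k))
        for k in keys
        if ours.get(k) != theirs.get(k)
    }

# Hypothetical values transcribed from the tp3d training config vs. the paper:
tp3d_cfg = {"optimizer": "SGD", "lr": 0.1, "grid_size": 0.05}
paper_cfg = {"optimizer": "SGD", "lr": 0.1, "grid_size": 0.02}

print(diff_configs(tp3d_cfg, paper_cfg))
```

Anything the diff reports is a candidate explanation for the mIoU gap.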

dihuangdh commented 2 years ago

Thanks, glad to see your quick reply.

For MinkowskiNet

Besides the training config, there are other differences between the torch-points3d implementation and the original version. For example, in the data config, I noticed that the grid size in scannet-sparse is 0.05, while the original repo uses 0.02.

I don't know how much these differences hurt the final results.

For SparseConv3d

As for the SparseConv3d model, the ScanNet benchmark shows MinkowskiNet performing slightly better than SparseConvNet, so I don't expect to get better results with SparseConv3d.

However, I will give it a try and check whether I can reproduce the original SparseConv3d results. This may help locate the reproduction problem with MinkowskiNet (if I can reproduce the original SparseConv3d results, the dataset part should be OK, right?).

Waiting for your confirmation.

CCInc commented 2 years ago

Hi, if you take a look at the original Minkowski paper (https://arxiv.org/pdf/1904.08755.pdf), you can see that the mIoU they reported at 5cm voxel size is ~67.9.


And the best result that the original authors of tp3d achieved was ~65.0 mIoU (https://arxiv.org/pdf/2010.04642.pdf).

Unfortunately, I have no clue why the performance from the original paper can't be replicated in our repo. SparseConv3d and Minkowski should be very similar in performance; they are essentially the same network. Let me know what mIoU you are getting.
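When comparing numbers across repos, it's also worth confirming both sides compute mIoU the same way. A minimal sketch of per-class IoU from a confusion matrix (plain Python; ignore-label handling and other repo-specific details omitted):

```python
def mean_iou(confusion):
    """confusion[i][j] = count of points with ground truth i predicted as j.
    Returns the mean over classes of TP / (TP + FP + FN)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # GT c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, GT other
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both GT and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Toy 2-class example:
conf = [[8, 2],
        [1, 9]]
print(mean_iou(conf))
```

Differences in how absent classes or ignored labels are handled can shift the reported number by a point or more.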

CCInc commented 2 years ago

By the way, I am running benchmarks on S3DIS, so we should be able to see whether the S3DIS results are close to the original paper's (around 65 mIoU at 5cm).

dihuangdh commented 2 years ago

Hi, thanks for your clear explanations.

Your answer reminds me that since I ran tp3d directly, my results should be compared with 67.9, not 73.4.

I also find that the current version of tp3d does a little better than the original tp3d paper: 66.2 vs. 65.0. Perhaps there have been some improvements since the initial tp3d release.

CCInc commented 2 years ago

That would make sense; I think there were some improvements made to the model since the last release. In any case, keep me updated on how it goes!

filaPro commented 2 years ago

Hi @dihuangdh ,

Were you able to reproduce 73.4 on ScanNet with a voxel size of 2cm? Btw, it looks like even the author of MinkowskiEngine cannot reproduce it with the latest version of MinkowskiEngine, per his message here.