yunshin / SphericalMask

Official implementation of "Spherical Mask: Coarse-to-Fine 3D Point Cloud Instance Segmentation with Spherical Representation"
Apache License 2.0

The training process of SphericalMask. #1

Closed RongkunYang closed 6 months ago

RongkunYang commented 6 months ago

Hello, dear authors, thank you for your nice work! I have tried to train the network, adopting the ISBNet encoder weights and using the preprocessed data from ISBNet. However, I cannot reach the reported 61 mAP: my training result is 0.545 / 0.715 / 0.798 for mAP, AP50, and AP25. May I ask whether there are some training details I missed?

Thank you!

yunshin commented 6 months ago

Thanks for your interest. What is your training environment (i.e., torch and CUDA versions)? My training/testing environment uses torch 1.12.1 and CUDA 11.3. Also, can you confirm that you can reproduce the testing result (62.6 mAP) when you use the provided checkpoint (spherical_mask_155.pth)?

yunshin commented 6 months ago

I will also run the training to confirm again and let you know.

RongkunYang commented 6 months ago

OK, thank you for your fast response. I use the same environment as yours.

I notice that the config file has an item "pretrain_decoder". Is this item important? Does it also need the ISBNet weights?

yunshin commented 6 months ago

Yes, both "pretrain_encoder" and "pretrain_decoder" currently have to point to the same path, as in "spherical_mask.yaml". That is a good point. I will change this in the code and commit to avoid confusion.
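For reference, the relevant lines of "spherical_mask.yaml" would then look something like this (the path below is illustrative, not the actual filename shipped with the repo):

```yaml
# Both keys currently have to point to the same pre-trained checkpoint
# (illustrative path; substitute your own download location).
pretrain_encoder: ./pretrains/pretrain_backbone.pth
pretrain_decoder: ./pretrains/pretrain_backbone.pth
```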

RongkunYang commented 6 months ago

Oh, I just read the checkpoint loader. The pretrained weights in this repo seem different from the ISBNet pretrained weights: the ones in this repo have the same parameter sizes as the Spherical Mask model. May I ask how the pretrained weights were trained? Were the semantic loss, offset loss, and box loss used to train the backbone, as in ISBNet?

RongkunYang commented 6 months ago

In the training code, the gradient accumulation is set to 16. Does the gradient accumulation affect the performance?

yunshin commented 6 months ago

Thanks for your comment.

1) The pre-trained weight was from the initial release of ISBNet with 2 dynamic convolution layers (I noticed that the current ISBNet release uses 3 dynamic convolution layers). We stopped ISBNet training when the model reached its best AP, around 56, and used it as the pre-trained weight for Spherical Mask. When I uploaded the pre-trained weight, I think I used a Spherical Mask script somewhere to load the weight and saved it by mistake, which is why some modules for Spherical Mask are included. I will remove them and upload the weight again to avoid confusion. In case you are wondering, we use the pre-trained weight because training became unstable (exploding gradients), causing the decoder outputs to become NaN, when initialized without it. Interestingly, we found a similar issue in other methods using similar U-Net-based backbone architectures.
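Until the cleaned-up weight is uploaded, a common way to cope with extra modules in a checkpoint is to load only the entries whose names and shapes match the target model. This is a hypothetical helper (not the repo's actual loader, and `load_pretrained_backbone` is my own name), just to sketch the idea:

```python
import torch


def load_pretrained_backbone(model, ckpt_path):
    """Load only the checkpoint entries whose names and shapes match `model`.

    Hypothetical helper: extra modules saved in the checkpoint by mistake
    (e.g. Spherical Mask heads) are simply skipped rather than causing a
    strict-loading error.
    """
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # Some checkpoints wrap the weights in a top-level key such as "net".
    state = ckpt.get("net", ckpt)
    model_state = model.state_dict()
    filtered = {
        k: v for k, v in state.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    model_state.update(filtered)
    model.load_state_dict(model_state)
    # Report which checkpoint entries were skipped.
    return sorted(set(state) - set(filtered))
```

The returned list makes it easy to verify that only the unexpected modules were dropped.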

2) Yes, the gradient accumulation helps boost the performance slightly in our experiments.

I will update the code based on your feedback this weekend!

RongkunYang commented 6 months ago

OK, thank you for the enthusiastic answer!

yunshin commented 6 months ago

I just updated the code and confirmed that I could reproduce the result.

RongkunYang commented 6 months ago

Yes, the code works well, and I also reproduced the result. The first time I trained, I made a mistake in the checkpoint loading. Thank you for your enthusiastic answers.

frankkim1108 commented 6 months ago

Hello, @RongkunYang and @yunshin

I read this closed issue, and it seems you have successfully reproduced the experiment. I would like to ask you some questions because I'm having some trouble reproducing it.

1. At the beginning of this issue you said:

   > However, I can not achieve the performance of the mAP 61, the performance of my training result is 0.545 0.715 0.798 for mAP, AP50 and AP25

   Are you talking about performance with only the train set, or the train+val set?

2. Have you tried reproducing on the test set?

3. I have successfully reproduced the result with the validation set. However, I'm having some trouble reproducing with the test set. Could you kindly give me some advice or some help?

Best regards, Jongwook Kim

RongkunYang commented 6 months ago

Hello, @frankkim1108, I haven't tried to reproduce the test-set result; the results we discussed above are from training on the training set and validating on the validation set. Could you describe your process for reproducing the result on the test set in detail?

frankkim1108 commented 6 months ago

Hello, @RongkunYang. Thank you for your quick answer.

We pretrained the train-val benchmark backbone model provided by ISBNet and used it to train Spherical Mask on the train+val dataset.

The test score result is about 55. (Similar to the ISBNet results)

Dear @yunshin, may I ask whether I have overlooked some training details? I've been trying everything to reproduce your result. Your advice would be very helpful.

Thank you.

RongkunYang commented 6 months ago

OK, the training process seems to have no problem. I will also try to reproduce the test result, and we will wait for the author to share more details about the test process.

peoplelu commented 3 months ago

@frankkim1108 Did you reproduce the result on the test set? We also cannot reproduce it.