hzxie / GRNet

The official implementation of "GRNet: Gridding Residual Network for Dense Point Cloud Completion". (Xie et al., ECCV 2020)
https://haozhexie.com/project/grnet
MIT License

Cannot reproduce the results reported in the Paper (CD=2.723) #8

Open AlphaPav opened 4 years ago

AlphaPav commented 4 years ago

> You need to train the whole network with Chamfer Distance. It reaches CD ~0.40 on ShapeNet. Then, you need to fine-tune the network with Gridding Loss + Chamfer Distance on the Coarse Point Cloud. Finally, you fine-tune the network with Chamfer Distance. Chamfer Distance is taken as a metric, therefore, you cannot get lower CD without using Chamfer Distance as a loss.

Originally posted by @hzxie in https://github.com/hzxie/GRNet/issues/3#issuecomment-656446671
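
To make the three stages concrete, here is a minimal sketch of the loss used in each stage. The `chamfer_dist` and `gridding_loss` callables stand in for the CD and Gridding Loss modules from the repository; this only illustrates the quoted schedule and is not the exact training code.

```python
# Minimal sketch of the three-stage training schedule quoted above.
# `chamfer_dist` / `gridding_loss` are assumed to be the CD and Gridding
# Loss modules from the GRNet repository, passed in as callables here so
# the snippet stays self-contained.

def combined_loss(stage, sparse_ptcloud, dense_ptcloud, gt_cloud,
                  chamfer_dist, gridding_loss):
    """Return the training loss for stage 1, 2, or 3 of the schedule."""
    sparse_cd = chamfer_dist(sparse_ptcloud, gt_cloud)
    dense_cd = chamfer_dist(dense_ptcloud, gt_cloud)
    if stage == 1:    # stage 1: train from scratch with Chamfer Distance only
        return sparse_cd + dense_cd
    if stage == 2:    # stage 2: fine-tune with CD + Gridding Loss on the coarse (sparse) cloud
        return sparse_cd + dense_cd + gridding_loss(sparse_ptcloud, gt_cloud)
    return sparse_cd + dense_cd   # stage 3: fine-tune again with CD only
```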

hzxie commented 4 years ago

So, what's the problem you are facing now? Please provide more details.

AlphaPav commented 4 years ago

Hi author, thanks for the amazing work.

With your released pre-trained model, I can get a 0.7082 F-score and 2.722 CD. However, when I train from scratch, I run into the problems listed below:

"You need to train the whole network with Chamfer Distance." --- It reaches 4.588 CD, 0.6133 F-score, which is similar with Table 7&Not Used&CD&Complete = 4.460 in your paper.

"Then .. fine-tune the network with Gridding Loss + Chamfer Distance on the Coarse Point Cloud." ---- It reaches 4.536 CD, 0.6255 F-score. It was supposed to be about ~2.7, right?

```python
sparse_loss = chamfer_dist(sparse_ptcloud, data['gtcloud'])
dense_loss = chamfer_dist(dense_ptcloud, data['gtcloud'])
grid_loss = gridding_loss(sparse_ptcloud, data['gtcloud'])
_loss = sparse_loss + dense_loss + grid_loss
```

```python
C.NETWORK.GRIDDING_LOSS_SCALES = [128]
C.NETWORK.GRIDDING_LOSS_ALPHAS = [0.1]
```
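
For context, these two config entries are presumably what parameterize the Gridding Loss module. The sketch below shows the assumed construction; the import path and argument names mirror how the loss appears to be built in the repository, but treat them as unverified.

```python
# Assumed import path for the Gridding Loss extension shipped with GRNet.
from extensions.gridding_loss import GriddingLoss

# One grid scale of 128^3 voxels, weighted by 0.1 in the total loss,
# matching GRIDDING_LOSS_SCALES / GRIDDING_LOSS_ALPHAS above.
gridding_loss = GriddingLoss(scales=[128], alphas=[0.1])
```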

"Finally, you fine-tune the network with Chamfer Distance." --- the CD didn't decrease below 4.536.

I'm wondering in which steps I'm making mistakes (e.g., the learning rate or the loss weight of the Gridding Loss)?

AlphaPav commented 4 years ago

Your processed ShapeNet dataset has 28974 training samples, while the PCN dataset has 231792 training samples.

Is it because the provided dataset is incomplete?

hzxie commented 4 years ago

@AlphaPav Sorry for the late reply. I don't have time to check this issue these days, but I'm sure that there is nothing wrong with the released dataset. 231792 / 28974 = 8, which indicates that there are 8 partial input point clouds for each model in ShapeNet.

AlphaPav commented 4 years ago

> @AlphaPav Sorry for the late reply. I don't have time to check this issue these days, but I'm sure that there is nothing wrong with the released dataset. 231792 / 28974 = 8, which indicates that there are 8 partial input point clouds for each model in ShapeNet.

The PCN dataset is about 48 GB, while the released dataset is about 10 GB. Do you mean that you randomly augment each point cloud 8 times during training?

hzxie commented 4 years ago

No. I think the difference may be caused by different compression ratios. You can also generate the ShapeNet dataset from PCN with this script.

SarahChane98 commented 4 years ago

Hi! I also cannot reproduce the results. The best CD I got after training three times was 5.2. May I know how many epochs you trained for each round (i.e., CD only, CD + Gridding Loss, CD only)?

hzxie commented 4 years ago

@SarahChane98 I cannot report the exact number of epochs for each round. For each round, I train several times until the loss no longer decreases. Try to fine-tune the network again with the previous weights (from the last training run).

paulwong16 commented 4 years ago

Hi there, I just tested your pretrained model on the test set, and the result is close to the value reported in the paper. However, when I tested on the validation set, it reported a dense CD of around 7.177. I was wondering why there is such a huge gap between the CDs on the validation set and the test set?


The pretrained model also reports a dense CD of around 5.087 on the training set (which should match the training dense loss, if I understand correctly).

hzxie commented 4 years ago

@paulwong16 If the reported results are correct, one possible reason the pretrained model performs worse on the validation and training sets is that we choose the best model based on the test set rather than the validation or training set.

paulwong16 commented 4 years ago

> @paulwong16 Because we choose the best model for the test set instead of the validation set.

But why would the CD on the test set be even much lower than on the training set?

hzxie commented 4 years ago

@paulwong16 Because the pretrained model is the one that best fits the distribution of the test set. The distributions of the training and validation sets may differ from that of the test set.

paulwong16 commented 4 years ago

> @paulwong16 Because the pretrained model is the one that best fits the distribution of the test set. The distributions of the training and validation sets may differ from that of the test set.

Well... I believe the best model should not be chosen according to the test result (it should be chosen according to the validation result). From the best results I could reproduce, the training loss was a little lower than the validation and test losses, and the test loss was close to the validation loss.

Anyway, thanks for your kind reply. I will try to reproduce the result.

hzxie commented 4 years ago

@paulwong16 Yes, choosing models based on the test set is not a good option. For the Completion3D benchmark, the best model is chosen based on the validation set (because we don't have the ground truth for the test set).

wangyida commented 3 years ago

@hzxie Hi, I'm wondering how you incorporate gridding_loss in training? I haven't found it in the training script. Thanks!

hzxie commented 3 years ago

@wangyida

You can use the Gridding Loss here:

https://github.com/hzxie/GRNet/blob/335259235804fa30b0e89e0e257d886687bfb6f3/core/train.py#L113

when fine-tuning the network.

wangyida commented 3 years ago

@hzxie Thank you, I tried it out and the result seems to fit the expected trend. Thanks for your inspiring work ;)

Lillian9707 commented 3 years ago

Hi, I'm wondering how to fine-tune the network with the previous weights. I've tried the same configuration as in your paper, but the best model gets CD=4.538 and F-Score=6.206, while your pre-trained model gets CD=2.723 and F-Score=7.082.

I also checked the log and found that the network had already converged within 20 epochs. Why did you set 150 epochs as the default?

hzxie commented 3 years ago

@Lillian9707

In my experiments, the loss will continue to decrease after 20 epochs. Moreover, you need to fine-tune the network with the Gridding Loss.

Lillian9707 commented 3 years ago

Hi, I still cannot reproduce the result. Can you provide more details?

I've tried fine-tuning the network with the Gridding Loss and a lower learning rate, but the CD and F-score got worse.

hzxie commented 3 years ago

@Lillian9707 Keep the learning rate unchanged during fine-tuning. According to AlphaPav's experimental results, the CD and F-Score improved after applying the Gridding Loss.

Lillian9707 commented 3 years ago

Thank you for your reply! But AlphaPav only gets '4.536 CD, 0.6255 F-score' after fine-tuning, which looks more like random variation. So the fine-tuning process is to train with CD (weight 1) on both the sparse and dense point clouds plus Gridding Loss (weight 1) on the sparse point cloud? And the learning rate is always 5e-5?

Lillian9707 commented 3 years ago

Hi, sorry to bother you. I still cannot reproduce the results in the paper. I have tried fine-tuning the network several times, including using lr=5e-5, 1e-5, 1e-6 and MultiStepLR, and training with CD + Gridding Loss on the sparse or dense cloud, and so on. But the results are always around CD=4.5 and F-Score=6.2. Can you provide more details about fine-tuning?

hzxie commented 3 years ago

@Lillian9707

Try to fine-tune the network w/ and w/o Gridding Loss several times. During fine-tuning, load one of the top-10 (not always the best) checkpoints from the previous training run. The initial learning rate for fine-tuning should be 1e-4.
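
Putting that advice together, a fine-tuning run would roughly start like the sketch below. The checkpoint key name and the optimizer choice are assumptions for illustration, not the exact code in `core/train.py`.

```python
import torch

def setup_finetuning(grnet, ckpt_path):
    """Load weights from a previous training round and build the fine-tuning optimizer.

    `grnet` is the GRNet model instance; `ckpt_path` points at one of the
    top-10 checkpoints from the last run (not necessarily the single best one).
    The checkpoint key 'grnet' is an assumption based on common PyTorch
    practice, not verified against the repository.
    """
    checkpoint = torch.load(ckpt_path)
    grnet.load_state_dict(checkpoint['grnet'])
    # Initial learning rate of 1e-4 for fine-tuning, per the reply above.
    optimizer = torch.optim.Adam(grnet.parameters(), lr=1e-4)
    return optimizer
```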