qianqianwang68 / caps


Reproducing caps from scratch #12

Closed (gleefe1995 closed this issue 2 years ago)

gleefe1995 commented 2 years ago

[image: output-caps]

Hi, I tried to reproduce CAPS from scratch, but I could not reproduce your results.

I didn't change your config file except for setting --pretrained=0, and I trained for 200,000 iterations.

Did I do something wrong?

Thank you,

qianqianwang68 commented 2 years ago

Hi,

Are you using the provided conda environment? If not, you could try switching to it, because batchnorm in newer PyTorch versions has a different initialization strategy, which can hurt training (this has happened to other people before).

Also, could you check if the loss decreased during training?
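One quick way to check this is to compare the average of each logged loss early in training against its average near the end. The snippet below is only a minimal sketch: the function name `loss_trend`, the `window` and `min_rel_drop` parameters, and the numbers in `logs` are all hypothetical and are not part of the CAPS code. Replace the lists with values read from your own training log.

```python
from statistics import mean


def loss_trend(values, window=100, min_rel_drop=0.05):
    """Compare the early and late averages of a logged loss curve.

    Returns (early_mean, late_mean, decreased), where `decreased` is True
    only if the late average is at least `min_rel_drop` (relative) below
    the early average.
    """
    if len(values) < 2:
        raise ValueError("need at least two logged values")
    # Never let the two windows overlap on short logs.
    w = max(1, min(window, len(values) // 2))
    early = mean(values[:w])
    late = mean(values[-w:])
    decreased = late < early * (1.0 - min_rel_drop)
    return early, late, decreased


# Hypothetical logged values (illustration only, not real CAPS results).
logs = {
    "total_loss": [2.0, 1.6, 1.3, 1.1, 0.9, 0.8],
    "coarse_cycle_loss": [0.50, 0.52, 0.49, 0.51, 0.50, 0.52],
}

for name, values in logs.items():
    early, late, decreased = loss_trend(values, window=2)
    print(f"{name}: early={early:.3f} late={late:.3f} decreased={decreased}")
```

Running the check per loss term, rather than only on the total, makes it easier to spot a single term that stays flat while the total still goes down.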

gleefe1995 commented 2 years ago

[three images attached]

1) Yes. I used the provided conda environment, PyTorch=1.0.1.

2) I checked the images and the losses. The images look like the model trained well, and the total loss and the other losses decreased, except for the coarse cycle loss. Is this normal? I will check whether the coarse cycle loss is being trained properly.

Thank you.

qianqianwang68 commented 2 years ago

Oh, I see the problem. By setting pretrained=0 I suppose you were trying to train the method from scratch, but this argument actually controls whether the ResNet encoder loads the ImageNet-pretrained weights (see https://github.com/qianqianwang68/caps/blob/master/config.py#L38 ). In our experiments, we use ImageNet-pretrained weights to initialize the ResNet encoder, so there is no need to turn this off if you want to reproduce our results. However, even without ImageNet-pretrained weights, your result still doesn't seem to match ours; see "Ours from scratch" in Fig. 7 of https://arxiv.org/pdf/2004.13324.pdf . I'm not exactly sure what causes this. Maybe you can first try training without pretrained=0, and if there is still an issue, let me know and I can look into it further.
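To make the meaning of the flag concrete, here is a rough sketch of how an integer `--pretrained` option of this kind can be wired up with `argparse`. It is an illustration only and not the actual contents of config.py: the helper names `build_parser` and `describe_encoder_init`, the help text, and the default of 1 are assumptions based on the description above.

```python
import argparse


def build_parser():
    """Build a tiny parser with a single illustrative flag."""
    parser = argparse.ArgumentParser(description="illustrative config sketch")
    # Assumption: 1 = initialize the encoder from ImageNet weights (default),
    #             0 = initialize the encoder randomly.
    parser.add_argument(
        "--pretrained",
        type=int,
        default=1,
        choices=[0, 1],
        help="whether the ResNet encoder loads ImageNet-pretrained weights",
    )
    return parser


def describe_encoder_init(args):
    """Explain what the flag changes: only the encoder initialization."""
    if args.pretrained:
        return "ResNet encoder initialized from ImageNet-pretrained weights"
    return "ResNet encoder initialized randomly"


# Parse explicit argument lists so the sketch runs without a command line.
default_args = build_parser().parse_args([])
scratch_args = build_parser().parse_args(["--pretrained=0"])

print("default:       ", describe_encoder_init(default_args))
print("--pretrained=0:", describe_encoder_init(scratch_args))
```

In both cases the method itself is still trained from the start; the flag only changes how the encoder weights are initialized.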

gleefe1995 commented 2 years ago

With pretrained=1, the results are quite reasonable, so maybe there is another problem. I don't know what causes it yet, but I will let you know if I find out. Thank you for the answer!