weiyunfei closed this issue 4 years ago
Hi,
Thanks for your interest in our work! I think it is because you are using an environment that differs from our configuration.
The code is tested with PyTorch 0.4 and Python 2.7, along with the other libraries stated in the README file and requirement.txt. It is not written for Python 3. I suggest you try the matching environment setting.
Thanks for your reply! I'll try it today. By the way, could you tell me which version of CUDA you used?
I have fixed it. Something went wrong when I converted the code to Python 3 style. Thanks again for your reply. I will close this issue.
Hi, Kunpeng. Sorry to bother you again. I ran into some problems when trying to train a model on Flickr30K. I'm not sure whether I should use the same settings as on MS-COCO. I used the same settings, but I have not gotten results as good as yours. Could you tell me the settings you used when training on the Flickr30K dataset, e.g. number of epochs, learning rate, batch size? I want to use the same settings as yours to reproduce your results.
@weiyunfei Hi Yunfei. I also encountered this problem: my performance is lower than that reported in the paper. I suspect something is wrong with my modified Python 3 style code. Would you please share more details about your problem?
Well, in fact I have not found the root cause. My final environment is Python 2.7 and PyTorch 1.2, and I found that only the Python version affects the results. I therefore suspect there may be some difference in the division operation between Python 2 and Python 3, but I have not found where it is. Perhaps you could install an environment like mine. Here is my conda environment YAML, which you can import directly: https://drive.google.com/file/d/1hSY9M2MgQK95pw0TPFQi6DG3cXyLyG-v/view?usp=sharing
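To illustrate the suspected culprit: in Python 2, `/` between two integers is floor division, while in Python 3 it is true division, so any ported integer arithmetic (e.g. index or size computations) can silently change value. A minimal sketch:

```python
from __future__ import division  # in Python 2, makes "/" behave like Python 3

# In Python 3, "/" is always true division and "//" is floor division.
print(7 / 2)   # 3.5  (plain Python 2 would give 3 for two ints)
print(7 // 2)  # 3    (reproduces Python 2's integer "/" behaviour)
```

A Python 2 to 3 port therefore has to replace `/` with `//` wherever an integer result is expected, which is easy to miss.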
@weiyunfei Sorry to disturb you; what are your final reproduced results on the MS-COCO dataset? My test results are image-to-text R@1 68.5 and text-to-image R@1 57.5, but the author's single-model results are: model_coco_1.pth.tar image-to-text R@1 74.0, text-to-image R@1 60.8, rsum 509.4 ... model_coco_2.pth.tar image-to-text R@1 73.6, text-to-image R@1 60.7, rsum 508.3. There is still a large gap. Do I need to adjust the namespace parameters, or increase the number of training epochs? I am already using the same environment as the author, e.g. Python 2, PyTorch 0.4.1, and so on.
@KunpengLi1994
Hello, I got the results stated in Kunpeng's paper with the pretrained model he provided. I suggest you check your code and environment carefully. Your results look similar to the results I got when I used Python 3. Or you could try the environment I provided earlier in this issue.
Thank you very much, I will try it.
Thanks a lot. I have obtained the same results as in the paper with the pretrained model. By the way, my PyTorch version is 1.0.1 and my Python is 2.7, so the previous poor results were caused by my Python version (3.5).
@weiyunfei Hi Yunfei. I also encountered this problem: my performance on Flickr30K is much lower than that given in the paper (61 R@1 for text retrieval and 48 R@1 for image retrieval). Did you get similar results to mine? Have you solved this problem?
Hi Yunfei,
Sorry for the late reply, due to deadlines on other projects. Actually, we reorganized our code when preparing the camera-ready version and at that time only retrained models on MS-COCO, which is our main focus.
For training on Flickr30K, we usually add a Batch Normalization layer (refer to this line) to make training more stable on this small dataset. I have updated the model.py code and provided the new pretrained models here. They should achieve 71.5 R@1 for i2t (image-to-text) and 54.8 R@1 for t2i (text-to-image).
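For intuition, batch normalization standardizes each embedding dimension over the mini-batch, which damps the noisy statistics a small dataset like Flickr30K produces. A minimal NumPy sketch of the forward pass (the real layer, e.g. PyTorch's nn.BatchNorm1d, also learns a scale gamma and shift beta and tracks running statistics, all omitted here; names are illustrative, not from the repo):

```python
import numpy as np

def batch_norm(features, eps=1e-5):
    """Standardize each feature dimension to zero mean and unit
    variance across the batch (learnable gamma/beta omitted)."""
    mean = features.mean(axis=0)
    var = features.var(axis=0)
    return (features - mean) / np.sqrt(var + eps)

# A toy batch of 4 embeddings with 3 dimensions, deliberately off-scale.
x = np.random.RandomState(0).randn(4, 3) * 10 + 5
y = batch_norm(x)
print(y.mean(axis=0))  # each entry ~ 0 after normalization
```

With small batches from a small dataset, per-batch feature statistics drift a lot; normalizing them keeps the embedding scale consistent across iterations.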
Hi, Kunpeng. Thanks for your reply. I will try this new code again.
Hi, Kunpeng. Thanks for your excellent paper and code, but I ran into something confusing when evaluating models trained with your code. I trained two models, and my results do not reach those stated in your paper. Moreover, the pretrained models you provided do not achieve the paper's results on MS-COCO either. My environment has PyTorch 1.3 and Python 3.7 installed. I compared my training log with yours and they are very similar, so I don't know what's wrong with my code. The evaluation result on the MS-COCO 1K test set is as follows. Would you please give me some advice? I would greatly appreciate it.