
Deep Cross-Modal Projection Learning for Image-Text Matching

Why not ResNet #9

Open FangmingZhou opened 4 years ago

FangmingZhou commented 4 years ago

Notice that the results in the paper 'Deep Cross-Modal Projection Learning for Image-Text Matching' are {top-1 = 49.37%, top-10 = 79.27%}, while the results in this project are {top-1 = 42.999%, top-10 = 67.869%}, which come from a model based on MobileNet. So why not provide a new version based on ResNet? ^^ It would be greatly helpful for us beginners! Thanks a lot!
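
For anyone who wants to try this themselves, here is a minimal sketch of what a ResNet-50 image branch could look like with torchvision. The class name `ResNetImageEncoder` and the `embed_dim` value are assumptions for illustration, not this repo's actual code:

```python
import torch
import torch.nn as nn
import torchvision.models as models

class ResNetImageEncoder(nn.Module):
    """Hypothetical ResNet-50 image branch producing a fixed-size embedding."""
    def __init__(self, embed_dim=512):
        super().__init__()
        backbone = models.resnet50(pretrained=True)
        # Keep everything up to the global average pool; drop the ImageNet classifier.
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])
        self.fc = nn.Linear(2048, embed_dim)

    def forward(self, images):
        feats = self.backbone(images)      # (B, 2048, 1, 1)
        feats = torch.flatten(feats, 1)    # (B, 2048)
        return self.fc(feats)              # (B, embed_dim) image embedding
```

In principle only the image feature dimension fed into the joint embedding changes; the text branch and the losses could stay the same.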

wxh001qq commented 4 years ago

Hello, are your results based on CUHK-PEDES?

FangmingZhou commented 4 years ago

Hello, are your results based on CUHK-PEDES?

yes

wxh001qq commented 4 years ago

Hello, are your results based on CUHK-PEDES?

yes

I found that whether you use nn.DataParallel() strongly influences the result: without nn.DataParallel() I got about {top-1 = 31%, top-10 = 55%}, while with it I got about {top-1 = 42%, top-10 = 67%}.
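
For reference, nn.DataParallel replicates the model on each visible GPU and splits every input batch along dim 0, so the per-GPU batch size (and therefore the per-GPU BatchNorm statistics) differ from single-GPU training, which might explain part of the gap. A minimal sketch of how it is usually enabled, using a placeholder model rather than this repo's network:

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the actual image-text network.
model = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 512))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if torch.cuda.device_count() > 1:
    # Replicates the model on every visible GPU and scatters each batch
    # along dim 0; gradients are accumulated back onto the default device.
    model = nn.DataParallel(model)
model = model.to(device)

dummy_batch = torch.randn(64, 2048, device=device)
out = model(dummy_batch)   # the forward call is unchanged with or without the wrapper
print(out.shape)           # torch.Size([64, 512])
```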

FangmingZhou commented 4 years ago

Hello, are your results based on CUHK-PEDES?

yes

I found that whether you use nn.DataParallel() strongly influences the result: without nn.DataParallel() I got about {top-1 = 31%, top-10 = 55%}, while with it I got about {top-1 = 42%, top-10 = 67%}.

I haven't used that parallel method, and I haven't seen anything online saying it would change the results. You may have to ask someone else.