cshizhe / hgr_v2t

Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".
MIT License
209 stars, 21 forks

Different runs give different scores #9

Closed: zqlearning closed this issue 3 years ago

zqlearning commented 4 years ago

Hi cshizhe, thanks for your great work. When testing on the MSRVTT dataset, I found that the final performance is the same across test runs, but the sent_scores, verb_scores and noun_scores differ from run to run. I don't know why.

Here are some outputs from one test run:

.......
tensor(-197.5491, device='cuda:0') tensor(4066.6943, device='cuda:0') tensor(4957.7461, device='cuda:0')
tensor(-172.1141, device='cuda:0') tensor(4193.5151, device='cuda:0') tensor(5157.7603, device='cuda:0')
tensor(-68.0737, device='cuda:0') tensor(1171.2622, device='cuda:0') tensor(1342.9297, device='cuda:0')
tensor(82.5919, device='cuda:0') tensor(4531.4185, device='cuda:0') tensor(5212.8369, device='cuda:0')
tensor(-43.9712, device='cuda:0') tensor(4319.0312, device='cuda:0') tensor(5150.5146, device='cuda:0')
tensor(1.5257, device='cuda:0') tensor(4386.4746, device='cuda:0') tensor(5333.5151, device='cuda:0')
tensor(-22.8292, device='cuda:0') tensor(1247.3308, device='cuda:0') tensor(1393.1257, device='cuda:0')
tensor(23.0804, device='cuda:0') tensor(1473.0065, device='cuda:0') tensor(1647.1292, device='cuda:0')
tensor(-31.6811, device='cuda:0') tensor(1406.5350, device='cuda:0') tensor(1616.0713, device='cuda:0')
tensor(-41.8293, device='cuda:0') tensor(1422.7487, device='cuda:0') tensor(1656.0972, device='cuda:0')
tensor(-10.5121, device='cuda:0') tensor(397.1695, device='cuda:0') tensor(444.0505, device='cuda:0')
ir1,ir5,ir10,imedr,imeanr,imAP,cr1,cr5,cr10,cmedr,cmeanr,cmAP,rsum
ir5-rsum,epoch.28.th,22.89,51.07,63.17,5.00,40.16,36.14,22.30,51.10,62.90,5.00,39.20,35.62,273.43

And from another run:

........
tensor(-89.9776, device='cuda:0') tensor(4095.6599, device='cuda:0') tensor(5116.2510, device='cuda:0')
tensor(-145.8661, device='cuda:0') tensor(4161.9165, device='cuda:0') tensor(5351.6670, device='cuda:0')
tensor(-40.3292, device='cuda:0') tensor(1177.1305, device='cuda:0') tensor(1314.6021, device='cuda:0')
tensor(-58.3337, device='cuda:0') tensor(4536.5352, device='cuda:0') tensor(4928.3350, device='cuda:0')
tensor(35.2728, device='cuda:0') tensor(4343.3838, device='cuda:0') tensor(5280.2969, device='cuda:0')
tensor(2.8130, device='cuda:0') tensor(4361.0112, device='cuda:0') tensor(5508.0010, device='cuda:0')
tensor(37.5651, device='cuda:0') tensor(1243.3253, device='cuda:0') tensor(1373.3599, device='cuda:0')
tensor(-25.2279, device='cuda:0') tensor(1490.6547, device='cuda:0') tensor(1566.4670, device='cuda:0')
tensor(7.1009, device='cuda:0') tensor(1408.7480, device='cuda:0') tensor(1670.6154, device='cuda:0')
tensor(-34.9750, device='cuda:0') tensor(1403.9734, device='cuda:0') tensor(1701.3884, device='cuda:0')
tensor(-7.8403, device='cuda:0') tensor(396.0836, device='cuda:0') tensor(424.8773, device='cuda:0')
ir1,ir5,ir10,imedr,imeanr,imAP,cr1,cr5,cr10,cmedr,cmeanr,cmAP,rsum
ir5-rsum,epoch.28.th,22.89,51.07,63.17,5.00,40.16,36.14,22.30,51.10,62.90,5.00,39.20,35.62,273.43

cshizhe commented 4 years ago

Hi,

The predictions are the same across runs. I think the problem you encountered is caused by the names being in a different order in each run. The output I save is a dict of {vid_names: list, cap_names: list, scores: numpy.array}. Because the names are shuffled between runs, the scores need to be re-organized by name before the outputs of two runs can be compared.
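
For anyone comparing saved outputs from two runs, here is a minimal sketch of the re-organization described above. It is not code from this repository. It assumes that `scores` is a 2-D matrix whose rows follow `vid_names` and whose columns follow `cap_names`; the function name `align_scores` and the toy data are hypothetical. It indexes with `scores[i][j]`, so it should work on nested lists as well as on a 2-D numpy.array.

```python
def align_scores(output, ref_vid_names, ref_cap_names):
    """Return the scores of `output` reordered to follow the reference name order.

    Assumption (not confirmed by the repository): output["scores"][i][j] is the
    score between output["vid_names"][i] and output["cap_names"][j].
    """
    # Map each name to its row / column position in this run's output.
    vid_pos = {name: i for i, name in enumerate(output["vid_names"])}
    cap_pos = {name: j for j, name in enumerate(output["cap_names"])}
    scores = output["scores"]
    # Rebuild the matrix row by row in the reference order.
    return [
        [scores[vid_pos[v]][cap_pos[c]] for c in ref_cap_names]
        for v in ref_vid_names
    ]


# Toy data: run_b holds the same scores as run_a, but with shuffled names.
run_a = {
    "vid_names": ["v1", "v2", "v3"],
    "cap_names": ["c1", "c2"],
    "scores": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
}
run_b = {
    "vid_names": ["v3", "v1", "v2"],
    "cap_names": ["c2", "c1"],
    "scores": [[6.0, 5.0], [2.0, 1.0], [4.0, 3.0]],
}

aligned_b = align_scores(run_b, run_a["vid_names"], run_a["cap_names"])
raw_equal = run_b["scores"] == run_a["scores"]   # compared position by position
aligned_equal = aligned_b == run_a["scores"]     # compared name by name
```

In this toy case the raw matrices differ position by position, while the name-aligned matrices match. With real floating-point outputs, a tolerance-based comparison such as `numpy.allclose` is probably a better choice than exact equality.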