danieljf24 / dual_encoding

[CVPR2019] Dual Encoding for Zero-Example Video Retrieval
Apache License 2.0

The model is very sensitive to the batch size #15

Open nebuladream opened 4 years ago

nebuladream commented 4 years ago

We trained the model on 4 GPUs with different batch sizes, and the validation results change substantially with batch size: batch=128 gives all_recall=286, batch=256 gives all_recall=268, and batch=1280 gives all_recall=240. We also trained the model on 1 GPU with different batch sizes: batch=128 gives all_recall=295 and batch=256 gives all_recall=285. We tried different learning rates, but that does not seem to affect the drop. Do you see similar results?

danieljf24 commented 4 years ago

Sorry, I have only trained the model on 1 GPU. The results you posted are interesting. I think the sensitivity may be caused by the triplet loss with hard example mining. Additionally, I am wondering why the all_recall you posted is so high; I only obtained an all_recall of about 150.
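To illustrate why a triplet loss with hard example mining could depend on batch size, here is a minimal sketch in plain Python. It assumes the hardest negative is picked from within the same mini-batch and that the loss sums a hinge term over both retrieval directions; the function name, the margin of 0.2, and the similarity values are made up for illustration and are not taken from this repository.

```python
def hardest_negative_triplet_loss(sim, margin=0.2):
    """Triplet loss with in-batch hardest negative mining (illustrative sketch).

    sim[i][j] is the similarity between video i and caption j.
    Diagonal entries are the matching (positive) pairs.
    Requires at least two pairs in the batch.
    """
    n = len(sim)
    loss = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hardest negative caption for video i (row i, off-diagonal).
        hardest_cap = max(sim[i][j] for j in range(n) if j != i)
        # Hardest negative video for caption i (column i, off-diagonal).
        hardest_vid = max(sim[j][i] for j in range(n) if j != i)
        loss += max(0.0, margin + hardest_cap - pos)
        loss += max(0.0, margin + hardest_vid - pos)
    return loss


# Hypothetical similarity matrix for a batch of 3 video-caption pairs.
sim_batch3 = [
    [0.90, 0.10, 0.30],
    [0.20, 0.80, 0.75],
    [0.40, 0.10, 0.70],
]
# The same data restricted to the first 2 pairs (a smaller batch).
sim_batch2 = [row[:2] for row in sim_batch3[:2]]

loss_small = hardest_negative_triplet_loss(sim_batch2)
loss_large = hardest_negative_triplet_loss(sim_batch3)
```

In this toy case the smaller batch produces no violating triplets, while adding a third pair introduces a much harder negative. The pool of candidate negatives grows with the batch, so the hardest one can only stay the same or get harder. That is consistent with the guess above, but it does not by itself confirm the cause of the reported drop.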

nebuladream commented 4 years ago

It may be because we report the recall summed over both retrieval directions. More details follow:

Text to video: r_1_5_10: [20.433, 47.042, 57.455] medr, meanr: [7.0, 37.884]
Video to text: r_1_5_10: [32.998, 62.777, 74.245] medr, meanr: [3.0, 18.048]
best sum recall: 294.9496981891348
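As a quick check of how the summed figure is composed, the sketch below adds up the rounded R@1, R@5, and R@10 values quoted above. The helper name `sum_recall` is hypothetical and is not a function from this repository.

```python
# Recall values (percent) copied from the comment above.
t2v_recall = [20.433, 47.042, 57.455]  # text-to-video R@1, R@5, R@10
v2t_recall = [32.998, 62.777, 74.245]  # video-to-text R@1, R@5, R@10


def sum_recall(*directions):
    """Sum R@1 + R@5 + R@10 over every retrieval direction passed in."""
    return sum(sum(d) for d in directions)


both_directions = sum_recall(t2v_recall, v2t_recall)
t2v_only = sum_recall(t2v_recall)
v2t_only = sum_recall(v2t_recall)

print(f"both directions: {both_directions:.2f}")
print(f"text-to-video only: {t2v_only:.2f}")
print(f"video-to-text only: {v2t_only:.2f}")
```

The two directions together come to about 294.95, which matches the reported best sum recall. A sum over a single direction is much smaller, about 124.93 for text to video and 170.02 for video to text.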