OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Apache License 2.0

Can't reproduce zero-shot results on MSR-VTT #32

Closed Kirillova-Anastasia closed 1 year ago

Kirillova-Anastasia commented 1 year ago

Hi, thanks for your interesting work! I'm running your ./zeroshot_scripts/eval_msrvtt.sh script without changing any parameters, but I get slightly different results from yours:

INFO:logger:DSL Text-to-Video:
INFO:logger:    >>>  R@1: 38.0 - R@5: 65.0 - R@10: 73.5  - Median R: 2.0 - Mean R: 27.0
INFO:logger:DSL Video-to-Text:
INFO:logger:    >>>  V2T$R@1: 41.0 - V2T$R@5: 65.2 - V2T$R@10: 74.4  - V2T$Median R: 2.0 - V2T$Mean R: 20.6
INFO:logger:------------------------------------------------------------
INFO:logger:Text-to-Video:
INFO:logger:    >>>  R@1: 35.4 - R@5: 58.3 - R@10: 68.4  - Median R: 3.0 - Mean R: 32.3
INFO:logger:Video-to-Text:
INFO:logger:    >>>  V2T$R@1: 31.7 - V2T$R@5: 54.5 - V2T$R@10: 65.1 - V2T$Median R: 4.0 - V2T$Mean R: 34.6
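For reference, the R@K, Median R, and Mean R numbers in logs like the above are typically computed from a text-video similarity matrix whose diagonal holds the ground-truth pairs. A minimal sketch (the function name is mine, not from this repo):

```python
import numpy as np

def retrieval_metrics(sim):
    """Compute R@1/5/10, Median R, Mean R from a (num_texts, num_videos)
    similarity matrix where sim[i, i] is the ground-truth pair."""
    # Sort candidates for each query, best match first.
    order = np.argsort(-sim, axis=1)
    # 0-based rank of the correct video for each text query.
    ranks = np.argmax(order == np.arange(len(sim))[:, None], axis=1)
    return {
        "R@1": 100.0 * np.mean(ranks < 1),
        "R@5": 100.0 * np.mean(ranks < 5),
        "R@10": 100.0 * np.mean(ranks < 10),
        "MedianR": float(np.median(ranks) + 1),
        "MeanR": float(np.mean(ranks) + 1),
    }
```

Video-to-text metrics are the same computation on the transposed matrix.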

I have several questions:

  1. Were the results in the paper and the README file obtained with dual softmax loss?
  2. Are the parameters in the zero-shot script the same as the ones you used in your zero-shot experiments?
  3. Do you think minor dependency differences could affect the results this much? If so, could you publish a file with your environment's dependencies?
Jazzcharles commented 1 year ago
  1. The reported results are obtained with dual softmax loss added. Please refer to the results in the README file for the released checkpoint.
  2. Yes, they should be the same.
  3. We encourage you to check the zero-shot performance on other datasets, e.g. MSVD, to see whether the performance matches.

The results seem a bit abnormal, as T2V R@1 is always higher than V2T R@1 on MSR-VTT in our experiments.
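For context, the dual-softmax re-ranking mentioned in answer 1 is usually applied at inference time along these lines (a sketch with a hypothetical temperature value, not this repo's exact implementation):

```python
import numpy as np

def dual_softmax_rerank(sim, temp=100.0):
    """Re-rank a (num_texts, num_videos) similarity matrix with a
    DSL-style inference-time prior. `temp` is an illustrative value."""
    # Softmax over the text axis estimates how discriminative each
    # video is; multiplying it back suppresses "hub" videos that score
    # highly against many texts.
    logits = sim * temp
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    prior = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    return prior * sim
```

Note that this re-ranking uses the whole test-set similarity matrix at once, which is why the "DSL" numbers in the log differ from the plain ones.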