ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
https://arxiv.org/abs/2104.08860
MIT License

What is the meaning of --num_thread_reader=0 in MSR-VTT training configuration? #51

Closed thinh276 closed 2 years ago

thinh276 commented 2 years ago

Can you explain the --num_thread_reader setting in the training configuration? Can I adjust this value to decrease my training time? (Why is --num_thread_reader=0 used for MSR-VTT while --num_thread_reader=2 is used for the other datasets?) Thank you so much!

ArrowLuo commented 2 years ago

Hi @thinh276, --num_thread_reader is used in the dataloaders and can speed up data reading. --num_thread_reader=0 for MSR-VTT can be regarded as a typo; feel free to adjust its value.
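
For context, this flag is forwarded to PyTorch's `DataLoader` as `num_workers`, roughly as in the simplified sketch below (not the repository's exact code; `args` and `train_dataset` are hypothetical stand-ins):

```python
from torch.utils.data import DataLoader

# num_thread_reader maps to DataLoader's num_workers:
# 0 loads batches in the main process; N > 0 spawns N worker
# processes that prefetch batches in parallel with training.
dataloader = DataLoader(
    train_dataset,                       # hypothetical dataset object
    batch_size=args.batch_size,
    num_workers=args.num_thread_reader,  # 0 = main-process loading
    shuffle=True,
    pin_memory=True,
    drop_last=True,
)
```

With `num_workers=0` the GPU sits idle while each batch is decoded, which is why raising the value shortens wall-clock training time without changing what the model sees.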

thinh276 commented 2 years ago

> Hi @thinh276, --num_thread_reader is used in the dataloaders and can speed up data reading. --num_thread_reader=0 for MSR-VTT can be regarded as a typo; feel free to adjust its value.

Thank you so much! My workstation is running slowly right now, so I will test some values of --num_thread_reader. Does this value affect the accuracy? I used your code (with --num_thread_reader=0) and tested on 2 computers; the results have a gap:

If --num_thread_reader=0 affects the training time only, are my training results normal? Thank you!

ArrowLuo commented 2 years ago

Hi @thinh276, interesting results, but I do not think --num_thread_reader=0 will affect the performance. The difference may be caused by the number of GPUs or by other factors (I am not sure), e.g., CUDA's nondeterministic behavior. The links below are for your information (a minimal seeding sketch follows them):

  1. https://github.com/ArrowLuo/CLIP4Clip/issues/25
  2. https://github.com/openai/CLIP/issues/114
  3. https://pytorch.org/docs/stable/notes/randomness.html
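
Following the PyTorch randomness notes (link 3), a minimal seeding setup looks roughly like this; note that even with these settings, some CUDA ops remain nondeterministic, and results can still differ across GPU counts or hardware:

```python
import random
import numpy as np
import torch

def seed_everything(seed: int = 42) -> None:
    # Seed all RNGs the training pipeline may touch.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Make cuDNN deterministic, trading some speed for reproducibility.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```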

Thanks.

thinh276 commented 2 years ago

I tested --num_thread_reader=2 and the training time decreased from 33 hours to 10 hours. (Great!) Thank you for the informative links; I will read them. I would like to share my detailed results with you and others:

| Batch size / No. of GPUs | Method | R@1 | R@5 | R@10 | MdR | MnR |
| -- | -- | -- | -- | -- | -- | -- |
| 128 / 4 GPUs | -meanP | 42.9 | 70.7 | 80.0 | 2.0 | 17.0 |
| 128 / 4 GPUs | -seqLSTM | 42.2 | 69.7 | 80.1 | 2.0 | 17.2 |
| 128 / 4 GPUs | -seqTransf | 43.0 | 70.2 | 81.2 | 2.0 | 16.1 |
| 128 / 4 GPUs | -seqTransf | 40.8 | 72.0 | 81.8 | 2.0 | 14.4 |

| Batch size / No. of GPUs | Method | R@1 | R@5 | R@10 | MdR | MnR |
| -- | -- | -- | -- | -- | -- | -- |
| 64 / 2 GPUs | -meanP | 43.4 | 71.3 | 80.8 | 2.0 | 16.6 |
| 64 / 2 GPUs | -seqTransf | 43.5 | 72.5 | 80.5 | 2.0 | 14.8 |
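
(For readers unfamiliar with the metrics: R@K is the percentage of queries whose ground-truth video ranks in the top K, MdR is the median rank, and MnR is the mean rank. A minimal sketch of how they can be computed from a text-to-video similarity matrix, assuming one ground-truth video per query at the matching index; this is not the repository's exact evaluation code:)

```python
import numpy as np

def retrieval_metrics(sim: np.ndarray) -> dict:
    """sim[i, j] = similarity between text query i and video j.
    Assumes the ground-truth video for query i sits at index i."""
    # Rank videos per query by decreasing similarity.
    order = np.argsort(-sim, axis=1)
    # 1-based rank of the ground-truth video for each query.
    ranks = np.where(order == np.arange(len(sim))[:, None])[1] + 1
    return {
        "R@1": float(np.mean(ranks <= 1) * 100),
        "R@5": float(np.mean(ranks <= 5) * 100),
        "R@10": float(np.mean(ranks <= 10) * 100),
        "MdR": float(np.median(ranks)),
        "MnR": float(np.mean(ranks)),
    }
```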

ArrowLuo commented 2 years ago

Hi @thinh276, thank you for your kind sharing. I am not sure whether the gap is normal, or whether the situations mentioned in the links above cause such a problem. If you want to compare against the paper, one option is to report both the paper's results and yours for a fair comparison.

thinh276 commented 2 years ago

It's not for comparison. As I understand it, there are many sources of randomness, so we cannot reach exactly the same result. But the differences among similar calculation methods are clearly visible. I consider my attempt to reproduce the experiments a success. Thank you for your code and for your kind reply!

deepalchemist commented 7 months ago

@thinh276 @ArrowLuo Hello, I trained the CLIP4Clip model with --sim_header seqTransf, but it seems that --num_thread_reader=8 results in worse accuracy than --num_thread_reader=0. Do you know why?