Closed cty8998 closed 3 months ago
Hi, thanks for your interest. The CLIP4Clip-meanP and CLIP4Clip-seqTransf results used in our paper were taken from Table 4 of CLIP2TV.
As for the DiDeMo split, we directly used the preprocessing from the X-CLIP paper. They seem to do some extra processing compared to CLIP4Clip.
The previous training log for MSVD was not kept, so I will need to take some time to re-train it. Thank you for your patience.
So could you provide the exact numbers of test/val/train samples on the DiDeMo dataset, like the ones I showed above? This does not require training. Besides, the 'run_msvd.sh' file in your code has a parameter "-- init model"; what should it be set to?
Hi, I tested it again, and these are the exact numbers:
02/27/2024 20:24:05 - INFO - ***** Running test *****
02/27/2024 20:24:05 - INFO - Num examples = 824
02/27/2024 20:24:05 - INFO - Batch size = 24
02/27/2024 20:24:05 - INFO - Num steps = 35
02/27/2024 20:24:05 - INFO - ***** Running val *****
02/27/2024 20:24:05 - INFO - Num examples = 842
02/27/2024 20:24:09 - INFO - ***** Running training *****
02/27/2024 20:24:09 - INFO - Num examples = 7101
02/27/2024 20:24:09 - INFO - Batch size = 24
02/27/2024 20:24:09 - INFO - Num steps = 5900
Besides, the parameter "-- init model" is for resuming: use it when you want to continue training a previous model. If you want to train from scratch, you can remove this line from the script file.
OK, I got it. Thank you so much!
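As a quick sanity check on the log above, the step counts can be reproduced from the example counts and the batch size. This is only a sketch: the helper names are hypothetical and not from the repository, and the 20 epochs and dropped final partial batch are assumptions chosen because they match the logged 5900 steps. The thread does not confirm either.

```python
def eval_steps(num_examples: int, batch_size: int) -> int:
    """Evaluation batches when the final partial batch is kept (ceiling division)."""
    return -(-num_examples // batch_size)


def train_steps(num_examples: int, batch_size: int, epochs: int,
                drop_last: bool = True) -> int:
    """Total optimizer steps over all epochs, without gradient accumulation."""
    if drop_last:
        # Assumption: the incomplete final batch of each epoch is discarded.
        per_epoch = num_examples // batch_size
    else:
        per_epoch = -(-num_examples // batch_size)
    return per_epoch * epochs


# Numbers taken from the log: 824 test examples, 7101 training examples, batch size 24.
print(eval_steps(824, 24))       # test steps, logged as 35
print(train_steps(7101, 24, 20)) # training steps, logged as 5900 (20 epochs is an assumption)
```

If the final partial batch were kept instead, the same 20 epochs would give 5920 steps, so the logged value is at least consistent with the drop-last assumption.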
Hello, when I use your source code to train and evaluate on the DiDeMo dataset, the number of test/val/train samples are shown as
However, according to the paper "Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval (ICCV 2023)", the numbers of test/val/train samples should be 1004/1065/8392. I have also confirmed this with the author of ProST, who used this test/val/train setting. Besides, I cannot find the CLIP4Clip-meanP and CLIP4Clip-seqTransf results on the DiDeMo dataset that your paper reports anywhere else, and they look better than those reported in other papers. Since I want to use the same baseline model as your paper, I need to confirm the correct numbers of training and testing samples. Could you provide the log for this dataset?
If it is convenient, please also provide the training log of the MSVD dataset.
Looking forward to your timely reply.