-
We used the same data as in the git repository and the officially provided training weights, and also evaluated with gpt3.5, but only achieved an accuracy of 47.9/3.1 (vs. 70.0) on the TGIF-QA task. BTW, on the oth…
-
When I try to reproduce the demo, I run into problems with Java. I tried jre-17 and jre-19, but it doesn't seem to be a problem with the Java version.
Is there a good solution for this, please?
`…
-
Hi authors,
Amazing paper, and thanks for providing this nice code base. I have a question regarding the **finetuned model**, specifically for the **video-text retrieval task**. Do you have plans to rel…
-
Good afternoon, I'm reading your code and trying to run it with the files you have uploaded. Unfortunately, I couldn't run it successfully, maybe because the file “sim_matrix” is not provided. Beside…
-
Hello, thank you very much for publishing such high-quality code. When I use your code to run on my personal video dataset, the memory usage of the program is very high, but the RAM of the workstation…
-
I couldn't find the 'msrvtt_train.json' file, which should be in 'annotation/msr-vttRET'.
-
Hello, I am interested in this work and excited to see its excellent performance. Although the code has many scripts to extract the features the model needs, I'm worried that the features are so …
-
Thanks for your impressive work. I have a question about evaluating video-text retrieval: in datasets such as MSVD and MSRVTT, each video comes with multiple captions. How do you handle this problem…