-
The article mentions that "where they randomly chose 5 ground-truth sentences per video. We use the same setting when we compare with that approach". Does the training set, validation set, and test set …
-
The original msrvtt folder structure is shown below.
```
msrvtt
├── annotation
│   ├── MSR_VTT.json
├── high-quality
│   ├── structured-symlinks
│   │   ├── jsfusion_val_caption_idx.p…
```
-
Thank you for your great work! Could you please share the code for video-text retrieval evaluation on MSR-VTT dataset?
-
Hi,
Thanks for your excellent work. I have a few questions when I re-implement CLIP4Clip on the MSR-VTT dataset.
First, I change `sim_header` to `seqTransf` to implement the best perf…
-
![image](https://user-images.githubusercontent.com/55907441/149606001-dd677291-cb45-4edf-ba3b-cc5546e12988.png)
It seems that captions and videos must be paired one-to-one along the diagonal.
I am tryin…
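For anyone re-deriving this setup, here is a minimal sketch (toy embeddings, not the repo's actual code) of why the positive pairs sit on the diagonal of the caption-video similarity matrix:

```python
import numpy as np

# Hypothetical batch of 3 caption embeddings and 3 video embeddings.
# Row i of `captions` is assumed to describe row i of `videos`, so the
# positive pairs lie on the diagonal of the similarity matrix.
captions = np.eye(3)       # stand-in for L2-normalized caption features
videos = np.eye(3)         # stand-in for L2-normalized video features

sim = captions @ videos.T  # cosine similarities (rows: captions, cols: videos)

# The contrastive target for caption i is video i, i.e. labels = arange(batch).
labels = np.arange(3)
print(sim.diagonal())      # positives are the diagonal entries
```

With a real model the off-diagonal entries would be the in-batch negatives, which is why each caption is expected to match exactly one video per batch.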
-
When the driver is loaded via `modprobe facetimehd` the following error is generated:
`Direct firmware load for facetimehd/1871_01XX.dat failed with error -2`
More details on my system:
```
$ sudo d…
-
Can you explain the `--num_thread_reader` value in the training configuration? Can I adjust this value to decrease my training time? (Why is `--num_thread_reader=0` for MSR-VTT while `--num_thread_reader=2` in ot…
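Not the repo's code, but a hedged sketch of the general idea behind a reader-thread count: 0 usually means samples are decoded in the main process, while N > 0 hands decoding to N background workers (the function name and sleep below are invented for illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def decode_clip(i):
    """Stand-in for reading/decoding one video clip (hypothetical workload)."""
    time.sleep(0.01)
    return i

# num_thread_reader=0: decoding happens serially in the main process.
serial = [decode_clip(i) for i in range(8)]

# num_thread_reader=2: two background workers decode concurrently, which
# typically shortens epoch time when I/O or decoding is the bottleneck.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = list(pool.map(decode_clip, range(8)))

print(serial == parallel)  # same samples either way; only throughput differs
```

So raising the value can reduce training time when data loading is the bottleneck, at the cost of extra CPU and memory per worker.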
-
Hi,
thanks for the great work(s) and this great repo~
I have a (maybe very beginner) zero-shot performance reproduction question about ViCLIP on MSRVTT.
Based on my understanding, I use the d…
-
Hello,
Thank you for the repo and well done for the project.
I have a question about whether, and how, it's possible to train on a single GPU.
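For reference, a common single-GPU pattern (purely an assumption about this repo; its actual entry point and flags may differ) is to pin one visible device before launching a single process:

```shell
# Hypothetical: expose only GPU 0 to the process, then launch one worker.
export CUDA_VISIBLE_DEVICES=0
echo "visible GPUs: $CUDA_VISIBLE_DEVICES"
# e.g. python -m torch.distributed.run --nproc_per_node=1 main.py  (entry point assumed)
```

If the training script hard-codes multi-GPU distributed setup, it may also need its world size or launcher arguments reduced to 1.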
-
Hello, for the ActivityNet dataset, when `max_words` and `max_frames` are both 64, do `v_rate0` through `t_rate1` keep the original MSR-VTT defaults? Also, is the training batch size for ActivityNet 64 or 128?