-
Does the DTW algorithm work in blocks of 3 x K, where K is the number of text embeddings, so that we try to match them with 3 frame embeddings of the video in one go?
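To make the question concrete, here is a minimal sketch of textbook DTW applied to a single 3 x K block (3 frame embeddings against K text embeddings). This is generic dynamic time warping, not necessarily what the repository does; the function names `cosine_distance` and `dtw` and the toy vectors are assumptions made up for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dtw(frames, texts):
    """Align `frames` (n vectors) with `texts` (k vectors) using classic DTW.

    Returns (total_cost, path), where path is a list of
    (frame_index, text_index) pairs in temporal order.
    """
    n, k = len(frames), len(texts)
    # Pairwise cost matrix of shape n x k (3 x K in the question above).
    cost = [[cosine_distance(f, t) for t in texts] for f in frames]

    # Accumulated cost with an extra border row/column of infinities.
    acc = [[math.inf] * (k + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            acc[i][j] = cost[i - 1][j - 1] + min(
                acc[i - 1][j],      # advance the frame only
                acc[i][j - 1],      # advance the text only
                acc[i - 1][j - 1],  # advance both
            )

    # Backtrack from the bottom-right corner to recover the alignment.
    path = []
    i, j = n, k
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return acc[n][k], path


# Toy block: 3 frame embeddings, K = 2 text embeddings.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
texts = [[1.0, 0.0], [0.0, 1.0]]
total_cost, path = dtw(frames, texts)
```

In this sketch every frame in the block is compared with all K text embeddings, and the path is constrained to move monotonically through both sequences. Whether the repository processes the video in such 3-frame blocks is exactly what the question asks and is not answered by this example.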
-
Hi @ttengwang
Thank you for sharing the code.
I am wondering whether you trained the base Transformer + LSTM on the YouCook2 dataset, i.e. similar to Rows 1 and 2 in Table 7 (a).
I am wondering if t…
-
May I get your processed dataset instead of the original dataset?
-
Hi, thank you for sharing this interesting work!
I would like to try fine-tuning ClipBERT on another video-and-language dataset, such as YouCook2.
My target downstream task is cross-modal retrieval…
-
Hi,
I am now running the step **Prepare the data to train the Agent.**
Is this code expected to take a very long time? I have already spent one day running this Python file.
`python download_yo…
-
Hello, I am trying to evaluate VATT on the YouCook2 dataset for text-video retrieval. I am getting errors when trying to load a previous checkpoint, among many other package issues with TensorFlow v2.7, DMVR, an…
-
**Describe the bug**
When attempting to run `precompute_text.py`, there is an issue.
**To Reproduce**
```
python precompute_text.py youcook2 --cuda --metadata_name "all_${PERTURBATION}" --da…
-
I get an error when resuming training: “Unexpected key(s) in state_dict: ‘epoch’, ‘netG_state_dict’, ‘optimizer_state_dict’.” (The lines below are the full error, and I added my trainer_vlc.py code at the bot…
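The keys named in the message suggest that the saved file is a full training checkpoint (a dict wrapping the model weights, optimizer state, and epoch) and that the whole dict is being passed to `load_state_dict`. A minimal sketch of unpacking it, assuming that layout: the helper name `restore_from_checkpoint` is hypothetical, and only the three keys quoted in the error are taken from the report.

```python
def restore_from_checkpoint(checkpoint, model, optimizer=None):
    """Restore training state from a wrapped checkpoint dict.

    `checkpoint` is assumed to be the dict returned by something like
    torch.load(path, map_location="cpu"), with the keys quoted in the
    error message: 'epoch', 'netG_state_dict', 'optimizer_state_dict'.
    Returns the stored epoch (0 if the key is absent).
    """
    # Pass only the nested weights to the model, not the whole wrapper.
    model.load_state_dict(checkpoint["netG_state_dict"])

    # Restore the optimizer state when one is supplied and was saved.
    if optimizer is not None and "optimizer_state_dict" in checkpoint:
        optimizer.load_state_dict(checkpoint["optimizer_state_dict"])

    return checkpoint.get("epoch", 0)
```

With PyTorch this would be called as `start_epoch = restore_from_checkpoint(torch.load(path, map_location="cpu"), netG, optimizer)`, where `path`, `netG`, and `optimizer` stand in for whatever the trainer actually uses.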
-
When I test your model on YouCook2 data, the performance is not very good.
I am not sure what the main reason behind it is. Have you conducted similar experiments before?
YouCook2 is longer than…
-
How do I go about extracting only the COOT visual embeddings for a new dataset? Could you share the code for creating the input h5 feature files?