-
||link|
|----|---|
|paper| [HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips](https://openaccess.thecvf.com/content_ICCV_2019/papers/Miech_HowTo100M_Learni…
-
# Project Request
Automatic captioning of cooking videos by understanding the action in each scene and providing nutritional information based on the ingredients used.
| Field | D…
-
### Week of 23rd Sep
- [ ] Revisit embedding finetuning
- [ ] ColPali video search example
### Week of 16th Sep
- [ ] Chunking experiments and report
- [x] Publish ColPali blog/guide & recipe
- [x] C…
-
We're starting to have our first prior now. (PR of the training script coming soon)
Time to evaluate
Ideas:
* [x] MSE on test set
* [ ] Zero-shot eval on ImageNet: class text -> text emb -> imag…
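
For reference, the test-set MSE item above could be computed roughly as follows. This is only a minimal sketch: it assumes the prior outputs one predicted embedding per test sample and that a target embedding of the same length is available for each. The name `embedding_mse` and the list-of-lists layout are hypothetical and not taken from the training script.

```python
from typing import Sequence


def embedding_mse(
    predicted: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
) -> float:
    """Mean squared error averaged over every element of every embedding.

    `predicted` and `target` are assumed to be aligned: row i of each
    refers to the same test sample (hypothetical layout).
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must hold the same number of embeddings")

    total = 0.0  # running sum of squared differences
    count = 0    # number of scalar elements seen so far
    for pred_row, target_row in zip(predicted, target):
        if len(pred_row) != len(target_row):
            raise ValueError("each predicted embedding must match its target's length")
        total += sum((p - t) ** 2 for p, t in zip(pred_row, target_row))
        count += len(pred_row)

    if count == 0:
        raise ValueError("cannot compute MSE over zero elements")
    return total / count
```

Averaging over every scalar element keeps the number comparable across embedding sizes. Summing per embedding and then averaging over samples would also be a reasonable choice, as long as the same convention is used when comparing runs.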
-
I am using GoPro Max with firmware v2.0. The scaled raw magnetometer readings are very wrong. I am using scaled data.
I took the GoPro outside with minimal ferromagnetic materials and even then the …
-
Reply to this issue with a link to your slides to present in class. Presentation should be 5 minutes or less. Make sure your slides are public.
-
```
What steps will reproduce the problem?
1. onMetaData: function (clip) {
console.log(clip.metaData);
}
What is the expected output? What do you see instead?
Expected: onMetaData is a singu…
-
```
What steps will reproduce the problem?
1. onMetaData: function (clip) {
console.log(clip.metaData);
}
What is the expected output? What do you see instead?
Expected: onMetaData is a singu…
-
Hi authors, thank you for sharing your work; I appreciate it.
I'm trying to utilize the pretrained model of ViT-L/14 for my video-text-retrieval application.
I followed the link to download ViT-L/…
-
Hello, nice job!
I cannot reproduce the MSRVTT finetuned model, and I set each arg as in the [log](https://pjlab-gvm-data.oss-cn-shanghai.aliyuncs.com/internvideo/retrieval/msrvtt/kc4_finetune_1e-32…