-
Can I train with a monocular video, or with a personal dataset of 6-8 multimodal videos? How do I set up training?
-
Hello,
I've been trying to run qwen2 0.5B and tinyclip using the repository, but I'm running into CUDA OOM issues on the dense2dense distillation step. I'm running on 4 80GB A100s, and I was wondering if I …
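For reference, this is the kind of memory-saving configuration I have been experimenting with in the meantime (a minimal sketch using Hugging Face Trainer-style arguments; the repository's actual config and entry point may differ):

```python
from transformers import TrainingArguments

# Illustrative knobs only; these follow the Hugging Face Trainer
# convention and may not map onto this repository's own config.
args = TrainingArguments(
    output_dir="distill-out",
    per_device_train_batch_size=1,   # shrink live activations per GPU
    gradient_accumulation_steps=16,  # preserve the effective batch size
    gradient_checkpointing=True,     # recompute activations to save memory
    bf16=True,                       # half-precision activations on A100s
)
```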
-
Hello. I am interested in your work and appreciate you opening up your code.
I have one request: would you be willing to put the ActivityNet and DiDeMo datasets on Google Drive like the others?
…
-
We currently have `multimodal_chat_dataset`, which is great for conversations about an image, but many VQA datasets are structured more like instructions, where there is a question column, an answer column, and…
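To make the request concrete, here is a rough sketch of the mapping I have in mind from such a columnar dataset into a chat-style record (the column names `image`/`question`/`answer` and the output schema are illustrative assumptions, not an existing builder):

```python
def vqa_row_to_messages(row):
    """Map one columnar VQA example into a single-turn chat record.

    The column names ("image", "question", "answer") and the output
    schema are illustrative assumptions, not an existing API.
    """
    return {
        "images": [row["image"]],
        "messages": [
            {"role": "user", "content": row["question"]},
            {"role": "assistant", "content": row["answer"]},
        ],
    }

# Stand-in for a real VQA dataset with question/answer columns:
rows = [{"image": "train/0001.jpg", "question": "What color is the bus?", "answer": "red"}]
chat_rows = [vqa_row_to_messages(r) for r in rows]
```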
-
I hope this message finds you well. I recently came across your excellent paper, M2Fusion: Multi-time Multimodal Fusion for Prediction of Pathological Complete Response in Breast Cancer, and found the…
-
Hi! I'm Quentin from Hugging Face :)
Congrats on this project; it has the potential to help the community so much! Especially with large-scale and multimodal datasets.
I was wondering if you…
-
### 🚀 The feature
Starting this issue to track minimal examples we can create to demonstrate effective usage and value of TorchData nodes. I can create separate issues for each of these as required…
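For a sense of scope, the smallest such example might look roughly like the sketch below. This is written against the `torchdata.nodes` names I believe exist (`IterableWrapper`, `ParallelMapper`, `Batcher`, `Loader`); exact signatures should be checked against the installed release:

```python
from torchdata.nodes import Batcher, IterableWrapper, Loader, ParallelMapper

# Wrap a plain iterable as a node, map over it with worker threads,
# then collect fixed-size batches.
node = IterableWrapper(range(100))
node = ParallelMapper(node, map_fn=lambda x: x * 2, num_workers=4, method="thread")
node = Batcher(node, batch_size=16, drop_last=False)

# Loader turns the node graph into a re-iterable, DataLoader-like object.
for batch in Loader(node):
    print(batch)  # e.g. [0, 2, 4, ...]
```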
-
## Project info
**Title**:
bidsme
**Project lead**:
Nikita Beliy, @nbeliy
**[Timezone](https://github.com/ohbm/hackathon2020/blob/master/.github/ISSUE_TEMPLATE/handbooks/projects.md#t…
-
Is there a version of the model **Visualized BGE based on BAAI/bge-base-zh-v1.5**? And how does BAAI/bge-visualized-m3 perform compared with ChineseCLIP?
-
Hi, thanks for this great work! I noticed that your paper mentions evaluating on more multimodal datasets, like VQAv2 and OKVQA. Do you have any results for those now, or any timeline for wh…