-
Can I train with a monocular video or a personal dataset of 6-8 multimodal videos? How should I train?
-
Thank you for your great work! Please consider including MMEvol:
## Datasets of Multimodal Instruction Tuning
| Name | Paper | Link | Notes |
|:-----|:-----:|:----:|:-----:|
| **MMEvol** | [MMEvol…
-
Hello,
I've been trying to train qwen2 0.5B and tinyclip using the repository, but I'm running into CUDA OOM issues on the dense2dense distillation step. I'm running on four 80GB A100s, and I was wondering if I …
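The excerpt cuts off before the concrete question, but a minimal sketch of the usual first OOM mitigations on a fixed GPU budget may help frame it; `student`, `teacher`, `loader`, and `optimizer` below are hypothetical stand-ins, not names from this repository:

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch, not the repo's actual distillation loop. It combines
# three standard memory reducers: no autograd graph for the teacher, bf16
# autocast (well supported on A100s), and gradient accumulation.
def distill_epoch(student, teacher, loader, optimizer, accum_steps=4):
    teacher.eval()
    for step, batch in enumerate(loader):
        with torch.autocast("cuda", dtype=torch.bfloat16):
            with torch.no_grad():            # never store teacher activations
                target = teacher(**batch)
            pred = student(**batch)
            loss = F.mse_loss(pred, target)
        (loss / accum_steps).backward()      # smaller per-step memory, same effective batch
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad(set_to_none=True)  # release grad buffers between updates
```

If that is not enough, gradient checkpointing on the student or a sharded optimizer (e.g. ZeRO/FSDP) are the usual next steps.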
-
Hello. I am interested in your work and appreciate you opening up your code.
I have one request: would you be willing to put the ActivityNet and DiDeMo datasets on Google Drive like the others?
…
-
I hope this message finds you well. I recently came across your excellent paper, *M2Fusion: Multi-time Multimodal Fusion for Prediction of Pathological Complete Response in Breast Cancer*, and found the…
-
We currently have `multimodal_chat_dataset`, which is great for conversations about an image, but many VQA datasets are structured more like instructions, where there is a question column, an answer column, and…
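For concreteness, here is a minimal sketch of the adapter such a dataset would need, assuming columns named `question`, `answer`, and `image`; the message schema is illustrative, not the exact format `multimodal_chat_dataset` consumes:

```python
# Hypothetical transform: turn one instruction-style VQA row into the
# user/assistant conversation shape a chat dataset expects. Column names
# ("question", "answer", "image") are assumptions, not the dataset's schema.
def vqa_row_to_messages(row: dict) -> dict:
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image", "image": row["image"]},
                    {"type": "text", "text": row["question"]},
                ],
            },
            {
                "role": "assistant",
                "content": [{"type": "text", "text": row["answer"]}],
            },
        ]
    }
```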
-
### 🚀 The feature
Starting this issue to track minimal examples we can create to demonstrate effective usage and the value of TorchData nodes. I can create separate issues for each of these as required…
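As a starting point, a minimal end-to-end pipeline of the kind this issue tracks might look like the sketch below; it assumes the `torchdata.nodes` API (`IterableWrapper`, `ParallelMapper`, `Batcher`, `Loader`) from recent torchdata releases, so treat exact names and signatures as approximate:

```python
# Minimal torchdata.nodes pipeline: source -> parallel map -> batch -> loader.
from torchdata.nodes import Batcher, IterableWrapper, Loader, ParallelMapper

node = IterableWrapper(range(100))                      # any iterable as a source node
node = ParallelMapper(node, map_fn=lambda x: x * 2,
                      num_workers=4, method="thread")   # concurrent per-item transform
node = Batcher(node, batch_size=16, drop_last=False)    # group items into lists of 16
loader = Loader(node)                                   # re-iterable view of the pipeline

for batch in loader:
    print(len(batch), batch[:4])                        # a training step would go here
```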
-
Hi! I'm Quentin from Hugging Face :)
Congrats on this project, it has the potential to help the community so much! Especially with large-scale and multimodal datasets.
I was wondering if you…
-
@InProceedings{pmlr-v235-ying24a,
title = {{MMT}-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask {AGI}},
author = {Ying, Kaining…
-
## Project info
**Title**:
bidsme
**Project lead**:
Nikita Beliy, @nbeliy
**[Timezone](https://github.com/ohbm/hackathon2020/blob/master/.github/ISSUE_TEMPLATE/handbooks/projects.md#t…