-
This issue is an overview of tasks to add for a massive multimodal extension of MTEB. The modalities are:
- T=Text
- I=Image
- A=Audio
- V=Video without audio, i.e. just multiple images
Below is…
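As a rough sketch of how these codes might combine to tag tasks (purely illustrative; the task names and structure below are assumptions, not MTEB's actual metadata schema):

```python
# Purely illustrative: attaching the T/I/A/V codes to tasks as
# (query modalities -> corpus modalities). Not MTEB's actual schema.
MODALITIES = {"T": "text", "I": "image", "A": "audio", "V": "video (image frames only)"}

# Hypothetical task names, used only for this sketch.
TASK_MODALITIES = {
    "SomeTextToImageRetrieval":  ({"T"}, {"I"}),   # caption -> image corpus
    "SomeAudioTextClustering":   ({"A", "T"}, set()),  # paired audio + transcript
    "SomeVideoCaptionRetrieval": ({"T"}, {"V"}),   # caption -> silent video clips
}

for name, (query, corpus) in TASK_MODALITIES.items():
    q = "+".join(sorted(query))
    c = "+".join(sorted(corpus)) or "-"
    print(f"{name}: {q} -> {c}")
```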
-
According to the README, this is the command for training:
```
(llama3-ft) python train.py --dataset_path path/to/dataset.json --output_dir path/to/output_dir --text_model_id="meta-llama/Meta-Llama-3-8B-I…
```
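For reference, a `train.py` with that interface would typically parse those flags along these lines; only the three flag names come from the command above, everything else in this sketch is an assumption:

```python
# Minimal sketch of the CLI surface implied by the README command.
# Only --dataset_path, --output_dir and --text_model_id appear in the snippet;
# help strings and any further behaviour are assumptions.
import argparse

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Fine-tune a text model on a JSON dataset")
    parser.add_argument("--dataset_path", required=True, help="Path to dataset.json")
    parser.add_argument("--output_dir", required=True, help="Where checkpoints are written")
    parser.add_argument("--text_model_id", required=True, help="Hugging Face model id, e.g. a Llama 3 checkpoint")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(f"Training {args.text_model_id} on {args.dataset_path} -> {args.output_dir}")
```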
-
## Description
I'm looking to do my dissertation on the topic of "Expanding AutoGluon-Multimodal to Incorporate Audio: Enhancing AutoML with Voice Data for Multimodal Machine Learning"
I was wonde…
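For context, AutoGluon-Multimodal is driven roughly as in the sketch below; the `audio_path` column here is purely hypothetical and stands in for the kind of voice input the proposed extension would need to handle:

```python
# Rough sketch of how AutoGluon-Multimodal is used today, with a hypothetical
# "audio_path" column standing in for the proposed voice modality.
import pandas as pd
from autogluon.multimodal import MultiModalPredictor

train_df = pd.DataFrame({
    "text": ["hello there", "goodbye"],
    "audio_path": ["clips/a.wav", "clips/b.wav"],  # hypothetical: not handled as audio today
    "label": [0, 1],
})

# MultiModalPredictor infers per-column modalities; an audio extension would need
# to add detection for audio columns plus an audio backbone (e.g. a wav2vec-style encoder).
predictor = MultiModalPredictor(label="label")
predictor.fit(train_df, time_limit=60)
```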
-
### Your current environment
I am running vllm serve with a multimodal model (Phi3.5K). How do I run benchmark_serving.py to test the multimodal path?
In the benchmark_serving.py file I see the following, but test_mm…
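As a quick sanity check that the multimodal path is being exercised at all (separate from benchmark_serving.py), one can hit the OpenAI-compatible endpoint that `vllm serve` exposes with an image in the chat payload; the model name, port, and image URL below are placeholders:

```python
# Manual check of the multimodal path via the OpenAI-compatible API exposed by
# `vllm serve`; model name, port and image URL are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="microsoft/Phi-3.5-vision-instruct",  # placeholder: use the served model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```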
-
After pretraining the model on WebVid, the MSRVTT evaluation results dropped to below 1%. Similarly, when pretraining from the provided pretrained weights, the results also dropped below 1% after the …
-
**Describe the feature**
I have noticed that not all multimodal models available here in ms-swift support multi-image input, and if they do, the training code might not support it. It is also the case with mix te…
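As a concrete illustration of the request, a multi-image sample mixed with a text-only sample in one JSONL training file could look like the following; the field names are assumptions and may differ from what ms-swift actually expects:

```python
# Illustrative only: what multi-image and mixed text/image training samples
# could look like. Field names are assumptions, not ms-swift's documented schema.
import json

samples = [
    {  # multi-image sample: one question referring to two images
        "query": "<image><image> What changed between these two pictures?",
        "response": "The second picture has an extra chair.",
        "images": ["imgs/room_before.jpg", "imgs/room_after.jpg"],
    },
    {  # text-only sample mixed into the same dataset
        "query": "Summarize the previous answer in five words.",
        "response": "An extra chair was added.",
        "images": [],
    },
]

with open("mixed_multimodal.jsonl", "w") as f:
    for s in samples:
        f.write(json.dumps(s, ensure_ascii=False) + "\n")
```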
-
Multimodal has been removed since https://github.com/ggerganov/llama.cpp/pull/5882
Depending on the refactoring of `llava`, we will be able to bring back support: https://github.com/ggerganov/lla…
-
Hello,
I've been trying to train qwen2 0.5B and tinyclip using the repository, but I'm running into CUDA OOM issues on the dense2dense distillation step. I'm running on 4×80GB A100s, and I was wondering if I …
-
### Is your feature request related to a problem? Please describe.
In version 20.11.0, ALVR added Multimodal tracking support, allowing fingers to be tracked while holding the controllers (both fing…
-
We currently have `multimodal_chat_dataset`, which is great for conversations about an image, but many VQA datasets are structured more like instructions, where there is a question column, an answer column, and…
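For illustration, the gap is roughly the difference between these two row layouts; the column names below are placeholders, and the conversion shown is just one possible way an instruct-style builder could map such rows into the conversation format a chat-style multimodal dataset consumes:

```python
# Sketch of the two dataset shapes. Column names ("question", "answer", "image")
# are placeholders for whatever a given VQA dataset actually uses.

# Instruct-style VQA row, as found in many HF datasets:
vqa_row = {
    "question": "How many dogs are in the photo?",
    "answer": "Two.",
    "image": "images/0001.jpg",
}

def to_chat_messages(row: dict) -> list[dict]:
    """One possible mapping from a question/answer/image row into a
    chat-style conversation; the exact message schema is an assumption."""
    return [
        {"role": "user", "content": [
            {"type": "image", "path": row["image"]},
            {"type": "text", "text": row["question"]},
        ]},
        {"role": "assistant", "content": [{"type": "text", "text": row["answer"]}]},
    ]

print(to_chat_messages(vqa_row))
```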