multimodal-transformer Search Results

huggingface/pytorch-image-models #2324

[BUG] model that has LayerScale architecture will have its '…

**Describe the bug** Loading a model that contains a `LayerScale` module, such as `vit_large_patch14_reg4_dinov2.lvd142m`, with `from_pretrained` results in incompatible b…

Rkyzzy updated 5 days ago

NVIDIA/TensorRT-LLM #2250

[issue] C++ runtime support multimodal model llava-one-visi…

How can a new model be supported in the C++ runtime? Is there any reference document? For example, the multimodal model [llava-one-vision](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov) Foll…

deepindeed2022 updated 5 days ago

LiBingyu01/StitchFusion-StitchFusion-Weaving-Any-Visual-Modalities-to-Enhance-Multimodal-Semantic-Segmentation #2

Please correctly cite MMSFormer in your paper

I appreciate that you cited and compared with my work **MMSFormer** in your paper. However, you are citing it incorrectly and referencing an older version of the paper. The MMSFormer paper is accepted in [OJSP…

kaykobad updated 1 month ago

DinghaoXi/multimodal_2021 #1

Multimodal-Transformer

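Hello, may I ask whether you have solved the problems in the code for the [Multimodal-Transformer](https://github.com/yaohungt/Multimodal-Transformer) paper? If so, please leave an email address. I would be very grateful.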

NonTerraePlusUltra updated 2 years ago

EleutherAI/lm-evaluation-harness #2360

[multimodal] llava-1.5-7b-hf doesn't work on `mmmu_val`

Reproduction: ``` lm_eval --model hf-multimodal \ --model_args pretrained=llava-hf/llava-1.5-7b-hf,max_images=1 \ --tasks mmmu_val \ --device cuda:0 \ --batch_size 8 ``` Erro…

BabyChouSr updated 1 month ago

NVIDIA/TensorRT-LLM #1900

support llava-next model

Llama-next-image and Llama-next-image are fairly good multimodal models, and they are already supported in transformers. I would like to know if tensorrt-llm plans to support these two models? h…

AmazDeng updated 1 week ago

aws/sagemaker-distribution #460

Upgrade Transformers Library

### 🚀 The feature, motivation and pitch Looking for the transformers library to be upgraded to support LLAMA-3.1 models in sagemaker environments. https://github.com/huggingface/transformers/rele…

hemandee updated 2 months ago

rhymes-ai/Aria #36

cannot import name '_MULTIMODAL_MODELS' from 'vllm.model_exe…

I tried to run Aria video notebook with vllm 0.6.3 but I got the following error. Can you check? # load Aria model & tokenizer with vllm import os os.environ["CUDA_VISIBLE_DEVICES"] = "0" im…

andyluo7 updated 2 days ago

prefix-dev/pixi #1784

Pixi not able to solve pypi packages with compatible release…

### Checks - [X] I have checked that this issue has not already been reported. - [X] I have confirmed this bug exists on the [latest version](https://github.com/prefix-dev/pixi/releases) of pixi, us…

hrz6976 updated 2 months ago

linkedin/Liger-Kernel #73

Request to support the Flux model (T2I diffusion transformer…

### 🚀 The feature, motivation and pitch This request is to adapt this to improve the training speed of Flux, a diffusion transformer. It's the top model on HuggingFace trending right now and has b…

RefractAI updated 1 week ago

1000+ results for multimodal-transformer
