-
**Describe the bug**
Loading a model such as `vit_large_patch14_reg4_dinov2.lvd142m`, which contains `LayerScale` layers, with `from_pretrained` results in incompatible b…
-
How can a new model be supported in the C++ runtime? Is there any reference documentation? For example, the multimodal model [llava-one-vision](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov).
Foll…
-
I appreciate that you cited and compared against my work **MMSFormer** in your paper. However, the citation is incorrect and refers to an older version of the paper.
The MMSFormer paper is accepted in [OJSP…
-
Hello, have you managed to resolve the problem in the code of [Multimodal-Transformer](https://github.com/yaohungt/Multimodal-Transformer)? If so, could you please leave an email address? I would be very grateful.
-
Reproduction:
```
lm_eval --model hf-multimodal \
--model_args pretrained=llava-hf/llava-1.5-7b-hf,max_images=1 \
--tasks mmmu_val \
--device cuda:0 \
--batch_size 8
```
Erro…
-
Llama-next-image and Llama-next-image are fairly good multimodal models, and both are already supported in transformers. Does tensorrt-llm plan to support these two models?
h…
-
### 🚀 The feature, motivation and pitch
Requesting that the transformers library be upgraded to support LLAMA-3.1 models in SageMaker environments.
https://github.com/huggingface/transformers/rele…
-
I tried to run the Aria video notebook with vllm 0.6.3 but got the following error. Could you take a look?
```
# load Aria model & tokenizer with vllm
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
im…
```
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://github.com/prefix-dev/pixi/releases) of pixi, us…
-
### 🚀 The feature, motivation and pitch
This request is to adapt this to improve the training speed of Flux, a diffusion transformer.
It's the top model on HuggingFace trending right now and has b…