-
Support for the LLaVA multimodal model on AWS Neuron chips would be huge.
https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf
This checkpoint in particular is trending.
I'm not sure if this is the correct…
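For reference, here is a minimal sketch of the plain-transformers usage that Neuron support would need to cover; the LlavaNext classes, prompt format, and sample image are my assumptions from the model card, nothing here is Neuron-specific:

```python
# Minimal sketch (plain transformers, no Neuron): the workload a compiled
# Neuron path would need to reproduce. Assumes transformers >= 4.39.
# dtype/device handling is omitted for brevity.
import requests
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(model_id)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"

inputs = processor(images=image, text=prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```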
-
-
Dear authors,
Thank you for your work and the release of the d-cube dataset.
I was trying to run a pre-trained OWL-ViT model (e.g. "google/owlvit-base-patch32") on the dataset, and found the fol…
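For reference, I am invoking the model essentially as in this minimal single-image sketch (the text queries below are placeholders, not actual d-cube category descriptions):

```python
# Minimal OWL-ViT sketch on one image; the queries are placeholders,
# not d-cube descriptions.
import requests
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = [["a photo of a cat", "a photo of a dog"]]

inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits to boxes in image coordinates.
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, threshold=0.1, target_sizes=target_sizes
)
print(results[0]["boxes"], results[0]["scores"], results[0]["labels"])
```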
-
For example, which transformers version should I use for llava-v1.6-mistral-7b-hf? I have tried many versions, but each fails with a different error, such as
1: 4.39.2
File "/weilai/codes/lmms-eval…
-
### 🐛 Describe the bug
I hit the issue below after training the vit_h_14 model with pretrained weights. If I do not load the pretrained weights, everything works fine.
```python
# How to reproduce this bug
import t…
```
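Since the snippet above is cut off, here is a hedged reconstruction of a minimal repro along these lines; the weight enum and input size are standard torchvision values, but the original report may have used something else:

```python
# Hedged reconstruction of the repro (the original snippet is truncated):
# load vit_h_14 with pretrained weights via the standard torchvision enum.
import torch
import torchvision

weights = torchvision.models.ViT_H_14_Weights.IMAGENET1K_SWAG_E2E_V1
model = torchvision.models.vit_h_14(weights=weights)

# The SWAG E2E weights expect 518x518 inputs; a mismatched input size is one
# common source of errors that only appear once pretrained weights are loaded.
x = torch.randn(1, 3, 518, 518)
with torch.no_grad():
    out = model(x)
print(out.shape)  # torch.Size([1, 1000])
```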
-
LoRA + base is working well.
![image](https://github.com/mbzuai-oryx/LLaVA-pp/assets/15274284/ccec0900-7db0-4729-9ab4-3c5f68e0f304)
![image](https://github.com/mbzuai-oryx/LLaVA-pp/assets/15274284/7d12…
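For anyone trying to reproduce this, a minimal sketch of the LoRA + base load (the paths are placeholders; merge_and_unload is the standard peft call, though the repo's own scripts may differ):

```python
# Minimal sketch of loading a base model plus a LoRA adapter with peft.
# Paths are placeholders; the repo's own loading scripts may differ.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
model = model.merge_and_unload()  # fold the LoRA deltas into the base weights
```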
-
Not sure why; I have tried all the fixes recommended on this page, including upgrading transformers. I have enough memory free, so I don't think that is the issue either.
Error occurred when ex…
-
# URL
- https://arxiv.org/abs/2411.05663
# Authors
- Xiwen Wei
- Guihong Li
- Radu Marculescu
# Abstract
- Catastrophic forgetting is a significant challenge in online continual learning (OCL)…
-
* [ CS25 I Stanford Seminar - Transformers United 2023: Biomedical Transformers](https://web.stanford.edu/class/cs25/)
* CS25 I Stanford Seminar - Transformers United 2023: Introduction to Transfor…
-
Hi, thanks for contributing a great library!
I've been doing a close-up study of the `MultiheadAttentionPruner` implementation, and I have some concerns.
The pruning of the output channels in out_pro…
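To make the concern concrete, here is a small shape check, my own illustration rather than the library's code, showing how out_proj is coupled to the stacked q/k/v projection in nn.MultiheadAttention:

```python
# My own illustration (not the library's code): in nn.MultiheadAttention,
# out_proj consumes the concatenated head outputs, so its input channels
# are tied to embed_dim just like the stacked q/k/v projection.
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=8, num_heads=2)
print(mha.in_proj_weight.shape)   # torch.Size([24, 8]): stacked q, k, v
print(mha.out_proj.weight.shape)  # torch.Size([8, 8]): (embed_dim, embed_dim)

# Pruning out_proj's output channels changes the module's external embed_dim,
# while pruning its input channels must stay consistent with the head dims.
```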