-
- [x] MiniCPM-Llama3-V-2_5
- [x] Florence 2
- [x] Phi-3-vision
- [x] Bunny
- [x] Dolphin-vision-72b
- [x] Llava Next
- [ ] Idefics 3
- [ ] Llava Interleave
- [ ] Llava onevision
- [ ] internlm…
-
Please let us know which model architectures you would like added!
**Up to date todo list below. Please feel free to contribute any model, a PR without device mapping, ISQ, etc. will still be …
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
llava-phi-3-mini uses the Phi-3-instruct chat template. I think it is similar to the current llava-1.5, but with the Phi-3 instruct template instead of Llama 2's.
Format:
`\nQuestion \n`
The stop word is
for…
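For reference, the Phi-3-instruct chat markup can be sketched like this in Python. This is a minimal sketch, assuming the standard Phi-3 `<|user|>` / `<|assistant|>` / `<|end|>` tokens; the helper name is illustrative, and the exact markup should be confirmed against the model's tokenizer/chat-template config:

```python
# Hypothetical helper sketching the Phi-3-instruct chat format
# assumed by llava-phi-3-mini (token names per the Phi-3 template).
def build_phi3_prompt(question: str) -> str:
    """Wrap a single user turn in Phi-3-instruct chat markup."""
    return f"<|user|>\n{question}<|end|>\n<|assistant|>\n"

# Generation should stop when this token is emitted.
STOP_WORD = "<|end|>"

prompt = build_phi3_prompt("What is in this image?")
```

In practice, `tokenizer.apply_chat_template` on the model's own tokenizer is the safer way to produce this string, since it reads the template shipped with the checkpoint.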
-
Do you have any plans to support multimodal LLMs, such as MiniGPT-4/MiniGPT v2 (https://github.com/Vision-CAIR/MiniGPT-4/) and LLaVA (https://github.com/haotian-liu/LLaVA/)? That would be a significan…
-
Any chance we could see a variant of each produced with the Llava 1.6 architecture? Thanks
-
### Describe the bug
When I toggle the multimodal option, the software crashes.
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Reproduction
Turning on …
-
That model is insane for its size.
https://huggingface.co/microsoft/Phi-3-vision-128k-instruct
-
## 🐛 Bug
## To Reproduce
Using this model [Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct)
I ran into some bugs and need your help!
For phi3-v problem, w…
-
I have been playing with most of the LLaVA-based multimodal models, and I can tell that Mini-Gemini (the 13B version) is one of the best, if not the best, for its size.
Keep up the good work and h…