-
### What is the issue?
I tried to import a finetuned llama-3.2-11b-vision model, but I got "Error: unsupported architecture."
To make sure my model is not the problem, I downloaded [meta-llama/Ll…
-
Hi, thanks for your work.
When I run the demo code from https://huggingface.co/lmms-lab/LLaVA-Video-72B-Qwen2 in your LLaVA-NeXT repository, I get the following error:
```
size mismatch for vision_mode…
```
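For context, the load step in the model card's demo looks roughly like the condensed sketch below (the prompt and video handling that follow are omitted). A size mismatch on `vision_model.*` weights usually means the vision tower instantiated by the code doesn't match the checkpoint's shapes, often a `transformers` or repo version mismatch.

```python
# Condensed from the model card's demo: load LLaVA-Video-72B-Qwen2
# through the LLaVA-NeXT repo's builder. The size-mismatch error above
# is raised while these weights are loaded into the model.
from llava.model.builder import load_pretrained_model

pretrained = "lmms-lab/LLaVA-Video-72B-Qwen2"
tokenizer, model, image_processor, max_length = load_pretrained_model(
    pretrained,
    None,              # model_base: None, the checkpoint is self-contained
    "llava_qwen",      # selects the Qwen2-based LLaVA builder
    torch_dtype="bfloat16",
    device_map="auto",
)
```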
-
Reported by RauelON.
-
I saw that https://huggingface.co/Vision-CAIR/LongVU_Llama3_2_1B exists.
Is it the image part or the video part?
Could it be combined with LongVU_Llama3_2_3B (one for image, one for video)? And what are the
hardware requirements?
-
Hey, MIDIJack seems like the best solution for MIDI output with Unity! I'm working on a few MIDI experiments with Apple Vision Pro at the moment; would it be possible to recompile the libs for the …
-
Hi,
I am trying to load a Phi-3.5-3.8B-vision-instruct-Q8_0 GGUF model using the command for loading a local GGUF file:
```
./mistralrs-server -i gguf --quantized-model-id path/to/files --quantized-f…
```
-
I get more than 80,000 words when I use the Ollama Vision node, unbelievable! The Ollama model I use is llama3.2-vision:11b; I am not sure whether the problem is that model or something else.
This is quite likely to…
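To narrow down whether the model or the node is responsible, it may help to call the model directly and cap generation length; a minimal sketch with the `ollama` Python client (the image path and prompt are placeholders):

```python
# Call the model outside the node to see whether the runaway output
# comes from the model itself or from the node's settings.
import ollama

response = ollama.chat(
    model="llama3.2-vision:11b",
    messages=[{
        "role": "user",
        "content": "Describe this image briefly.",
        "images": ["./test.jpg"],  # placeholder image path
    }],
    options={"num_predict": 256},  # hard cap on generated tokens
)
print(response["message"]["content"])
```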
-
I would like to use this package with normal images instead of a frame.
How do I convert a frame to a normal image?
Or is it possible to take a screenshot of a frame? I always get the error that …
-
## 🚀 Feature
Add a field to the `ModelRecord` to indicate whether a model is a vision model, like the `Phi-vision` models.
```
data class ModelRecord(
    @SerializedName("model_url") val modelUrl: String,
    // ... existing fields ...
    // Proposed: @SerializedName("is_vision_model") val isVisionModel: Boolean = false  (field name illustrative)
)
```
-
### Feature request
Is it possible to add an example of how to finetune PaliGemma on multi-image inputs?
Something similar to [multi-image-inference](https://huggingface.co/docs/transformers/ma…
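To make the request concrete, here is a rough sketch of what a multi-image training step might look like with the `transformers` PaliGemma classes. The multi-image handling (passing a list of images for a single prompt) is exactly the part that lacks a documented example, so treat it as an assumption; the checkpoint id, file names, and `suffix` text are illustrative.

```python
# Rough sketch of one PaliGemma training step on a multi-image example.
# ASSUMPTION: a list of images for a single prompt is the multi-image
# contract; this is the undocumented behaviour the request asks about.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-pt-224"  # illustrative checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

images = [Image.open("frame_0.jpg"), Image.open("frame_1.jpg")]  # placeholder files
inputs = processor(
    text="answer en what changed between the images?",
    images=images,                   # assumed multi-image input for one prompt
    suffix="the door is now open",   # `suffix` is turned into training labels
    return_tensors="pt",
)
inputs = inputs.to(model.device)
inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)

loss = model(**inputs).loss  # labels come from `suffix` via the processor
loss.backward()
```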