-
Hi, thanks for open-sourcing your excellent work. Can I ask what FPS the video training is done at? In your video inference example you uniformly sample 16 frames, which works out to be slightly less …
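For reference, a minimal sketch of the uniform-sampling arithmetic behind this question (the clip length and FPS values below are illustrative, not from the repo):

```python
import numpy as np

def sample_frame_indices(total_frames: int, num_frames: int = 16) -> np.ndarray:
    """Uniformly sample `num_frames` frame indices across a clip."""
    return np.linspace(0, total_frames - 1, num_frames).round().astype(int)

# Example: a 10 s clip at 30 FPS has 300 frames; sampling 16 of them
# uniformly gives an effective rate of 16 / 10 = 1.6 FPS.
print(sample_frame_indices(300))
```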
-
Hi, thanks for the great benchmark! Are there any official results for LLaVA-Next-video-7B? The scores from my implementation seem too low.
-
### The model to consider.
LLaVA-NeXT-Video* (LlavaNextVideoForConditionalGeneration)
### The closest model vllm already supports.
Llava-Next (LlavaNextForConditionalGeneration)
### What's your …
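For context, a minimal sketch of the requested model class on the Transformers side (not vLLM code); the model id and prompt format follow the llava-hf Hub card, assuming transformers >= 4.42, and a dummy clip stands in for real decoded frames:

```python
import numpy as np
import torch
from transformers import (
    LlavaNextVideoForConditionalGeneration,
    LlavaNextVideoProcessor,
)

model_id = "llava-hf/LLaVA-NeXT-Video-7B-hf"
processor = LlavaNextVideoProcessor.from_pretrained(model_id)
model = LlavaNextVideoForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Dummy clip: 16 RGB frames (replace with frames decoded from a real video).
video = np.zeros((16, 336, 336, 3), dtype=np.uint8)
prompt = "USER: <video>\nWhat is happening in this video? ASSISTANT:"
inputs = processor(text=prompt, videos=video, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```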
-
Thank you for your excellent work.
I believe llava-1.6 currently supports the 7B/13B models, but do you have any plans to expand this to larger ones (such as llava-hf/llava-v1.6-34b-hf, llava-hf/llama…
-
I have a question regarding the AnyRes feature in LLaVA-NeXT-Video. The documentation mentions that AnyRes enables high-resolution image processing. However, when examining the demo code at https://gi…
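To make the question concrete, here is a standalone sketch of the AnyRes resolution-selection step (it mirrors `select_best_resolution` in `llava/mm_utils.py`; the grid values below are illustrative):

```python
def select_best_resolution(original_size, possible_resolutions):
    """Pick the grid resolution that keeps the most image content
    (max effective pixels) while wasting the least padding."""
    orig_w, orig_h = original_size
    best_fit, max_effective, min_wasted = None, 0, float("inf")
    for w, h in possible_resolutions:
        scale = min(w / orig_w, h / orig_h)
        down_w, down_h = int(orig_w * scale), int(orig_h * scale)
        effective = min(down_w * down_h, orig_w * orig_h)
        wasted = w * h - effective
        if effective > max_effective or (
            effective == max_effective and wasted < min_wasted
        ):
            best_fit, max_effective, min_wasted = (w, h), effective, wasted
    return best_fit

grid = [(336, 672), (672, 336), (672, 672), (1008, 336), (336, 1008)]
print(select_best_resolution((1280, 720), grid))  # -> (672, 672)
```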
-
The llava-next-video-34b DPO model is not performing well, whereas the 7B DPO model works fine.
I've reviewed related issues and tried **_changing the conv mode to mistral_direct_**, but the respon…
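For reference, a minimal sketch of how the conv mode is switched at inference time, assuming the repo's `llava.conversation.conv_templates` dict; the `mistral_direct` key simply follows the attempt described above:

```python
from llava.conversation import conv_templates

# Copy the template so the shared default is not mutated.
conv = conv_templates["mistral_direct"].copy()
conv.append_message(conv.roles[0], "<video>\nDescribe this video.")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()
print(prompt)
```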
-
Hi!
I wonder whether you have any plan to release the stage-1 (pretraining) checkpoint of LLaVA-NeXT-Video.
I want to finetune the model from that stage!
-
I'm not sure what's going on: after setting up the proper environment, my first inference test with a single image input to LLaVA OneVision fails with the following:
```
----------------------------------------…
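# Since the traceback above is cut off, here is a minimal single-image
# inference sketch via the Hugging Face port for comparison; the model id
# and chat-template usage follow the llava-hf/llava-onevision-qwen2-7b-ov-hf
# Hub card (the issue itself uses the LLaVA-NeXT repo's own inference path).
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "llava-hf/llava-onevision-qwen2-7b-ov-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
conversation = [
    {"role": "user",
     "content": [{"type": "image"},
                 {"type": "text", "text": "What is shown in this image?"}]},
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(
    model.device, torch.float16
)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])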
-
Check out the three YAMLs here:
https://github.com/LLaVA-VL/LLaVA-NeXT/tree/main/scripts/train
-
I encountered the following error when running the `finetune_onevision.sh` script using the `mm_projector.bin` file provided at [this link](https://huggingface.co/lmms-lab/llava-onevision-proj…
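As a first debugging step (illustrative, not a fix from the repo), one can load the projector checkpoint on CPU and print its keys and shapes to confirm they match the projector configured in the script:

```python
import torch

# Load on CPU so no GPU is needed for the check.
state_dict = torch.load("mm_projector.bin", map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```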