-
Dear authors,
Thanks for your promising work. I am trying to fine-tune LLaVA-OV on my own datasets, and I modified `finetune_onevision.sh` as follows:
```
export OMP_NUM_THREADS=8
export NCCL_IB…
```
-
torchrun --nnodes=1 --nproc_per_node=8 --master_port=25001 \
llava/train/train_mem.py \
--model_name_or_path /path/to/checkpoint_llava_med \
--data_path /path/to/your_dental_dataset.jso…
-
When I fine-tune with LoRA, the model does not converge well. The hyperparameters are set as follows:
--lora_enable True \
--deepspeed scripts/zero3.json \
--model_name_or_path …
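One thing worth checking when LoRA training fails to converge is the effective update scale, `lora_alpha / lora_r`, which multiplies every low-rank update. A quick arithmetic sketch (not the repo's code; values taken from the flags above):

```python
# LoRA applies W_eff = W + (lora_alpha / lora_r) * (B @ A),
# so with --lora_r 128 --lora_alpha 256 every adapter update is
# scaled by 2.0; halving lora_alpha (or the learning rate) halves
# the effective step size.
def lora_scaling(lora_r, lora_alpha):
    return lora_alpha / lora_r

print(lora_scaling(128, 256))  # 2.0
print(lora_scaling(128, 128))  # 1.0
```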
-
### Question
In the training scripts, `mm_vision_select_layer` is set to -2, which means the output of the penultimate layer of the CLIP vision encoder is used as the image features. I wonder why not use the last…
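For context, the selection is just negative indexing into the encoder's per-layer hidden states. A minimal sketch (plain Python; the list of layer outputs stands in for what, e.g., HF `CLIPVisionModel` returns with `output_hidden_states=True`):

```python
# hidden_states is ordered [embeddings, layer 1, ..., last layer],
# so index -2 picks the penultimate transformer layer and -1 the last.
def select_vision_features(hidden_states, mm_vision_select_layer=-2):
    return hidden_states[mm_vision_select_layer]

states = ["embeddings", "layer1", "layer2", "layer3"]
print(select_vision_features(states))      # penultimate -> layer2
print(select_vision_features(states, -1))  # last layer  -> layer3
```

A commonly cited reason for -2 is that CLIP's final layer is specialized toward its contrastive pooling objective, while the penultimate layer retains more general patch-level features; whether that holds here is for the authors to confirm.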
-
### System Info
NA
### Who can help?
@muellerz @sunma
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
- [ ] An officially supported task in the `examp…
-
What is the Qwen2-VL Max HF Demo config?
https://huggingface.co/spaces/Qwen/Qwen2-VL
In the demo from this repo, I found the setup for 7B, but is Qwen2-VL-Max the same?
Could someone please prov…
-
I fine-tuned LLaVA-OneVision from lmms-lab/llava-onevision-qwen2-7b-ov with `--lora_enable True --lora_r 128 --lora_alpha 256 --mm_projector_lr 2e-5` and have a checkpoint saved; how can I use thi…
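For what it's worth, the usual route is to load the base model, attach the LoRA checkpoint, and merge; in HF PEFT that is `PeftModel.from_pretrained(base, checkpoint_dir).merge_and_unload()`. The merge itself is just `W + (alpha / r) * (B @ A)`; a self-contained sketch of that arithmetic (toy shapes, pure Python, not the repo's loader):

```python
def matmul(X, Y):
    # naive matrix multiply for the toy example
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def merge_lora(W, A, B, lora_r, lora_alpha):
    # merged weight = W + (alpha / r) * (B @ A)
    scaling = lora_alpha / lora_r
    delta = matmul(B, A)
    return [[w + scaling * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

# 2x2 base weight with a rank-1 adapter; the scaling 2/1 here matches
# the issue's 256/128 ratio.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]   # shape (out_dim, r)
A = [[0.5, 0.5]]     # shape (r, in_dim)
print(merge_lora(W, A, B, lora_r=1, lora_alpha=2))  # [[2.0, 1.0], [0.0, 1.0]]
```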
-
We encountered an error:
`[2024-09-23 11:13:54,886] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 123969
[2024-09-23 11:13:54,887] [ERROR] [launch.py:321:sigkill_handler] `
with return co…
-
When I run
`bash scripts/video/demo/video_demo.sh ${the path of LLaVA-NeXT-Video-7B-DPO} vicuna_v1 32 2 True ${the path of video}`
I get the error:
```
Can't set vocab_size with value 32000 for …
```
-
step 1: `pretrain_projector_image_encoder.sh`
step 2: `pretrain_projector_video_encoder.sh`
step 3: `finetune_dual_encoder.sh`
step 4: `eval/vcgbench/inference/run_ddp_inference.sh`
step 5: `eval/vcgbench/gpt_e…`
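The stages above appear to be sequential (each script consumes the previous stage's output), so a driver script can chain them. A sketch, assuming repo-root-relative paths; the truncated step-5 script is omitted, and execution is commented out so the sketch only prints the order:

```shell
#!/usr/bin/env bash
set -e  # stop at the first failing stage

for stage in \
    pretrain_projector_image_encoder.sh \
    pretrain_projector_video_encoder.sh \
    finetune_dual_encoder.sh \
    eval/vcgbench/inference/run_ddp_inference.sh
do
    echo "stage: $stage"
    # bash "$stage"  # uncomment to actually run each stage
done
```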