srikant86panda closed this issue 3 weeks ago
Thanks for reporting. However, I'm unable to reproduce what you describe with the main version. Can you try installing it and running the script again?
pip install git+https://github.com/huggingface/trl.git
accelerate launch examples/scripts/dpo_visual.py \
--dataset_name HuggingFaceH4/rlaif-v_formatted \
--model_name_or_path llava-hf/llava-1.5-7b-hf \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 32 \
--dataset_num_proc 32 \
--output_dir dpo_idefics_rlaif-v \
--bf16 \
--torch_dtype bfloat16 \
--gradient_checkpointing \
--use_peft \
--lora_target_modules=all-linear \
--sanity_check
@qgallouedec Thank you for checking. As you suggested, I installed the latest code from GitHub. I can confirm that I am now able to run DPO with LLaVA-1.5-7B without any issues.
Wonderful 🤗
Error: TypeError: LlavaProcessor.__call__() got an unexpected keyword argument 'add_special_tokens'
This occurs when trying to run https://github.com/huggingface/trl/blob/main/examples/scripts/dpo_visual.py with llava-hf/llava-1.5-7b-hf.
Package information: trl 0.9.6, transformers 4.44.0
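For anyone hitting a similar "unexpected keyword argument" error, here is a minimal sketch of how one might check up front whether a callable accepts a given keyword. It uses only the standard-library inspect module. The OldProcessor and NewProcessor classes are hypothetical stand-ins written for this illustration; they are not the real transformers classes, and the sketch does not reflect how trl or transformers actually resolved the issue.

```python
import inspect


def accepts_kwarg(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    # An explicitly declared parameter that may be passed by keyword.
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all accepts any keyword at call time.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for illustration only (NOT the real LlavaProcessor).
class OldProcessor:
    def __call__(self, text=None, images=None):
        return {"text": text, "images": images}


class NewProcessor:
    def __call__(self, text=None, images=None, **kwargs):
        return {"text": text, "images": images, **kwargs}


if __name__ == "__main__":
    for proc in (OldProcessor(), NewProcessor()):
        ok = accepts_kwarg(proc, "add_special_tokens")
        print(type(proc).__name__, "accepts add_special_tokens:", ok)
```

Note that a **kwargs catch-all only means the call itself will not raise this TypeError; the callable may still reject the keyword further down. In this thread the actual fix was simply installing trl from the main branch.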