huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0

Unexpected Keyword Argument 'add_special_tokens' in LlavaProcessor with LLAVA-1.5-7B #1939

Closed. srikant86panda closed this issue 3 weeks ago

srikant86panda commented 4 weeks ago

Error: TypeError: LlavaProcessor.__call__() got an unexpected keyword argument 'add_special_tokens'. This occurs when running https://github.com/huggingface/trl/blob/main/examples/scripts/dpo_visual.py with llava-hf/llava-1.5-7b-hf.

Package information: trl 0.9.6, transformers 4.44.0
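For context, a minimal sketch of the kind of call that triggers this error on those versions (the model id is taken from the report; the exact call site inside dpo_visual.py and the DPO data collator may differ):

from transformers import AutoProcessor

# Load the multimodal processor for the model from the report.
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")

# On transformers 4.44.0, LlavaProcessor.__call__ does not accept tokenizer-style
# keyword arguments, so forwarding add_special_tokens raises:
#   TypeError: LlavaProcessor.__call__() got an unexpected keyword argument 'add_special_tokens'
batch = processor(text="USER: <image>\nWhat is shown here? ASSISTANT:", add_special_tokens=False)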

qgallouedec commented 4 weeks ago

Thanks for reporting. However, I'm unable to reproduce what you describe with the main version. Can you try with it?

pip install git+https://github.com/huggingface/trl.git
accelerate launch examples/scripts/dpo_visual.py \
    --dataset_name HuggingFaceH4/rlaif-v_formatted \
    --model_name_or_path llava-hf/llava-1.5-7b-hf \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 32 \
    --dataset_num_proc 32 \
    --output_dir dpo_idefics_rlaif-v \
    --bf16 \
    --torch_dtype bfloat16 \
    --gradient_checkpointing \
    --use_peft \
    --lora_target_modules=all-linear \
    --sanity_check
srikant86panda commented 3 weeks ago

@qgallouedec Thank you for checking. As per your suggestion, I tried installing the latest code from GitHub. I can confirm that I am now able to perform DPO with LLAVA-1.5-7B without any issues.
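For anyone hitting the same error, a quick sanity check after installing from source is to print the installed versions (a sketch; the exact dev version string will vary, but a ".dev" suffix on trl typically indicates an install from the main branch):

import trl
import transformers

# Confirm the source install is the one being picked up.
print(trl.__version__, transformers.__version__)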

qgallouedec commented 3 weeks ago

Wonderful 🤗