modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0

How to load a model from DPO checkpoint #1786

Closed Lopa07 closed 2 months ago

Lopa07 commented 2 months ago

I ran a DPO fine-tuning using the default MP (device-map) command provided here.

# MP(device map)
CUDA_VISIBLE_DEVICES=0,1 \
swift rlhf \
    --rlhf_type dpo \
    --model_type llava1_6-mistral-7b-instruct \
    --beta 0.1 \
    --sft_beta 0.1 \
    --sft_type  lora \
    --dataset rlaif-v#1000 \
    --num_train_epochs  2  \
    --lora_target_modules  DEFAULT  \
    --gradient_checkpointing  true  \
    --batch_size  1  \
    --learning_rate  5e-5  \
    --gradient_accumulation_steps  16  \
    --warmup_ratio  0.03  \
    --save_total_limit  2

The DPO checkpoint does not contain a config.json, so I cannot load the model with model = AutoModelForCausalLM.from_pretrained(checkpoint_path).

Error:

(py3.9) m.banerjee@PHYVDGPU02PRMV:/VDIL_COREML/m.banerjee/ms-swift$ python
Python 3.9.19 (main, May  6 2024, 19:43:03) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import AutoModelForCausalLM
>>> checkpoint_path = '/VDIL_COREML/m.banerjee/ms-swift/output/llava1_6-mistral-7b-instruct/v3-20240821-135744/checkpoint-50'
>>> model = AutoModelForCausalLM.from_pretrained(checkpoint_path)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/VDIL_COREML/m.banerjee/anaconda3/envs/py3.9/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 524, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/VDIL_COREML/m.banerjee/anaconda3/envs/py3.9/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 976, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/VDIL_COREML/m.banerjee/anaconda3/envs/py3.9/lib/python3.9/site-packages/transformers/configuration_utils.py", line 632, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/VDIL_COREML/m.banerjee/anaconda3/envs/py3.9/lib/python3.9/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
    resolved_config_file = cached_file(
  File "/VDIL_COREML/m.banerjee/anaconda3/envs/py3.9/lib/python3.9/site-packages/transformers/utils/hub.py", line 373, in cached_file
    raise EnvironmentError(
OSError: /VDIL_COREML/m.banerjee/ms-swift/output/llava1_6-mistral-7b-instruct/v3-20240821-135744/checkpoint-50 does not appear to have a file named config.json. Checkout 'https://huggingface.co//VDIL_COREML/m.banerjee/ms-swift/output/llava1_6-mistral-7b-instruct/v3-20240821-135744/checkpoint-50/tree/None' for available files.

The DPO checkpoint directory contents are:

(py3.9) m.banerjee@PHYVDGPU02PRMV:/VDIL_COREML/m.banerjee/ms-swift$ ls -l output/llava1_6-mistral-7b-instruct/v3-20240821-135744/checkpoint-50/
total 247672
-rw-r--r-- 1 m.banerjee sra_vdil-coreml       742 Aug 21 14:20 adapter_config.json
-rw-r--r-- 1 m.banerjee sra_vdil-coreml  84378528 Aug 21 14:20 adapter_model.safetensors
-rw-r--r-- 1 m.banerjee sra_vdil-coreml        67 Aug 21 14:20 additional_config.json
-rw-r--r-- 1 m.banerjee sra_vdil-coreml       279 Aug 21 14:20 configuration.json
-rw-r--r-- 1 m.banerjee sra_vdil-coreml       185 Aug 21 14:20 generation_config.json
-rw-r--r-- 1 m.banerjee sra_vdil-coreml 168149138 Aug 21 14:20 optimizer.pt
-rw-r--r-- 1 m.banerjee sra_vdil-coreml      5145 Aug 21 14:20 README.md
-rw-r--r-- 1 m.banerjee sra_vdil-coreml     14244 Aug 21 14:20 rng_state.pth
-rw-r--r-- 1 m.banerjee sra_vdil-coreml      1064 Aug 21 14:20 scheduler.pt
-rw-r--r-- 1 m.banerjee sra_vdil-coreml     10920 Aug 21 14:20 sft_args.json
-rw-r--r-- 1 m.banerjee sra_vdil-coreml      9644 Aug 21 14:20 trainer_state.json
-rw-r--r-- 1 m.banerjee sra_vdil-coreml      8248 Aug 21 14:20 training_args.bin

My hardware and system info:
CUDA version: 12.4
System: Ubuntu 22.04.3 LTS
GPU
torch==2.4.0, transformers==4.44.0, trl==0.9.6, peft==0.12.0
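The directory listing above shows adapter_config.json and adapter_model.safetensors but no config.json, which is the signature of a PEFT (LoRA) adapter checkpoint rather than a full model, and is why AutoModelForCausalLM.from_pretrained fails here. A minimal sketch of that distinction (the helper name is mine, not part of ms-swift or transformers):

```python
import os
import tempfile

def checkpoint_kind(path):
    """Hypothetical helper: classify a checkpoint directory.

    A full Transformers checkpoint ships config.json; a PEFT/LoRA adapter
    checkpoint ships adapter_config.json instead, so from_pretrained on the
    adapter directory alone cannot reconstruct the model.
    """
    if os.path.isfile(os.path.join(path, "config.json")):
        return "full model"
    if os.path.isfile(os.path.join(path, "adapter_config.json")):
        return "peft adapter"
    return "unknown"

# Example: a directory laid out like checkpoint-50 above
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "adapter_config.json"), "w").close()
    print(checkpoint_kind(d))  # prints "peft adapter"
```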

hjh0119 commented 2 months ago

You may have to merge the LoRA weights first with swift export --ckpt_dir xxx --merge_lora true.
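A sketch of that workflow, assuming the checkpoint path from the report above (in my experience swift export typically writes the merged weights, including config.json, to a sibling directory; the exact output path may differ by version):

```shell
# Merge the LoRA adapter into the base model weights
swift export \
    --ckpt_dir output/llava1_6-mistral-7b-instruct/v3-20240821-135744/checkpoint-50 \
    --merge_lora true

# The merged checkpoint is typically written next to the original,
# e.g. .../checkpoint-50-merged, and can then be loaded with the
# usual from_pretrained call since it now contains config.json.
```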

Lopa07 commented 2 months ago

This solved the issue. Thank you so much!