huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.28k stars 367 forks

Impossible to load local pretrained model for DPO Alignment #54

Closed mathis-lambert closed 8 months ago

mathis-lambert commented 8 months ago

Hi,

I'm using the Alignment Handbook to fine-tune Mistral on a custom dataset. The SFT full fine-tuning finished successfully and saved my pretrained model to ./data/zephyr-7b-sft-full. Now I want to align the model with DPO. Loading the dataset works fine, but it's impossible to load the pretrained model saved on my local machine. Although I've tried both absolute and relative paths, it always throws this error:

/root/miniconda3/lib/python3.11/site-packages/datasets/table.py:1421: FutureWarning: promote has been superseded by mode='default'.
  table = cls._concat_blocks(blocks, axis=0)
Traceback (most recent call last):
  File "/workspace/work/finetuning/alignment-handbook/scripts/run_dpo.py", line 224, in <module>
    main()
  File "/workspace/work/finetuning/alignment-handbook/scripts/run_dpo.py", line 119, in main
    if is_adapter_model(model, model_args.model_revision):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.11/site-packages/alignment/model_utils.py", line 99, in is_adapter_model
    repo_files = list_repo_files(model_name_or_path, revision=revision)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.11/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "/root/miniconda3/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './data/zephyr-7b-sft-full'. Use `repo_type` argument if needed.

No matter what I change, the error stays the same.

Here is the /dpo/config_full.yaml file:

# Model arguments
model_name_or_path: ./data/zephyr-7b-sft-full

# Data training arguments
# For definitions, see: src/h4/training/config.py
dataset_mixer:
  /workspace/work/finetuning/tickets_dpo/: 1.0
dataset_splits:
- train
- test
preprocessing_num_workers: 12

# DPOTrainer arguments
bf16: true
beta: 0.1
do_eval: true
evaluation_strategy: steps
eval_steps: 100
gradient_accumulation_steps: 1
gradient_checkpointing: true
#hub_model_id: zephyr-7b-dpo-full
learning_rate: 5.0e-7
log_level: info
logging_steps: 10
lr_scheduler_type: linear
max_length: 1024
max_prompt_length: 1024
num_train_epochs: 5
optim: rmsprop
output_dir: data/zephyr-7b-dpo-full
per_device_train_batch_size: 8
per_device_eval_batch_size: 4
push_to_hub: false
save_strategy: "no"
save_total_limit: null
seed: 42
warmup_ratio: 0.1

Please help :)

huchinlp commented 8 months ago

Since you are doing full fine-tuning, you can simply skip the PEFT model check by commenting out the call to is_adapter_model.
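For example, instead of removing the call outright, you could guard it with a local-path check (a rough sketch; the helper name is mine, not from the handbook):

```python
import os


def should_skip_adapter_check(model_name_or_path: str) -> bool:
    """Return True when the checkpoint is a local directory.

    A full fine-tune saved to disk (e.g. ./data/zephyr-7b-sft-full) is not
    a Hub adapter repo, so the Hub-backed is_adapter_model check can be
    skipped for such paths without losing the check for real repo ids.
    """
    return os.path.isdir(model_name_or_path)
```

Then in run_dpo.py the check becomes `if not should_skip_adapter_check(...) and is_adapter_model(...)`.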

mathis-lambert commented 8 months ago

@huchinlp Yep, I had just noticed that, but I don't want to skip this step.

The issue has been fixed by PR #49, thanks to @dmilcevski.

Everything looks good now, and DPO alignment is running! Thanks 🙏
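For anyone hitting this on an older checkout: the root cause is that `huggingface_hub.list_repo_files` validates its argument as a Hub repo id, so a local path like `./data/zephyr-7b-sft-full` fails validation before the filesystem is ever consulted. A local-path-aware version of the check looks roughly like this (an illustration, not the exact code from PR #49):

```python
import os


def is_adapter_model(model_name_or_path: str, revision: str = "main") -> bool:
    """Return True if the checkpoint ships a PEFT adapter config.

    Local directories are inspected directly, so paths such as
    './data/zephyr-7b-sft-full' never reach the Hub repo-id validator.
    """
    if os.path.isdir(model_name_or_path):
        repo_files = os.listdir(model_name_or_path)
    else:
        # Deferred import: only needed for genuine Hub repo ids.
        from huggingface_hub import list_repo_files

        repo_files = list_repo_files(model_name_or_path, revision=revision)
    return "adapter_config.json" in repo_files
```

A full fine-tune saved locally contains config.json but no adapter_config.json, so this returns False without ever touching the Hub.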