AGI-Edgerunners / LLM-Adapters

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
https://arxiv.org/abs/2304.01933
Apache License 2.0

Can't fine tune/train when the model is loaded in 8bit #55

Closed Wonigox closed 8 months ago

Wonigox commented 8 months ago

I tried to use the --load_8bit argument to train large models, but the Trainer does not seem to recognize the PEFT adapters and gives the following error: ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details

The exact command I used was CUDA_VISIBLE_DEVICES=0 python finetune.py --base_model 'bigscience/bloomz-1b1' --data_path 'ft-training_set/math_10k.json' --output_dir './trained_models/bloomz-1b1-bottleneck/' --batch_size 16 --micro_batch_size 4 --num_epochs 3 --learning_rate 3e-4 --cutoff_len 256 --val_set_size 0 --eval_step 80 --save_step 80 --adapter_name bottleneck --load_8bit --target_modules '["dense_4h_to_h"]' (I used bloomz-1b1 just for testing, since larger models take too long to download).

Is there some other step I need to take to train in 8-bit, or might this be an incompatibility between the custom PEFT package and the transformers package?
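For context, my understanding of the pattern the error message is asking for looks roughly like the sketch below. It assumes the standard peft package rather than this repo's bundled copy, and the model name, LoRA settings, and target module are only placeholders:

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with bitsandbytes 8-bit weights
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-1b1",
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Cast norms / enable input gradients so the quantized model can be trained
model = prepare_model_for_kbit_training(model)
# Wrap the quantized base model in a PeftModel with trainable adapters
model = get_peft_model(
    model,
    LoraConfig(r=8, lora_alpha=16, target_modules=["dense_4h_to_h"], task_type="CAUSAL_LM"),
)
# Passing this PeftModel to the Trainer should satisfy the "purely quantized models" check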

The versions of the relevant packages are as follows:

bitsandbytes              0.42.0
torch                     2.0.1
transformers              4.37.0.dev0
cuda                      11.7

Thanks

HZQ950419 commented 8 months ago

Hi,

Please try without --load_8bit. It works for me with bigscience/bloomz-7b1, but bigscience/bloomz-1b1 still doesn't; we are trying to fix that. Other models such as LLaMA-7B should work well.

Wonigox commented 8 months ago

I just tried again with LLaMA-7B and still can't get it to work. This is the command I used:

CUDA_VISIBLE_DEVICES=0,1,2 python finetune.py \
--base_model 'yahma/llama-7b-hf' \
--data_path 'ft-training_set/math_10k.json' \
--output_dir './trained_models/llama-bottleneck-math10k/' \
--batch_size 16 \
--micro_batch_size 4 \
--num_epochs 3 \
--learning_rate 3e-4 \
--cutoff_len 256 \
--val_set_size 120 \
--eval_step 80 \
--save_step 80 \
--load_8bit \
--adapter_name bottleneck \
--target_modules '["down_proj"]' 

Is there anything wrong here?

HZQ950419 commented 8 months ago

Hi,

Please try without --load_8bit

Wonigox commented 8 months ago

It works without --load_8bit, but I would also like to try fine-tuning in 8-bit mode to test how much GPU memory it saves, how good the tuned model is, how long fine-tuning takes, etc. Is there really no way to perform fine-tuning/training while the model is loaded in 8-bit?
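For the memory comparison, peak usage can be read with torch's built-in counters. This is a generic sketch, not something that already exists in finetune.py:

import torch

torch.cuda.reset_peak_memory_stats()
# ... run a few training steps ...
print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GiB")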

HZQ950419 commented 8 months ago

Maybe you can try with earlier bitsandbytes and transformers versions, such as bitsandbytes==0.37.2 and transformers==4.34.0 or 4.35.0.
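For example (this exact combination is untested):

pip install bitsandbytes==0.37.2 transformers==4.35.0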

Wonigox commented 8 months ago

Upon further inspection, I found that the cause of this problem is in lines 405 and 411-416 of transformers/trainer.py:

405:    _is_peft_model = is_peft_available() and isinstance(model, PeftModel)
406:    _is_quantized_and_base_model = getattr(model, "is_quantized", False) and not getattr(
407:        model, "_hf_peft_config_loaded", False
408:    )
409:
410:    # At this stage the model is already loaded
411:    if _is_quantized_and_base_model and not _is_peft_model:
412:        raise ValueError(
413:            "You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of"
414:            " the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft"
415:            " for more details"
416:        )
417:    elif _is_quantized_and_base_model and not getattr(model, "_is_quantized_training_enabled", False):
418:        raise ValueError(
419:            "The model you want to train is loaded in 8-bit precision.  if you want to fine-tune an 8-bit"
420:            " model, please make sure that you have installed `bitsandbytes>=0.37.0`. "
421:        )

The is_peft_available() function used to set _is_peft_model requires that PEFT be installed as a package, so it returns False here because this repo uses its own bundled copy instead (and PeftModel is likewise undefined in trainer.py, since the peft package cannot be imported there).
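One way to confirm this is to check what transformers itself reports, assuming is_peft_available is importable from transformers.utils in this version; it should print False when peft is not installed as a package:

python -c "from transformers.utils import is_peft_available; print(is_peft_available())"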

As a workaround, I just replaced line 405 with _is_peft_model = True, and now training seems to work fine when using --load_8bit.
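Judging only from the quoted check (I haven't tested this), another option that avoids editing the installed transformers package would be to set the flag the check looks at directly in finetune.py, right before the Trainer is created:

# Untested alternative: makes _is_quantized_and_base_model evaluate to False
# (see lines 406-408 above), so neither ValueError branch is reached.
model._hf_peft_config_loaded = True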

HZQ950419 commented 8 months ago

Thanks for the information! You can also run cd peft and pip install -e . to install our bundled PEFT package so that the transformers check passes. Your solution also works.

Please let us know if you have further questions!