hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

scripts/pissa_init.py to initialize PiSSA for a quantized model. #4463

Closed. Katehuuh closed this issue 3 days ago.

Katehuuh commented 3 days ago

Reminder

System Info

Reproduction

--finetuning_type lora + --quantization_bit 4 + --pissa_init True

(venv) C:\LLaMA-Factory>set CUDA_VISIBLE_DEVICES=0 && llamafactory-cli train --stage sft --do_train True --model_name_or_path NousResearch/Meta-Llama-3-8B-Instruct --preprocessing_num_workers 16 --finetuning_type lora --quantization_bit 4 --template alpaca --rope_scaling linear --flash_attn fa2 --dataset_dir data --dataset identity --cutoff_len 8192 --learning_rate 5e-05 --num_train_epochs 1.0 --max_samples 100000 --per_device_train_batch_size 1 --gradient_accumulation_steps 1 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 5 --save_steps 1000 --warmup_steps 0 --neftune_noise_alpha 5 --optim adamw_torch --packing False --upcast_layernorm True --report_to none --output_dir saves\LLaMA3-8B-Chat\lora\QLoRA_identity --bf16 True --plot_loss True --ddp_timeout 180000000 --include_num_input_tokens_seen True --lora_rank 512 --lora_alpha 1024 --lora_dropout 0.15 --create_new_adapter True --pissa_init True --pissa_convert True --lora_target all --pissa_iter 4
...
    exec(code, run_globals)
  File "C:\LLaMA-Factory\venv\Scripts\llamafactory-cli.exe\__main__.py", line 7, in <module>
    sys.exit(main())
  File "C:\LLaMA-Factory\src\llamafactory\cli.py", line 110, in main
    run_exp()
  File "C:\LLaMA-Factory\src\llamafactory\train\tuner.py", line 44, in run_exp
    model_args, data_args, training_args, finetuning_args, generating_args = get_train_args(args)
  File "C:\LLaMA-Factory\src\llamafactory\hparams\parser.py", line 230, in get_train_args
    _verify_model_args(model_args, finetuning_args)
  File "C:\LLaMA-Factory\src\llamafactory\hparams\parser.py", line 94, in _verify_model_args
    raise ValueError("Please use scripts/pissa_init.py to initialize PiSSA for a quantized model.")
ValueError: Please use scripts/pissa_init.py to initialize PiSSA for a quantized model.
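
For context: PiSSA initialization runs an SVD on the full-precision weight matrices, which is impossible once the weights are already packed into 4-bit, so the parser rejects this flag combination and defers to a separate script. A minimal sketch of the decomposition itself (illustrative shapes and names, not the script's code):

```python
import torch

W = torch.randn(4096, 4096)  # a full-precision weight matrix
r = 128                      # adapter rank

# PiSSA: W = U S V^T; the top-r singular components seed the adapter.
U, S, Vh = torch.linalg.svd(W, full_matrices=False)
lora_A = torch.diag(S[:r].sqrt()) @ Vh[:r]   # (r, in)
lora_B = U[:, :r] @ torch.diag(S[:r].sqrt()) # (out, r)

# The residual is what would then be quantized to NF4 for QLoRA training.
W_res = W - lora_B @ lora_A
```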

Expected behavior

QLoRA and PiSSA are not directly supported together, and scripts/pissa_init.py is not an available script. Should we use initilize_qpissa.py instead?

(venv) C:\LLaMA-Factory>python initilize_qpissa.py --base_model_dir NousResearch/Meta-Llama-3-8B-Instruct --output_path Meta-Llama-3-8B-Instruct-pissa-4bit-r128-iter4 --iter 4
bin C:\LLaMA-Factory\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda121.dll
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 4/4 [01:12<00:00, 18.15s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
  0%|                                                                                  | 1/739 [00:03<41:03,  3.34s/it]
Traceback (most recent call last):
  File "C:\LLaMA-Factory\initilize_qpissa.py", line 86, in <module>
    base_layer_in_4bits, base_layer, lora_A, lora_B = pissa_quant(value, args.rank, args.iter)
  File "C:\LLaMA-Factory\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\LLaMA-Factory\initilize_qpissa.py", line 57, in pissa_quant
    weight_nf4, weight_dequantized = quantize_and_dequantized(res)
  File "C:\LLaMA-Factory\initilize_qpissa.py", line 44, in quantize_and_dequantized
    weight_dequantized = bnb.functional.dequantize_4bit(
  File "C:\LLaMA-Factory\venv\lib\site-packages\bitsandbytes\functional.py", line 1018, in dequantize_4bit
    assert absmax is not None and out is not None
AssertionError
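
The assertion fires because `bnb.functional.dequantize_4bit` was reached with `quant_state=None`; in that case bitsandbytes requires `absmax` and `out` to be passed explicitly. A minimal fix sketch, assuming the bitsandbytes functional API; the helper name mirrors the script's, but the body is illustrative:

```python
import torch
import bitsandbytes as bnb

def quantize_and_dequantized(weight: torch.Tensor):
    # quantize_4bit returns the packed NF4 tensor plus a QuantState holding
    # absmax and block metadata; it must run on a CUDA tensor.
    weight_nf4, quant_state = bnb.functional.quantize_4bit(
        weight.cuda(), quant_type="nf4"
    )
    # Passing the QuantState back avoids the `absmax is not None` assertion.
    weight_dequantized = bnb.functional.dequantize_4bit(
        weight_nf4, quant_state, quant_type="nf4"
    ).to(torch.float32)
    return weight_nf4, weight_dequantized
```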

Can the quantized-conversion step of PiSSA be implemented directly in LLaMA-Factory for QLoRA?

Others

No response

hiyouga commented 3 days ago

https://github.com/hiyouga/LLaMA-Factory/blob/main/scripts/pissa_init.py
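
For reference, the two-step workflow the error message points at, following PEFT's documented PiSSA example (the linked script is authoritative; the model path, rank, and output paths below are illustrative):

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Step 1: PiSSA needs full-precision weights for the SVD, so load in bf16.
model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16
)
config = LoraConfig(
    r=128,
    lora_alpha=128,
    init_lora_weights="pissa_niter_4",  # fast PiSSA init, 4 subspace iterations
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
peft_model = get_peft_model(model, config)

# Step 2: save the PiSSA adapter; resetting init_lora_weights prevents the
# SVD from re-running when the adapter is loaded later.
peft_model.peft_config["default"].init_lora_weights = True
peft_model.save_pretrained("pissa-llama-3-8b/adapter")

# Unloading leaves the residual base model (the principal components now live
# in the adapter); this residual model is what gets quantized for QLoRA.
residual_model = peft_model.unload()
residual_model.save_pretrained("pissa-llama-3-8b/residual_model")
```

QLoRA fine-tuning then loads the residual model in 4-bit and attaches the saved PiSSA adapter.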