richardburleigh opened this issue 1 year ago
Did you apply the monkey patch to peft?
@johnsmith0031 Sorry, can you please elaborate? I've been stuck trying to finetune a GPTQ model for days.
I'm running finetune.py directly; where would the monkey patch be applied?
python finetune.py ./output.txt \
--ds_type=txt \
--lora_out_dir=./test/ \
--llama_q4_config_dir=./TheBloke_Stable-Platypus2-13B-GPTQ/config.json \
--llama_q4_model=./TheBloke_Stable-Platypus2-13B-GPTQ/model.safetensors \
--mbatch_size=1 \
--batch_size=1 \
--epochs=3 \
--lr=3e-4 \
--cutoff_len=256 \
--lora_r=8 \
--lora_alpha=16 \
--lora_dropout=0.05 \
--warmup_steps=5 \
--save_steps=50 \
--save_total_limit=3 \
--logging_steps=5 \
--groupsize=128 \
--xformers \
--backend=cuda
It's in the finetune.py file:
from alpaca_lora_4bit.monkeypatch.peft_tuners_lora_monkey_patch import replace_peft_model_with_int4_lora_model
replace_peft_model_with_int4_lora_model()
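Note that the patch only takes effect if it runs before the PEFT model is constructed, which is why that import and call sit near the top of finetune.py. Below is a minimal sketch of the required ordering; the paths and LoRA hyperparameters are placeholders taken from the command above, and the loader signature is an assumption based on this repo's autograd_4bit module:

# The patch must run before get_peft_model, so that peft learns how to
# wrap Autograd4bitQuantLinear layers with LoRA instead of rejecting them.
from alpaca_lora_4bit.monkeypatch.peft_tuners_lora_monkey_patch import (
    replace_peft_model_with_int4_lora_model,
)
replace_peft_model_with_int4_lora_model()

from peft import LoraConfig, get_peft_model
from alpaca_lora_4bit.autograd_4bit import load_llama_model_4bit_low_ram

# Placeholder paths; loader signature assumed from this repo.
model, tokenizer = load_llama_model_4bit_low_ram(
    "./TheBloke_Stable-Platypus2-13B-GPTQ/config.json",
    "./TheBloke_Stable-Platypus2-13B-GPTQ/model.safetensors",
    groupsize=128,
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # fails without the patch applied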
The two lines you quoted are intact. Is it possible that the monkeypatch failed to be applied?
Solution found at #148 (see this comment).
I'm getting the following error when trying to load a model using load_llama_model_4bit_low_ram_and_offload. Any ideas?
Target module Autograd4bitQuantLinear() is not supported. Currently, only torch.nn.Linear and Conv1D are supported.
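That error is peft's LoRA wrapper rejecting the quantized layers: out of the box, peft only knows how to attach LoRA to torch.nn.Linear and Conv1D, which is exactly what the monkey patch above works around. A minimal sketch of the fix, assuming the same call ordering; the paths are placeholders, and the offload loader's keyword arguments are assumptions based on this repo:

# Apply the patch before loading; otherwise peft raises
# "Target module Autograd4bitQuantLinear() is not supported".
from alpaca_lora_4bit.monkeypatch.peft_tuners_lora_monkey_patch import (
    replace_peft_model_with_int4_lora_model,
)
replace_peft_model_with_int4_lora_model()

from alpaca_lora_4bit.autograd_4bit import (
    load_llama_model_4bit_low_ram_and_offload,
)

# Placeholder paths; the max_memory caps are illustrative assumptions.
model, tokenizer = load_llama_model_4bit_low_ram_and_offload(
    "./TheBloke_Stable-Platypus2-13B-GPTQ/config.json",
    "./TheBloke_Stable-Platypus2-13B-GPTQ/model.safetensors",
    groupsize=128,
    max_memory={0: "12GiB", "cpu": "48GiB"},
)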