johnsmith0031 / alpaca_lora_4bit


Train lora with embed_tokens and lm_head #17

Open KohakuBlueleaf opened 1 year ago

KohakuBlueleaf commented 1 year ago

I have tried training LoRA + embed_tokens + lm_head with peft in 8-bit, and it works great. Since bitsandbytes can be told to skip quantizing specific layers, that setup was easy to get working. I'm curious: can I use this repo to train LoRA + embed_tokens + lm_head while keeping the rest of the model in 4-bit?
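For reference, a minimal sketch of the 8-bit setup described above, using transformers + peft + bitsandbytes. The checkpoint name and the choice of `target_modules` are placeholders, not something specified in this thread:

```python
# Sketch: LoRA + trainable embed_tokens/lm_head on an 8-bit base model.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    # Keep these layers out of 8-bit so they stay in fp16 and can be trained
    llm_int8_skip_modules=["embed_tokens", "lm_head"],
)

model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # placeholder checkpoint
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    # Train and save the embedding and output head alongside the LoRA adapters
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```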

johnsmith0031 commented 1 year ago

It's technically feasible: you can reconstruct the fp16 weights from the 4-bit weights, replace the specified modules in the original model with them, and set requires_grad to True on those modules.
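A rough sketch of that idea. `dequantize_to_fp16` is a hypothetical helper standing in for however the 4-bit weights are actually unpacked, and the `model.model.embed_tokens` path assumes a LLaMA-style module layout; both are assumptions, not code from this repo:

```python
# Sketch: swap selected 4-bit modules back to fp16 so they can be trained.
import torch
import torch.nn as nn

def make_trainable_fp16_linear(quant_module, dequantize_to_fp16):
    """Rebuild a trainable fp16 nn.Linear from a 4-bit quantized linear module."""
    weight = dequantize_to_fp16(quant_module)   # (out_features, in_features) fp16 tensor
    linear = nn.Linear(weight.shape[1], weight.shape[0], bias=False)
    linear.weight = nn.Parameter(weight.to(torch.float16))
    linear.weight.requires_grad_(True)          # make it trainable
    return linear

def unfreeze_embed_and_head(model, dequantize_to_fp16):
    # If lm_head was quantized, replace it with a trainable fp16 linear
    model.lm_head = make_trainable_fp16_linear(model.lm_head, dequantize_to_fp16)
    # If embed_tokens was left in fp16 by quantization, it only needs grads enabled
    model.model.embed_tokens.weight.requires_grad_(True)
    return model
```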

KohakuBlueleaf commented 1 year ago

@johnsmith0031 Will try it! Also, when I use the newest GPTQ repo to build the 4-bit weights, the forward pass fails (it complains about incompatible argument types).

Is there a tested commit and a tested model checkpoint? And should I set groupsize when I pack the model checkpoint?

johnsmith0031 commented 1 year ago

Their code targets the new checkpoint format, but I haven't seen any 4-bit LLaMA checkpoints quantized with the new method released yet...

KohakuBlueleaf commented 1 year ago

@johnsmith0031 I cloned it just a few hours ago, but my checkpoint can't even run with the newest code. Maybe I need to requantize again...

BTW, using the .cu kernel from your repo should be fine, right?

johnsmith0031 commented 1 year ago

Yes, it uses the old method.

KohakuBlueleaf commented 1 year ago

@johnsmith0031 So I should use an old GPTQ commit and an old checkpoint to match your .cu kernel? Do I understand correctly?

johnsmith0031 commented 1 year ago

Yes. As long as you compile the CUDA extension from this repo and copy autograd_4bit.py into the GPTQ path, you can use it with old checkpoints.
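For completeness, a minimal loading sketch after that setup. The loader name comes from this repo's autograd_4bit.py, but the exact signature may differ across commits, and the paths are placeholders:

```python
# Sketch: load an old-format 4-bit checkpoint through autograd_4bit,
# assuming the CUDA extension is compiled and autograd_4bit.py has been
# copied into the GPTQ working directory.
from autograd_4bit import load_llama_model_4bit_low_ram

config_path = "./llama-7b-hf/"      # HF config/tokenizer directory (placeholder)
model_path = "./llama-7b-4bit.pt"   # old-format 4-bit checkpoint (placeholder)

model, tokenizer = load_llama_model_4bit_low_ram(config_path, model_path)
model.eval()
```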