Open KohakuBlueleaf opened 1 year ago
It's technically feasible, because you can reconstruct the fp16 weights from the 4-bit weights, replace the specified modules in the original model, and set requires_grad to True.
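A minimal sketch of that idea, assuming a simple per-channel scale/zero quantization scheme (the real GPTQ packed layout differs, and `dequantize_4bit` / `make_trainable_linear` are hypothetical names, not this repo's API):

```python
import torch
import torch.nn as nn

def dequantize_4bit(qweight, scales, zeros):
    # Reconstruct full-precision weights from 4-bit integers:
    # w = (q - zero) * scale, with per-output-channel scale/zero.
    # (On GPU you would typically cast the result to .half().)
    return (qweight.float() - zeros[:, None]) * scales[:, None]

def make_trainable_linear(qweight, scales, zeros):
    # Replace a quantized module with a plain nn.Linear holding the
    # reconstructed weights, gradients enabled.
    out_features, in_features = qweight.shape
    layer = nn.Linear(in_features, out_features, bias=False)
    layer.weight.data = dequantize_4bit(qweight, scales, zeros)
    layer.weight.requires_grad = True
    return layer

# Toy 4-bit weights: integer values in [0, 15].
q = torch.randint(0, 16, (8, 16))
scales = torch.full((8,), 0.1)
zeros = torch.full((8,), 8.0)
layer = make_trainable_linear(q, scales, zeros)
y = layer(torch.randn(2, 16))
```

Once the module is swapped in, the optimizer sees an ordinary fp32/fp16 weight, so standard training machinery applies.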
@johnsmith0031 Will try it! Also, when I use the newest GPTQ repo to build the 4-bit weights, it can't run the forward pass normally (it complains about incompatible argument types).
Is there a tested commit and a tested model ckpt? And should I set groupsize when I pack the model ckpt?
Their code is for the new checkpoints, but for now I haven't found any 4-bit llama checkpoints quantized with the new method released yet...
@johnsmith0031 I just cloned it a few hours ago, but my ckpt can't even run with the newest code. Maybe I need to requantize again...
BTW, using the .cu from your repo should be fine, right?
Yes, it uses the old method.
@johnsmith0031 So I should use an old GPTQ commit and an old ckpt to match your .cu? Do I understand correctly?
Yes. As long as you compile the extension from this repo and copy autograd_4bit.py to the GPTQ path, you can use it with old ckpts.
I have tried training lora+embed+head with peft in 8-bit, and it works great. Since bnb can be set to ignore specific layers, I can do this easily there. I'm curious whether I can use this repo to train lora+embed+head while keeping the other parts in 4-bit?
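The selective-training part can be sketched independently of the quantization backend: freeze everything, then re-enable gradients only for the embedding, the head, and any LoRA parameters. `TinyLM` below is a hypothetical stand-in for the real model (in practice the frozen blocks would be the 4-bit modules):

```python
import torch.nn as nn

# Hypothetical tiny model standing in for a quantized LLM.
class TinyLM(nn.Module):
    def __init__(self, vocab=32, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.block = nn.Linear(dim, dim)    # would stay 4-bit / frozen
        self.lm_head = nn.Linear(dim, vocab)

model = TinyLM()

# Freeze everything first...
for p in model.parameters():
    p.requires_grad = False

# ...then re-enable only embed + head (and LoRA params, if present),
# mirroring how bnb can be told to skip specific layers.
for name, p in model.named_parameters():
    if name.startswith(("embed", "lm_head")) or "lora_" in name:
        p.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

The optimizer is then built only from the parameters with `requires_grad=True`, so the 4-bit blocks never receive updates.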