bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

Can bitsandbytes be decoupled from huggingface? #549

Closed — RanchiZhao closed this issue 1 year ago

RanchiZhao commented 1 year ago

I'd like to know if bitsandbytes can be decoupled from huggingface, or if they have to be used together. In addition, is the int4 quantization process completed during the get_accelerate_model phase and unrelated to the subsequent training with Trainer? Or, at what point does the dequantization process occur? This is because I've noticed that int4 quantization alters the shape of the linear layer.
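To illustrate why the linear layer's shape changes: 4-bit quantization stores two codes per byte, so the packed weight has half as many elements (and typically a flattened shape), while dequantization back to the compute dtype happens on the fly in the forward pass. Below is a minimal numpy sketch of absmax int4 quantization with nibble packing — an illustration of the general idea only, not the actual bitsandbytes `Linear4bit`/NF4 implementation:

```python
import numpy as np

def quantize_4bit(w):
    """Symmetric absmax int4 quantization: map floats to integers in [-8, 7],
    then pack two 4-bit codes into each uint8 byte (halving element count)."""
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    flat = (q.flatten() + 8).astype(np.uint8)      # shift codes into [0, 15]
    packed = (flat[0::2] << 4) | flat[1::2]        # two codes per byte
    return packed, scale, w.shape

def dequantize_4bit(packed, scale, shape):
    """Unpack nibbles and rescale to float; in a quantized linear layer this
    happens on the fly inside forward(), not as a separate training phase."""
    hi = (packed >> 4).astype(np.int8) - 8
    lo = (packed & 0x0F).astype(np.int8) - 8
    flat = np.empty(hi.size + lo.size, dtype=np.int8)
    flat[0::2], flat[1::2] = hi, lo
    return flat.astype(np.float32).reshape(shape) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8)).astype(np.float32)   # 64 float32 values
packed, scale, shape = quantize_4bit(w)
w_hat = dequantize_4bit(packed, scale, shape)
# w.shape is (8, 8) but packed.shape is (32,): the stored tensor's
# shape and dtype both change, which is what you observed.
```

So the "quantization" step runs once when the model is loaded (the weight is packed and frozen), and every forward call dequantizes to the compute dtype before the matmul.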

RanchiZhao commented 1 year ago

Recently, I have finally adapted QLoRA for frameworks other than HuggingFace. The adaptation focuses primarily on the operators in Linear4bit and the series of quantized optimizers. Of course, I ran into quite a lot of dtype errors during this process.
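The overall pattern being adapted can be sketched as follows: a frozen quantized base weight plus trainable low-rank LoRA adapters, with explicit casts to one compute dtype in forward (mixing the storage dtype with the activation dtype is a typical source of the errors mentioned above). This is a hypothetical numpy sketch of the pattern, not the bitsandbytes or PEFT API; `QLoRALinear` and its fields are illustrative names:

```python
import numpy as np

class QLoRALinear:
    """Illustrative sketch: frozen absmax-int4-style quantized base weight
    (stored unpacked as int8 codes for brevity) plus trainable LoRA factors
    A and B, computed as y = dequant(Wq) x^T + (alpha/r) * B A x^T."""
    def __init__(self, w, r=4, alpha=8, compute_dtype=np.float32):
        self.scale = np.abs(w).max() / 7.0
        self.q = np.clip(np.round(w / self.scale), -8, 7).astype(np.int8)  # frozen
        out_f, in_f = w.shape
        self.A = np.zeros((r, in_f), dtype=compute_dtype)   # trainable
        self.B = np.zeros((out_f, r), dtype=compute_dtype)  # trainable, zero-init
        self.scaling = alpha / r
        self.compute_dtype = compute_dtype

    def forward(self, x):
        x = x.astype(self.compute_dtype)                      # cast input explicitly
        w_hat = self.q.astype(self.compute_dtype) * self.scale  # dequant on the fly
        return x @ w_hat.T + (x @ self.A.T) @ self.B.T * self.scaling

rng = np.random.default_rng(1)
w = rng.standard_normal((6, 5)).astype(np.float32)
x = rng.standard_normal((2, 5)).astype(np.float32)
layer = QLoRALinear(w)
y = layer.forward(x)   # with B zero-initialized, this equals the dequantized base
```

Since B starts at zero, training begins from the quantized base model's behavior, and only A and B (plus the optimizer state) need gradients; the packed base never does.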

rayrayraykk commented 1 year ago

could you share your ideas or repo? Many thanks!

RanchiZhao commented 1 year ago

> could you share your ideas or repo? Many thanks!

You could take a look at this; I hope it is helpful: https://github.com/RanchiZhao/bmtrain_qlora/issues/1#issuecomment-1702143549