Closed RanchiZhao closed 1 year ago
Recently, I finally adapted QLoRA for frameworks other than HuggingFace. The adaptation focuses mainly on the operators in Linear4bit and a series of quantized optimizers. Of course, quite a lot of dtype errors came up along the way.
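To illustrate the kind of thing those quantized optimizers do, here is a minimal pure-Python sketch of blockwise absmax quantization, the basic idea behind 8-bit optimizer states. This is only an illustration of the technique, not bitsandbytes' actual implementation (which uses dynamic quantization maps and CUDA kernels):

```python
# Illustrative sketch of blockwise absmax quantization (NOT the actual
# bitsandbytes code): each block of optimizer state is scaled by its own
# absolute maximum and stored as a signed 8-bit code.

def quantize_blockwise(values, block_size=4):
    """Quantize floats to int8 codes plus one absmax scale per block."""
    codes, scales = [], []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        absmax = max(abs(v) for v in block) or 1.0  # avoid divide-by-zero
        scales.append(absmax)
        codes.extend(round(v / absmax * 127) for v in block)
    return codes, scales

def dequantize_blockwise(codes, scales, block_size=4):
    """Recover approximate floats from int8 codes and per-block scales."""
    return [c / 127 * scales[i // block_size] for i, c in enumerate(codes)]

state = [0.5, -1.0, 0.25, 0.0, 3.0, -1.5, 0.75, 2.0]
codes, scales = quantize_blockwise(state)
recovered = dequantize_blockwise(codes, scales)
```

Because each block carries its own scale, a single outlier only degrades precision within its block, which is why the real optimizers quantize blockwise rather than per-tensor.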
Could you share your ideas or repo? Many thanks!
You could take a look at this; I hope it will be helpful: https://github.com/RanchiZhao/bmtrain_qlora/issues/1#issuecomment-1702143549
I'd like to know whether bitsandbytes can be decoupled from HuggingFace, or whether they have to be used together. In addition, is the int4 quantization completed during the get_accelerate_model phase and unrelated to the subsequent training with Trainer? And at what point does dequantization occur? I ask because I've noticed that int4 quantization alters the shape of the linear layer's weight.
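Regarding the shape change: a likely cause is that 4-bit quantization packs two 4-bit codes into each stored byte, so a weight tensor with n elements is held as a buffer of n/2 bytes and the logical shape lives in metadata until dequantization at matmul time. Here is a pure-Python sketch of that packing idea (illustrative only; not the actual bitsandbytes storage layout):

```python
# Sketch of why a 4-bit weight looks "reshaped": two 4-bit codes are packed
# per byte, so storage holds n/2 bytes for n weights. Illustrative only --
# not the actual bitsandbytes packing layout.

def pack_4bit(codes):
    """Pack a list of 4-bit codes (0..15) into bytes, two codes per byte."""
    assert len(codes) % 2 == 0
    packed = bytearray()
    for hi, lo in zip(codes[0::2], codes[1::2]):
        packed.append((hi << 4) | lo)
    return bytes(packed)

def unpack_4bit(packed):
    """Recover the original 4-bit codes from the packed byte buffer."""
    codes = []
    for b in packed:
        codes.append(b >> 4)
        codes.append(b & 0x0F)
    return codes

# A "linear layer" with logical shape (4, 6): 24 weights -> 12 packed bytes.
rows, cols = 4, 6
codes = [(i * 7) % 16 for i in range(rows * cols)]
packed = pack_4bit(codes)
assert len(packed) == rows * cols // 2   # storage is half the element count
assert unpack_4bit(packed) == codes      # round-trip recovers every code
```

If the shapes you inspect come from such a packed buffer, the tensor will look flattened and halved even though the layer still computes with the original (out_features, in_features) geometry after dequantization.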