-
I have trained a QLoRA model with Unsloth and I want to serve it with vLLM, but I have not found a way to serve the model in 8/4 bit. Is that possible?
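For context, one common workaround is to merge the LoRA adapter into the base model and then let vLLM quantize the merged weights with bitsandbytes. The sketch below assumes the Hugging Face PEFT stack and a reasonably recent vLLM build; the paths are placeholders, and whether bitsandbytes loading is available depends on the vLLM version:
```python
# Sketch: merge a QLoRA adapter, then serve the merged model 4-bit with vLLM.
# Paths are placeholders; bitsandbytes loading requires a recent vLLM release.
import torch
from peft import AutoPeftModelForCausalLM
from vllm import LLM, SamplingParams

# 1) Merge the LoRA adapter into the base weights (done once, offline).
model = AutoPeftModelForCausalLM.from_pretrained(
    "my-qlora-adapter",            # adapter directory produced by training
    torch_dtype=torch.float16,
)
merged = model.merge_and_unload()  # folds the LoRA deltas into the base weights
merged.save_pretrained("merged-model")

# 2) Serve the merged checkpoint, letting vLLM quantize it on load.
llm = LLM(
    model="merged-model",
    quantization="bitsandbytes",   # 4-bit on-the-fly quantization
    load_format="bitsandbytes",
)
print(llm.generate(["Hello"], SamplingParams(max_tokens=16)))
```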
-
Are there plans to integrate QLoRA into this tuner? Does it require structural changes to support it?
https://github.com/artidoro/qlora
It's already great as is, but the 4-bit quantized models are si…
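For reference, QLoRA support in the Hugging Face stack usually amounts to loading the base model in 4-bit via bitsandbytes and then attaching LoRA adapters with PEFT; the sketch below shows that pattern under those assumptions (the model name and LoRA hyperparameters are illustrative, not taken from this project):
```python
# Sketch of the usual QLoRA recipe with transformers + bitsandbytes + peft.
# Model name and LoRA hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4 from the QLoRA paper
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16, # matmuls run in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)  # casts norms, sets up grad checkpointing hooks

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights require gradients
```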
-
### Feature request
Add 4-bit quantization support once bitsandbytes releases it.
### Motivation
Run larger models easily and with good performance.
### Your contribution
I could make a PR if this is a reaso…
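If it helps scope the work, the bitsandbytes 4-bit path is largely a drop-in replacement for `nn.Linear`. A rough sketch of what such an integration tends to look like, assuming the `bitsandbytes` `Linear4bit` module (the layer-replacement logic and skipped module names here are hypothetical, not this project's code):
```python
# Rough sketch: swap nn.Linear layers for bitsandbytes 4-bit linears.
# Real integrations usually skip layers that must stay in higher precision
# (e.g. the output head); "lm_head" below is just an example name.
import torch
import torch.nn as nn
import bitsandbytes as bnb

def quantize_linears_to_4bit(model: nn.Module) -> nn.Module:
    for name, module in model.named_children():
        if isinstance(module, nn.Linear) and name != "lm_head":
            qlinear = bnb.nn.Linear4bit(
                module.in_features,
                module.out_features,
                bias=module.bias is not None,
                compute_dtype=torch.float16,   # matmuls run in fp16
                quant_type="nf4",              # 4-bit NormalFloat storage
            )
            # Copy the fp weights; actual 4-bit packing happens when the
            # module is moved to a CUDA device.
            qlinear.weight.data = module.weight.data
            if module.bias is not None:
                qlinear.bias.data = module.bias.data
            setattr(model, name, qlinear)
        else:
            quantize_linears_to_4bit(module)   # recurse into submodules
    return model
```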
-
https://github.com/artidoro/qlora
https://arxiv.org/abs/2305.14314
> We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a s…
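As a rough sanity check on that memory claim (my own arithmetic, not a figure from the paper), weight storage alone shrinks by about 4x when going from FP16 to 4-bit NF4:
```python
# Back-of-the-envelope weight-memory arithmetic for a 65B-parameter model.
params = 65e9
fp16_gb = params * 2   / 1e9   # 2 bytes per weight  -> ~130 GB
nf4_gb  = params * 0.5 / 1e9   # 4 bits per weight   -> ~32.5 GB
print(f"FP16 weights: ~{fp16_gb:.0f} GB, NF4 weights: ~{nf4_gb:.1f} GB")
```
Optimizer state, activations, and the LoRA parameters come on top of this, which is why the paper pairs 4-bit storage with paged optimizers.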
-
```
Traceback (most recent call last):
  File "./train_qlora.py", line 235, in <module>
    main()
  File "./train_qlora.py", line 224, in main
    train_result = trainer.train()
  File "/usr/local/lib/pytho…
```
-
During fine-tuning, it's possible that special tokens are added that are specific to the adapter. During decoding, we should use those special tokens and ensure the correct stop tokens, padding, e…
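One common way to handle this in the Hugging Face stack, sketched under the assumption that the adapter directory also contains the fine-tuned tokenizer (paths and the prompt format are placeholders):
```python
# Sketch: decode with the tokenizer saved alongside the adapter so that any
# added special tokens, EOS/stop tokens, and padding match training.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("my-adapter-dir")   # tokenizer saved with the adapter
base = AutoModelForCausalLM.from_pretrained("base-model")
base.resize_token_embeddings(len(tokenizer))                  # account for tokens added during fine-tuning
model = PeftModel.from_pretrained(base, "my-adapter-dir")

inputs = tokenizer("### Instruction: hello\n### Response:", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=64,
    eos_token_id=tokenizer.eos_token_id,   # stop on the fine-tuned EOS token
    pad_token_id=tokenizer.pad_token_id,   # avoid falling back to an unset pad token
)
print(tokenizer.decode(out[0], skip_special_tokens=False))
```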
-
Hi there,
I'm not sure whether this is the right place to ask.
Is CTranslate2 going to support QLoRA? Please see the following paper for more information:
https://arxiv.org/abs/2305.14314
Thanks.
-
Because of the following LLM-Leaderboard measurements, I want to perform QLoRA DPO without a preceding QLoRA SFT step:
```
alignment-handbook/zephyr-7b-dpo-qlora: +Average: 63.51; +ARC 63.65; +HSwag …
```
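For what it's worth, running DPO directly on a 4-bit base model (skipping the SFT stage) is mechanically the same QLoRA setup, just driven by TRL's `DPOTrainer`. A rough sketch assuming recent trl/peft versions; the model and dataset names are placeholders, and exact argument names have shifted across trl releases:
```python
# Rough sketch: DPO on a 4-bit quantized base model with a LoRA adapter,
# without a prior SFT stage. Names are placeholders; check your trl version
# for the exact DPOTrainer/DPOConfig argument names.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1",
                                             quantization_config=bnb, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                         task_type="CAUSAL_LM")

# Hypothetical preference dataset with "prompt"/"chosen"/"rejected" columns.
train_dataset = load_dataset("my-preference-dataset", split="train")

trainer = DPOTrainer(
    model=model,
    ref_model=None,                 # with a PEFT adapter, the frozen base serves as the reference
    args=DPOConfig(output_dir="dpo-qlora-out", beta=0.1,
                   per_device_train_batch_size=2, gradient_accumulation_steps=8),
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```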
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) and didn't find any similar reports…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
```
CUDA_VISIBLE_DEVICES=1 llamafactory-cli example/......
```
Below is the yaml file:
```yaml
# model
model_name_or_…
```