-
Trying to train on a 12 GB card: python qlora.py --model_name="chinese_alpaca" --model_name_or_path="./model_hub/chinese-alpaca-7b" --trust_remote_code=False --dataset="msra" --source_max_len=128 --target_max_len=64 --do_t…
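Rough arithmetic for why a 7B model in 4-bit can plausibly fit on a 12 GB card; this is a back-of-the-envelope sketch with assumed numbers (adapter size, optimizer layout), not measurements:
```python
# Back-of-the-envelope memory estimate for QLoRA on a 7B model.
# All figures below are assumptions for illustration, not measurements.
params = 7e9

# NF4 stores weights at roughly 0.5 byte per parameter.
weights_gb = params * 0.5 / 2**30                  # ~3.3 GB of frozen base weights

# LoRA adapters are tiny; assume ~40M trainable parameters kept in bf16,
# with fp32 gradients and Adam moments (m, v).
lora_params = 40e6
lora_gb = lora_params * (2 + 4 + 4 + 4) / 2**30    # ~0.5 GB

print(f"4-bit base weights      : {weights_gb:.1f} GB")
print(f"LoRA params + optimizer : {lora_gb:.1f} GB")
# The rest of the 12 GB budget goes to activations, which scale with
# batch size and --source_max_len/--target_max_len, hence the short
# sequence lengths in the command above.
```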
-
When I use the command below, I get an error:
```shell
python3 qlora.py --learning_rate 0.0001 --model_name_or_path
```
╭─────────────────────────────── Traceback (most recent call last) ─…
-
I anticipate there will be a lot of demand to train (and run inference with) the new open SOTA image model "Flux".
It's the top model on HF right now: a 12B diffusion transformer, which means it's too big t…
-
I have problems resuming from a checkpoint; a sketch of the usual resume mechanism follows the list. What I did:
1) `python qlora.py --model_name_or_path huggyllama/llama-7b`
2) abort when a checkpoint has been written
3) `python qlora.py --model_name_or_path…
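For reference, resuming through the Hugging Face Trainer that qlora.py builds on usually follows the pattern below. This is a minimal sketch: `train_with_resume` is a hypothetical helper, and whether qlora.py picks the checkpoint up automatically or needs an extra flag should be checked against the script itself.
```python
from transformers import Trainer
from transformers.trainer_utils import get_last_checkpoint

def train_with_resume(trainer: Trainer, output_dir: str) -> None:
    """Resume from the newest checkpoint-* folder in output_dir, if any."""
    last_ckpt = get_last_checkpoint(output_dir)   # e.g. "./output/checkpoint-500" or None
    if last_ckpt is not None:
        trainer.train(resume_from_checkpoint=last_ckpt)
    else:
        trainer.train()
```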
-
![image](https://github.com/user-attachments/assets/8b2886be-25da-4b0a-a926-c96f177dab5d)
After scoring with my script, the scores come out as zero. What could be causing this? My scoring code is below:
```python
import torch
from transformers import AutoModel, AutoTokenizer
mode…
```
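One common cause of all-zero scores is that the number is parsed out of the judge model's reply with a pattern that never matches, so a default of 0 is returned every time. The snippet below is a hypothetical illustration of that failure mode, not the poster's actual script; `extract_score` and the reply formats are assumptions.
```python
import re

def extract_score(reply: str) -> float:
    """Pull a numeric score out of a judge model's reply.

    Hypothetical helper: if the reply contains no bare number (for
    example the model answers in a full sentence without digits),
    the regex fails and the function silently falls back to 0.0,
    which makes every example appear to score zero.
    """
    match = re.search(r"\d+(?:\.\d+)?", reply)
    return float(match.group(0)) if match else 0.0

print(extract_score("Score: 8.5"))                      # 8.5
print(extract_score("The answer quality is very good"))  # 0.0, no digits, fallback kicks in
```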
-
Using llama2_7b_qlora_alpaca_enzh_e3.py as the template to QLoRA-finetune on gsm8k, I changed PROMPT_TEMPLATE.llama2_chat to PROMPT_TEMPLATE.llama3_chat and accuracy dropped from 62 to 28. What could be causing this?
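For context, the change in question is roughly the one-liner below in an xtuner config (a sketch based only on the names in the report above; treat the import path as an assumption to verify against your xtuner version). If the base model is still a Llama-2 checkpoint, the llama3_chat template wraps every turn in Llama-3 header tokens that the Llama-2 tokenizer only sees as plain text, so the prompt format no longer matches what the model was trained on, which by itself can explain a large accuracy drop.
```python
# Sketch of the relevant line in an xtuner config derived from
# llama2_7b_qlora_alpaca_enzh_e3.py (import path assumed, verify locally).
from xtuner.utils import PROMPT_TEMPLATE

# Template that matched the Llama-2 base model (reported acc ~62):
prompt_template = PROMPT_TEMPLATE.llama2_chat

# The changed line from the report above (reported acc ~28):
# prompt_template = PROMPT_TEMPLATE.llama3_chat
```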
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) and didn't find any similar reports.
…
-
1. Normal float + double quantization
QLoRA currently uses zero-shot quantization, which is different from GPTQ: it does not require calibration data, but it incurs some performance loss. Theref…
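As a concrete reference, the NF4 + double-quantization combination described above maps onto the following bitsandbytes/transformers configuration. This is a minimal sketch; the model name and compute dtype are placeholders.
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 ("normal float") 4-bit weights with double quantization of the
# quantization constants, as used by QLoRA. Unlike GPTQ, no calibration
# data is needed.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for the actual matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                  # placeholder model
    quantization_config=bnb_config,
    device_map="auto",
)
```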
-
QLoRA LLaMA 13B
```
File "/home/hysz/anaconda3/envs/qlora/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 69, in wrapper
return wrapped(*args, **kwargs)
File "/home/hysz/…
-
We have implemented 4-bit QLoRA. Thanks to an optimized kernel implementation of back-propagation, the fine-tuning speed is currently similar to 8-bit LoRA. You are welcome to try it and open issues: https://github.c…
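For readers who want a frame of reference, a typical 4-bit LoRA fine-tuning setup with the Hugging Face peft and bitsandbytes libraries looks like the sketch below. This is a generic illustration, not necessarily the linked project's API; the model name and LoRA hyperparameters are placeholders.
```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the base model with frozen 4-bit (NF4) weights.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                       # placeholder model
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# Attach LoRA adapters: only these small matrices are trained; gradients
# are back-propagated through the quantized base weights.
model = prepare_model_for_kbit_training(model)
model = get_peft_model(
    model,
    LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],     # typical LLaMA attention projections
        task_type="CAUSAL_LM",
    ),
)
model.print_trainable_parameters()
```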