-
### System Info
- `transformers` version: 4.40.0.dev0
- Platform: Linux-5.15.0-101-generic-x86_64-with-glibc2.17
- Python version: 3.8.2
- Huggingface_hub version: 0.20.2
- Safetensors version: 0…
-
Trying to train on a 12GB card: python qlora.py --model_name="chinese_alpaca" --model_name_or_path="./model_hub/chinese-alpaca-7b" --trust_remote_code=False --dataset="msra" --source_max_len=128 --target_max_len=64 --do_t…
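As a back-of-envelope sanity check for the 12GB constraint, the 4-bit quantized base weights of a 7B model alone take roughly 3.3 GiB, which is why QLoRA can fit on such a card (this is a rough estimate ignoring activations, the LoRA optimizer state, and CUDA overhead; the helper name is hypothetical):

```python
def qlora_weight_memory_gb(n_params_billion, bits=4):
    """Rough GiB needed for the quantized base weights alone.

    Excludes activations, gradients/optimizer state for the LoRA
    adapters, and CUDA context overhead, so treat it as a lower bound.
    """
    bytes_total = n_params_billion * 1e9 * bits / 8
    return bytes_total / 1024**3

# 7B parameters at 4 bits per weight:
print(round(qlora_weight_memory_gb(7), 2))  # ~3.26 GiB
```

The remaining headroom on a 12GB card is consumed by activations (hence the small `--source_max_len`/`--target_max_len` values) and the LoRA adapter training state.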
-
After QLoRA fine-tuning, the model produces the correct result but then continues to output unrelated content, for example:
![image](https://github.com/yangjianxin1/Firefly/assets/59114904/e3b50b77-165b-4757-b4eb-0a6349ec1f12)
I am using sentence segmentation, but after the sentence ends the model still outputs irrelevant content. At first I thought the training set was too small, so I increased the training-set size from 2…
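Two common mitigations for this over-generation symptom are (a) appending the tokenizer's EOS token to every training target so the model learns where to stop, and (b) truncating the decoded output at the first stop marker at inference time. A minimal post-processing sketch of (b), with hypothetical stop strings:

```python
def truncate_at_stop(text, stop_strings=("</s>", "\n\n")):
    """Cut generated text at the first occurrence of any stop marker.

    stop_strings is a placeholder; use whatever EOS/separator your
    tokenizer and prompt template actually emit.
    """
    cut = len(text)
    for s in stop_strings:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

print(truncate_at_stop("正确结果</s>无关内容"))  # keeps only the part before EOS
```

Fixing the training targets (option a) is the more principled solution; truncation only hides the symptom.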
-
I notice that there are some differences compared to the `artido/qlora` repo. Why was the following code left out of this repo?
```py
def find_all_linear_names(args, model):
    cls = bnb.nn.Lin…
```
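The snippet is cut off, but helpers of this shape conventionally walk `model.named_modules()` and collect the leaf names of every (quantized) linear layer so they can be used as LoRA target modules. A standalone sketch of that idea, using hypothetical stand-in classes so it runs without torch or bitsandbytes:

```python
def find_all_linear_names(named_modules, linear_cls, skip=("lm_head",)):
    """Collect leaf names of all modules that are instances of linear_cls.

    named_modules: iterable of (dotted_name, module) pairs, in the shape
    torch's model.named_modules() returns. linear_cls would typically be
    bnb.nn.Linear4bit for 4-bit training.
    """
    lora_module_names = set()
    for name, module in named_modules:
        if isinstance(module, linear_cls):
            parts = name.split(".")
            # keep only the leaf name, e.g. "q_proj" from "layers.0.q_proj"
            lora_module_names.add(parts[0] if len(parts) == 1 else parts[-1])
    # the output head is usually excluded from LoRA targets
    return sorted(n for n in lora_module_names if n not in skip)


class FakeLinear:      # stand-in for bnb.nn.Linear4bit / torch.nn.Linear
    pass

class FakeEmbedding:   # non-linear module, should be ignored
    pass

modules = [
    ("model.layers.0.self_attn.q_proj", FakeLinear()),
    ("model.layers.0.self_attn.v_proj", FakeLinear()),
    ("model.embed_tokens", FakeEmbedding()),
    ("lm_head", FakeLinear()),
]
print(find_all_linear_names(modules, FakeLinear))  # -> ['q_proj', 'v_proj']
```

Repos that hard-code `target_modules` in their LoRA config (e.g. a fixed list of attention projections) don't need this discovery helper, which may be why it was dropped.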
-
Currently, Unsloth only supports single-GPU training; how can it be extended to 8-GPU training? Thanks.
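For context on what multi-GPU data parallelism would add: each of the 8 GPUs holds a model replica, processes its own micro-batch, and the per-replica gradients are averaged with an all-reduce before every optimizer step. A conceptual sketch of that averaging (plain Python, not an Unsloth or DDP API):

```python
def allreduce_mean(per_gpu_grads):
    """Average gradients element-wise across replicas, as a data-parallel
    all-reduce does each step. per_gpu_grads is a list (one entry per GPU)
    of equally shaped flat gradient lists."""
    n = len(per_gpu_grads)
    return [sum(g) / n for g in zip(*per_gpu_grads)]

# two replicas, two parameters each:
print(allreduce_mean([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```

In practice this is what `torch.nn.parallel.DistributedDataParallel` does under the hood; whether Unsloth's custom kernels compose with it is exactly the open question in this issue.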
-
In the given axolotl examples [examples/medusa](https://github.com/ctlllll/axolotl/tree/main/examples/medusa),
I followed `vicuna_7b_qlora_stage1.yml` and `vicuna_7b_qlora_stage2.yml` to write my …
-
#### The inference code in `inference.ipynb` takes 3 minutes to run on a Colab L4 GPU. Is there any way to speed up inference?
@swastikmaiti
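One generic throughput lever, independent of the notebook's specifics, is generating for several prompts per forward pass instead of one at a time. A small helper for chunking a prompt list into batches (hypothetical names; batch size must fit the L4's memory):

```python
def batched(items, batch_size):
    """Yield fixed-size chunks of a prompt list for batched generation."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

prompts = [f"question {i}" for i in range(10)]
print([len(b) for b in batched(prompts, 4)])  # [4, 4, 2]
```

Each chunk would then be tokenized with padding and passed to `model.generate` together; fp16/bf16 weights and capping `max_new_tokens` are the other usual low-effort wins.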
-
I fine-tuned FLAN-T5 XL and FLAN-T5 XXL with QLoRA and ran into a problem: the learning rate and loss are both logged as 0.0. Can anyone help resolve this? Thanks.
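Two separate effects can produce these symptoms. A logged learning rate of 0.0 at the very first steps is often just a warmup schedule reporting its pre-warmup value, while a loss of exactly 0.0 (or NaN) with T5-family models is a known fp16 overflow symptom, usually fixed by training in bf16 or fp32. A minimal sketch of linear warmup showing the first effect (hypothetical helper, not from any repo):

```python
def linear_warmup_lr(step, base_lr=3e-4, warmup_steps=100):
    """Linearly ramp the learning rate from 0 to base_lr over warmup_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr

print(linear_warmup_lr(0))    # 0.0 -- shows up as "lr: 0.0" in the log
print(linear_warmup_lr(100))  # 0.0003
```

If the learning rate stays at 0.0 well past warmup, the scheduler configuration itself is suspect; if only the loss is stuck at 0.0, check the compute dtype first.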
-
I added 4-bit loading to the LoRA-with-ZeRO-3 training command on two or more GPUs, aiming for a mix of QLoRA and ZeRO-3. But the program encountered the following error:
RuntimeError: expected ther…
-
To enable efficient training on GPUs and scale our repository for models with millions to billions of parameters—essential for working with large visual language models—we must implement optimization …
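One such optimization technique is gradient accumulation, which simulates a large effective batch on memory-limited GPUs by averaging gradients over several micro-batches before each optimizer step. A toy simulation of the bookkeeping (scalar "gradients" stand in for tensors):

```python
def train_with_accumulation(micro_batch_grads, accum_steps):
    """Simulate gradient accumulation: scale each micro-batch gradient by
    1/accum_steps, accumulate, and emit one optimizer update per
    accum_steps micro-batches. Returns the applied updates."""
    updates = []
    buf = 0.0
    for i, g in enumerate(micro_batch_grads, start=1):
        buf += g / accum_steps          # loss scaling happens here
        if i % accum_steps == 0:
            updates.append(buf)         # optimizer.step() equivalent
            buf = 0.0                   # optimizer.zero_grad() equivalent
    return updates

# four micro-batches, stepping every 2 -> two averaged updates
print(train_with_accumulation([1.0, 2.0, 3.0, 4.0], 2))  # [1.5, 3.5]
```

Mixed precision, gradient checkpointing, and ZeRO-style sharding are the other standard entries on this list; all trade a little compute or communication for a large reduction in per-GPU memory.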