-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
###…
-
The following error is reported during training:
```
(venv) [xinjingjing@dev-gpu-node-09 InstructGLM]$ python train_lora.py \
> --dataset_path data/belle \
> --lora_rank 8 \
> --per_device_train_batch_size 2 \
> --…
```
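For reference, a minimal sketch of how a launcher like train_lora.py could consume these flags; the parser below is an assumption for illustration, not the repository's actual code:

```
# Hypothetical sketch: an argparse front-end matching the flags in the
# command above. The real train_lora.py in InstructGLM may differ.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset_path", type=str, default="data/belle",
                    help="Directory containing the training data")
parser.add_argument("--lora_rank", type=int, default=8,
                    help="Rank r of the LoRA update matrices")
parser.add_argument("--per_device_train_batch_size", type=int, default=2,
                    help="Micro-batch size on each GPU")
args = parser.parse_args()
print(args)
```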
-
### 🐛 Describe the bug
The error happens in booster.backward(loss, optimizer); I used the GeminiPlugin.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 0 (pid: 2…
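For context, a minimal sketch of a Gemini setup that reaches booster.backward, assuming the standard ColossalAI Booster/GeminiPlugin API; the model, optimizer, and data below are stand-ins, not the failing script:

```
# Minimal sketch of the Gemini path that reaches booster.backward, assuming
# the standard ColossalAI booster API; model and data are stand-ins.
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin
from colossalai.nn.optimizer import HybridAdam

colossalai.launch_from_torch()        # reads rank/world size from torchrun env
plugin = GeminiPlugin()               # heterogeneous (CPU/GPU) parameter placement
booster = Booster(plugin=plugin)

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = HybridAdam(model.parameters(), lr=1e-4)
model, optimizer, *_ = booster.boost(model, optimizer)

loss = model(torch.randn(2, 1024).cuda()).mean()
booster.backward(loss, optimizer)     # the call where the crash above occurs
optimizer.step()
```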
-
Thanks for creating these scripts! I'm working with some tooling that reads the metadata included with the embedding files and uses it to create charts and graphs for the embeddings. See https://g…
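As a rough illustration of that kind of consumer, a sketch that loads an embedding matrix with a side-car metadata file and charts a 2-D projection; the file names and metadata schema here are assumptions, not the actual format of the linked scripts:

```
# Hypothetical sketch: load an embedding matrix and a side-car JSON metadata
# file, then chart a 2-D projection. File names and schema are assumptions.
import json
import numpy as np
import matplotlib.pyplot as plt

embeddings = np.load("embeddings.npy")      # shape: (n_items, dim)
with open("embeddings.meta.json") as f:
    meta = json.load(f)                     # e.g. {"labels": [...], "model": "..."}

# Crude 2-D view via the first two principal components.
centered = embeddings - embeddings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
xy = centered @ vt[:2].T

plt.scatter(xy[:, 0], xy[:, 1], s=4)
plt.title(f"Embeddings from {meta.get('model', 'unknown model')}")
plt.show()
```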
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
- `llamafactory` version: 0.9.1.dev0
- Platform: Linux-5.4.119-1-tlinux4-0010.3-x86_64-with-glibc2.38
-…
-
### Question
Thank you for your work. I used 8x V100 32GB GPUs, 94 CPUs, and 364GB of memory.
```
#!/bin/bash
################## VICUNA ##################
PROMPT_VERSION=v1
MODEL_VERSION="vicuna-v1-3-7b"
#####…
```
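One quick sanity check for a launch like this is the effective batch size; a small sketch of the arithmetic with placeholder values (the script above is truncated, so these numbers are assumptions):

```
# Effective batch size = per-device batch x number of GPUs x grad accumulation.
# The values below are placeholders; the actual script above is truncated.
per_device_train_batch_size = 16
num_gpus = 8                      # 8x V100 32GB, per the question
gradient_accumulation_steps = 1

effective_batch = per_device_train_batch_size * num_gpus * gradient_accumulation_steps
print(f"effective batch size: {effective_batch}")   # 128 with these placeholders
```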
-
I trained llama-2-7b-chat with LoRA, using the data from the author's data directory and no code changes. The launch script is finetune_lora.sh, and the deepspeed launch command is as follows:
```
deepspeed --include localhost:0 finetune_clm_lora.py \
--model_name_or_path /home/zengshaokun/models/llam…
```
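For reference, a minimal sketch of the LoRA wiring such a script usually does through PEFT; the model path is a placeholder (the command above is truncated) and the hyperparameters are common LLaMA-style defaults, not necessarily what finetune_clm_lora.py uses:

```
# Minimal sketch of attaching LoRA adapters with PEFT; the model path is a
# placeholder, and r/alpha/target_modules are common LLaMA-style defaults.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("path/to/llama-2-7b-chat")
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # LoRA rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only adapter weights are trainable
```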
-
I can only set burst_size=4.
If burst_size is set to 8 or 16, it runs out of memory, right?
Is that because the model is too big?
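Whether burst_size=8 fits is essentially a memory-budget question: the weights are a fixed cost, while the per-burst activation/cache cost grows roughly linearly. A back-of-the-envelope sketch, where every number is an assumption since the model and GPU aren't stated:

```
# Back-of-the-envelope GPU memory estimate: weights are fixed, while the
# per-burst activation/cache cost scales roughly linearly with burst_size.
# All numbers below are assumptions; plug in your own model and GPU.
weights_gb = 14.0            # e.g. a 7B model in fp16
per_item_gb = 2.0            # assumed activation/cache cost per burst item
gpu_gb = 24.0                # assumed GPU capacity

for burst_size in (4, 8, 16):
    need = weights_gb + burst_size * per_item_gb
    print(f"burst_size={burst_size:>2}: ~{need:.0f} GB "
          f"({'fits' if need <= gpu_gb else 'OOM'} on a {gpu_gb:.0f} GB GPU)")
```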
-
Dear author @yunkchen
Thanks for your awesome work!
I tried to run the LoRA training on my data, but the speed is very slow: ~40 s/it (see the timing sketch after the details below).
Training details:
512x512 model
2 GPUs - batch s…
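To localize whether the ~40 s/it comes from data loading or compute, a simple per-phase timing sketch; the dataloader and step function are stand-ins for the real training loop:

```
# Simple per-phase timing to localize a slow iteration: data loading vs.
# forward/backward. The dataloader and step_fn are stand-ins for the real ones.
import time
import torch

def timed_steps(dataloader, step_fn, n=10):
    it = iter(dataloader)
    for i in range(n):
        t0 = time.perf_counter()
        batch = next(it)                  # data-loading time
        t1 = time.perf_counter()
        step_fn(batch)                    # forward + backward + optimizer step
        torch.cuda.synchronize()          # flush queued CUDA work before timing
        t2 = time.perf_counter()
        print(f"step {i}: data {t1 - t0:.2f}s, compute {t2 - t1:.2f}s")
```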
-
```
Traceback (most recent call last)
/home/mdisk2/tanjunwen/gitprj/mChatGLM_mutli_gpu_tuning/finetune_ptuning.py:34 in …
```