-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
###…
-
The following error is reported during training:
```
(venv) [xinjingjing@dev-gpu-node-09 InstructGLM]$ python train_lora.py \
> --dataset_path data/belle \
> --lora_rank 8 \
> --per_device_train_batch_size 2 \
> --…
```
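For reference, a minimal sketch of how a launcher like train_lora.py could consume these flags; the parser below is an assumption for illustration, not the repository's actual code:

```
# Hypothetical sketch: an argparse front-end matching the flags in the
# command above. The real train_lora.py in InstructGLM may differ.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset_path", type=str, default="data/belle",
                    help="Directory containing the training data")
parser.add_argument("--lora_rank", type=int, default=8,
                    help="Rank r of the LoRA update matrices")
parser.add_argument("--per_device_train_batch_size", type=int, default=2,
                    help="Micro-batch size on each GPU")
args = parser.parse_args()
print(args)
```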
-
### 🐛 Describe the bug
The error happens in booster.backward(loss, optimizer); I used the GeminiPlugin.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 0 (pid: 2…
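For context, a minimal sketch of a Gemini setup that reaches booster.backward, assuming the standard ColossalAI Booster/GeminiPlugin API; the model, optimizer, and data below are stand-ins, not the failing script:

```
# Minimal sketch of the Gemini path that reaches booster.backward, assuming
# the standard ColossalAI booster API; model and data are stand-ins.
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin
from colossalai.nn.optimizer import HybridAdam

colossalai.launch_from_torch()        # reads rank/world size from torchrun env
plugin = GeminiPlugin()               # heterogeneous (CPU/GPU) parameter placement
booster = Booster(plugin=plugin)

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = HybridAdam(model.parameters(), lr=1e-4)
model, optimizer, *_ = booster.boost(model, optimizer)

loss = model(torch.randn(2, 1024).cuda()).mean()
booster.backward(loss, optimizer)     # the call where the crash above occurs
optimizer.step()
```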
-
Thanks for creating these scripts! I'm working with some tooling that reads the metadata included with the embedding files and uses it to create charts and graphs for the embeddings. See https://g…
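As a rough illustration of that kind of consumer, a sketch that loads an embedding matrix with a side-car metadata file and charts a 2-D projection; the file names and metadata schema here are assumptions, not the actual format of the linked scripts:

```
# Hypothetical sketch: load an embedding matrix and a side-car JSON metadata
# file, then chart a 2-D projection. File names and schema are assumptions.
import json
import numpy as np
import matplotlib.pyplot as plt

embeddings = np.load("embeddings.npy")      # shape: (n_items, dim)
with open("embeddings.meta.json") as f:
    meta = json.load(f)                     # e.g. {"labels": [...], "model": "..."}

# Crude 2-D view via the first two principal components.
centered = embeddings - embeddings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
xy = centered @ vt[:2].T

plt.scatter(xy[:, 0], xy[:, 1], s=4)
plt.title(f"Embeddings from {meta.get('model', 'unknown model')}")
plt.show()
```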
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
- `llamafactory` version: 0.9.1.dev0
- Platform: Linux-5.4.119-1-tlinux4-0010.3-x86_64-with-glibc2.38
-…
-
### Question
Thank you for your work. I used 8x V100 32GB GPUs, 94 CPUs, and 364GB of memory.
```
#!/bin/bash
################## VICUNA ##################
PROMPT_VERSION=v1
MODEL_VERSION="vicuna-v1-3-7b"
#####…
```
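One quick sanity check for a launch like this is the effective batch size; a small sketch of the arithmetic with placeholder values (the script above is truncated, so these numbers are assumptions):

```
# Effective batch size = per-device batch x number of GPUs x grad accumulation.
# The values below are placeholders; the actual script above is truncated.
per_device_train_batch_size = 16
num_gpus = 8                      # 8x V100 32GB, per the question
gradient_accumulation_steps = 1

effective_batch = per_device_train_batch_size * num_gpus * gradient_accumulation_steps
print(f"effective batch size: {effective_batch}")   # 128 with these placeholders
```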
-
I trained llama-2-7b-chat with LoRA, using the data from the author's data directory and no code changes. The launch script is finetune_lora.sh, and the deepspeed launch command is as follows:
```
deepspeed --include localhost:0 finetune_clm_lora.py \
--model_name_or_path /home/zengshaokun/models/llam…
```
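For reference, a minimal sketch of the LoRA wiring such a script usually does through PEFT; the model path is a placeholder (the command above is truncated) and the hyperparameters are common LLaMA-style defaults, not necessarily what finetune_clm_lora.py uses:

```
# Minimal sketch of attaching LoRA adapters with PEFT; the model path is a
# placeholder, and r/alpha/target_modules are common LLaMA-style defaults.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("path/to/llama-2-7b-chat")
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # LoRA rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only adapter weights are trainable
```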
-
I can only set burst_size=4.
If burst_size is set to 8 or 16, it runs out of memory, right?
Is that because the model is too big?
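Whether burst_size=8 fits is essentially a memory-budget question: the weights are a fixed cost, while the per-burst activation/cache cost grows roughly linearly. A back-of-the-envelope sketch, where every number is an assumption since the model and GPU aren't stated:

```
# Back-of-the-envelope GPU memory estimate: weights are fixed, while the
# per-burst activation/cache cost scales roughly linearly with burst_size.
# All numbers below are assumptions; plug in your own model and GPU.
weights_gb = 14.0            # e.g. a 7B model in fp16
per_item_gb = 2.0            # assumed activation/cache cost per burst item
gpu_gb = 24.0                # assumed GPU capacity

for burst_size in (4, 8, 16):
    need = weights_gb + burst_size * per_item_gb
    print(f"burst_size={burst_size:>2}: ~{need:.0f} GB "
          f"({'fits' if need <= gpu_gb else 'OOM'} on a {gpu_gb:.0f} GB GPU)")
```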
-
Dear author @yunkchen
Thanks for your awesome work!
I tried to run the LoRA training on my data, but the speed is very slow: ~40 s/it (see the timing sketch after the details below).
Training details:
512x512 model
2 GPUs - batch s…
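To localize whether the ~40 s/it comes from data loading or compute, a simple per-phase timing sketch; the dataloader and step function are stand-ins for the real training loop:

```
# Simple per-phase timing to localize a slow iteration: data loading vs.
# forward/backward. The dataloader and step_fn are stand-ins for the real ones.
import time
import torch

def timed_steps(dataloader, step_fn, n=10):
    it = iter(dataloader)
    for i in range(n):
        t0 = time.perf_counter()
        batch = next(it)                  # data-loading time
        t1 = time.perf_counter()
        step_fn(batch)                    # forward + backward + optimizer step
        torch.cuda.synchronize()          # flush queued CUDA work before timing
        t2 = time.perf_counter()
        print(f"step {i}: data {t1 - t0:.2f}s, compute {t2 - t1:.2f}s")
```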
-
```
Traceback (most recent call last)
/home/mdisk2/tanjunwen/gitprj/mChatGLM_mutli_gpu_tuning/finetune_ptuning.py:34 in …
```