-
```
Traceback (most recent call last):
  File "C:\ai\stable-diffusion-webui\Kohya\kohya_ss\train_network.py", line 539, in
    train(args)
  File "C:\ai\stable-diffusion-webui\Kohya\kohya_ss\train_ne…
```
-
**Describe the bug**
Error when running multi-node LoRA fine-tuning:
failed (exitcode: -11) local_rank: 5 (pid: 11514) of binary: /home/jovyan/data-ws-enr/zconda/envs/swift_ft/bin/python
Traceback (most recent call last):
File…
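A negative exit code from the elastic launcher is the number of the signal that killed the worker, so exitcode -11 here corresponds to SIGSEGV. A quick, generic way to decode such codes (illustrative Python, not part of the report):

```python
import signal

exitcode = -11                    # as reported by the launcher
sig = signal.Signals(-exitcode)   # map the negative exit code back to a signal
print(sig.name)                   # SIGSEGV on Linux
```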
-
Hi, can I check if this is a typo in the training script?
```
if ema_state_dict is not None:
    checkpoint_path = f"{checkpoint_dir}/{int(train_steps/args.gradient_accumulation_steps):07d}_ema"
…
-
**Describe the bug**
DeepSpeed ZeRO-3 gets an error in dist.get_rank() on multiple nodes with multiple GPUs.
It is perfectly fine when set to stage 2.
transformers: v.4.36.0
accelerate: v.0.26.0
dee…
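Not from the report, but a minimal sketch of the usual guard around dist.get_rank(), assuming the failure is that the call happens before the process group is initialized on some ranks (the truncated trace does not confirm this):

```python
import torch.distributed as dist

def safe_rank() -> int:
    # dist.get_rank() raises unless init_process_group() has completed;
    # fall back to rank 0 when the process group is not (yet) initialized.
    if dist.is_available() and dist.is_initialized():
        return dist.get_rank()
    return 0
```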
-
### 🐛 Describe the bug
I used FSDP + ShardedGradScaler to train my model. Compared with apex.amp + DDP, the precision of my model has decreased.
The DDP version is like:
```
model, optimizer = amp.initial…
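# For comparison, a typical FSDP + ShardedGradScaler step looks roughly like the
# sketch below. This is a generic illustration, not the reporter's code; `model`,
# `optimizer`, and `inputs` are assumed to already exist.
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.sharded_grad_scaler import ShardedGradScaler

model = FSDP(model)                                 # shard parameters across ranks
scaler = ShardedGradScaler()                        # FSDP-aware replacement for GradScaler
with torch.autocast("cuda", dtype=torch.float16):   # mixed-precision forward
    loss = model(inputs).mean()
scaler.scale(loss).backward()                       # scale the loss before backward
scaler.step(optimizer)                              # unscale grads, then optimizer step
scaler.update()                                     # adjust the loss-scale factor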
-
Re the notebook: ✉️ MarketMail AI ✉️ Fine tuning BLOOMZ (Completed Version).ipynb
https://colab.research.google.com/drive/1ARmlaZZaKyAg6HTi57psFLPeh0hDRcPX?usp=sharing
I tried to modify the exa…
-
@huminghao16
Could you include the scripts for evaluating a pretrained model?
(For example, evaluating the large model included in the README.)
I am running this command:
```
export …
-
The following are my parameters:
```
LR=6e-6
DATE=0704
EPOCH=2
MAX_LEN=1024
MASTER_PORT=8888
deepspeed --num_gpus=8 --master_port $MASTER_PORT main.py \
    --deepspeed deepspeed.json \
    --do_train \
    --do_eval \
    …
```
-
Forgive me if the answer is obvious, but I am using this PyTorch implementation with my own data and am confused about what a few lines of code in train_i3d.py are doing.
The optimizer …