-
Chat
Completion
Composition
Configs
Models
Downloads
Train
Settings
About
LoRA Fine-tuning
WSL
```
8 768 blocks.3.ffn.key.lora_A
3072 8 blocks.3.ffn.key.lora_B
768 768 blocks.3.ffn.receptance.weight
8 768 blocks.3.ffn.recep…
```
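For reference, a quick sanity check of the LoRA shapes in the log above (a minimal sketch, assuming rank r = 8 and the usual RWKV FFN dimensions): the low-rank pair reconstructs a full-size update while only storing two small matrices.

```python
import torch

r, n_embd, dim_ffn = 8, 768, 3072       # rank and dimensions taken from the log above
lora_A = torch.zeros(r, n_embd)         # 8 x 768,  as in blocks.3.ffn.key.lora_A
lora_B = torch.zeros(dim_ffn, r)        # 3072 x 8, as in blocks.3.ffn.key.lora_B
delta_W = lora_B @ lora_A               # 3072 x 768 update, matching the full ffn.key weight shape
print(delta_W.shape)                    # torch.Size([3072, 768])
```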
-
### 🐛 Describe the bug
Creating a TransformerEncoder causes a memory overflow, but the same config works with the Hugging Face `transformers` module.
```python
# config.py
from colossalai.amp import…
-
## 🚀 Description
Pipeline parallelism is a technique used in deep learning model training to improve efficiency and reduce the training time of large neural networks. Here we propose a pipeline paral…
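To make the idea concrete, here is a minimal single-process sketch of pipeline parallelism (the stages, layer sizes, and micro-batch count are made up for illustration): the model is split into sequential stages and each mini-batch is divided into micro-batches, so that in a real multi-device pipeline different stages can work on different micro-batches at the same time.

```python
import torch
import torch.nn as nn

# Two pipeline stages; in a real setup each would live on its own device/rank.
stage0 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
stage1 = nn.Sequential(nn.Linear(64, 10))

def pipelined_forward(x, n_micro_batches=4):
    micro_batches = x.chunk(n_micro_batches)
    # Stage 0 produces activations micro-batch by micro-batch; in an actual pipeline,
    # stage 1 starts consuming the first activation while stage 0 is still busy.
    activations = [stage0(mb) for mb in micro_batches]
    outputs = [stage1(a) for a in activations]
    return torch.cat(outputs)

print(pipelined_forward(torch.randn(16, 32)).shape)  # torch.Size([16, 10])
```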
-
Some people (even seasoned developers) get thrown off by our documentation, so let's try to fix that!
- [ ] Add descriptions of what each function does, aside from linking to the man page
- [ ] Add an examp…
-
### Description & Motivation
When training different model sizes on a different number of devices or different hardware, the batch size needs to be carefully tuned in order to achieve maximum GPU u…
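One common way to automate that tuning is a simple power-of-two search: keep doubling the batch size until a forward/backward pass runs out of GPU memory, then keep the last size that fit. A rough sketch of that heuristic (the `model` and `make_batch` callables and the starting size are assumptions, not part of this proposal):

```python
import torch

def find_max_batch_size(model, make_batch, start=2, max_trials=10):
    """Double the batch size until an OOM is hit, then return the last size that worked."""
    batch_size, best = start, start
    for _ in range(max_trials):
        try:
            x, y = make_batch(batch_size)                        # hypothetical data helper
            loss = torch.nn.functional.cross_entropy(model(x), y)
            loss.backward()
            model.zero_grad(set_to_none=True)
            best = batch_size
            batch_size *= 2
        except RuntimeError as err:                              # CUDA OOM surfaces as RuntimeError
            if "out of memory" in str(err):
                torch.cuda.empty_cache()
                break
            raise
    return best
```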
-
When running inference with CPU + fp32, I hit the following error:
> File "D:\gitpro\RWKV-LM-LoRA\RWKV-v4neo\src\model_run.py", line 67, in __init__
> w[k] += w[lora_B] @ w[lora_A] * (args.lora_alpha / args.lora_r)
> RuntimeError: "…
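The error text is cut off above, so the exact cause is unclear; a frequent pitfall with CPU inference, though, is merging fp16 LoRA matrices under fp32 compute. Below is a minimal sketch of that merge step with explicit float32 casts (an assumption about the failure mode, not a confirmed diagnosis; the names mirror the snippet in the traceback):

```python
import torch

def merge_lora(w, k, lora_A_key, lora_B_key, lora_alpha, lora_r):
    """Fold a LoRA update into the base weight w[k], casting to float32 for the CPU matmul."""
    scale = lora_alpha / lora_r
    delta = (w[lora_B_key].float() @ w[lora_A_key].float()) * scale
    w[k] = w[k].float() + delta
    return w
```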
-
After running the toy example, I ran it again to resume training, and I'm getting an error only when PP > 1.
Here's the config:
```yaml
checkpoints:
checkpoint_interval: 25
checkpoints_path: …
-
I changed the batch size to this:
```python
# batch_size = 128
batch_size = 6
micro_batch_size = 2
gradient_accumulation_steps = batch_size // micro_batch_size  # 6 // 2 = 3 accumulation steps per optimizer step
max_iters = 50000 * 3 // micro_batch_size
…
-
```python
checkpoint = (j < checkpoint_stop)
if checkpoint:
    chk = Checkpointing(partition, batch)
    task = Task(streams[i], compute=chk.chec…
```
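For context, a minimal sketch of what that `checkpoint` flag controls, using `torch.utils.checkpoint` rather than the library's own `Checkpointing`/`Task` classes: when enabled, the partition's activations are not kept during the forward pass and are recomputed during backward, trading compute for memory.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

stage = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
batch = torch.randn(4, 128, requires_grad=True)

use_checkpointing = True  # analogous to `checkpoint = (j < checkpoint_stop)` above
if use_checkpointing:
    out = checkpoint(stage, batch, use_reentrant=False)  # activations recomputed during backward
else:
    out = stage(batch)
out.sum().backward()
```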
-
# Introduction
As machine learning models continue to grow in size (e.g., OpenAI GPT-2 with 1.5B parameters, OpenAI GPT-3 with 175B parameters), traditional [Distributed DataParallel](https://pytorch…
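As a point of reference, "traditional" DDP keeps a full replica of the model on every rank and only synchronizes gradients, which is exactly what stops working once the model itself no longer fits on one device. A minimal sketch of that replication (process-group setup and launch details are simplified here):

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def run(rank: int, world_size: int):
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    model = nn.Linear(1024, 1024).to(rank)        # every rank holds a full copy of the model
    ddp_model = DDP(model, device_ids=[rank])
    out = ddp_model(torch.randn(8, 1024, device=rank))
    out.sum().backward()                          # gradients are all-reduced across ranks
    dist.destroy_process_group()
```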