-
Hi, I'm training on the huge bread-midi-dataset.
Enumerating the dataloader from a DatasetJSON throws a KeyError in collators.py
line 164: length_of_first = batch[0].size(0)
when the batch is emp…
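Not from the dataset's actual code, but a minimal sketch of the kind of guard that would make this failure mode explicit (the `collate` name, the return value, and the error message are hypothetical):

```python
def collate(batch):
    # Hypothetical guard: fail loudly if the DataLoader hands over an empty batch,
    # instead of letting the indexing below raise further down the stack.
    if len(batch) == 0:
        raise ValueError("collate received an empty batch")
    length_of_first = batch[0].size(0)  # the line cited above (collators.py line 164)
    return batch, length_of_first
```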
-
## 🐛 Bug
When I write my own (single-core; haven't tested multi-core yet!) loop for PyTorch model training with gradient accumulation on TPUs, I get an OOM error when running with gradien…
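For context, a minimal single-core sketch of what such a gradient-accumulation loop on an XLA/TPU device can look like (not the poster's code; the function name, loss, and `accum_steps` value are illustrative assumptions):

```python
import torch
import torch_xla.core.xla_model as xm

def train_epoch(model, loader, optimizer, accum_steps=4):
    """Single-core TPU training loop with gradient accumulation (illustrative sketch)."""
    device = xm.xla_device()
    model.to(device)
    model.train()
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        inputs, targets = inputs.to(device), targets.to(device)
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        # Scale so the accumulated gradient matches the large-batch average.
        (loss / accum_steps).backward()
        if (step + 1) % accum_steps == 0:
            xm.optimizer_step(optimizer)  # apply the update on the XLA device
            optimizer.zero_grad()
        # Cut the lazy XLA graph every iteration so it does not keep growing
        # across the accumulation window and exhaust memory.
        xm.mark_step()
```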
-
### System Info
I'm using Python 3.10 and the latest versions of all libraries as of 04.10.2024,
and I'm trying to run: trl sft --model_name_or_path meta-llama/Llama-3.2-3B --dataset_name Vikhrmodels/GrandMaster-PRO…
-
I got `SystemError: returned NULL without setting an error` when setting **accumulate_grad_batches = 2**, but I see nothing helpful in the log.
The error goes away when changing `DDPStrategy(static_graph=F…
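For reference, a minimal Trainer configuration matching the settings described above (a sketch only; the truncated `static_graph=F…` value is assumed to be `False`, and the accelerator/device counts are placeholders):

```python
from lightning.pytorch import Trainer
from lightning.pytorch.strategies import DDPStrategy

# Hypothetical reproduction of the reported configuration.
trainer = Trainer(
    accelerator="gpu",
    devices=2,
    strategy=DDPStrategy(static_graph=False),  # assumed value; the original is cut off
    accumulate_grad_batches=2,                 # the setting that triggered the SystemError
)
```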
-
Hello guys,
I wanted to try your version of Phi-3.5-mini-instruct with the DPO Trainer from Hugging Face.
But when I run the training I get *NaN or Inf found in input tensor.*
Same code wor…
-
Support for gradient accumulation with a lower batch size to accommodate large images on a single 16 GB GPU?
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
### Exp…
-
https://huggingface.co/docs/transformers/v4.38.2/perf_train_gpu_one#gradient-accumulation
In the `TrainingArguments` passed to `SFTTrainer`, we can likely reduce the total GPU memory required to tr…
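A minimal sketch of that configuration (illustrative values; the model, dataset, and output directory are assumptions, and depending on the trl version `SFTConfig`, a `TrainingArguments` subclass, may be expected instead):

```python
from transformers import TrainingArguments
from trl import SFTTrainer

# Trade a smaller per-device batch for more accumulation steps so the effective
# per-device batch size (4 * 8 = 32 here) stays the same while peak memory drops.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,  # optional further memory saving
)
trainer = SFTTrainer(
    model=model,                  # model and train_dataset assumed defined elsewhere
    args=args,
    train_dataset=train_dataset,
)
```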
-
### Description & Motivation
There is an easy way to do gradient accumulation in Lightning, but as I understand it, batch norm is problematic since its statistics are calculated on every forward pass.
We should fix…
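To make the mismatch concrete, a tiny self-contained demonstration (illustrative only, not part of the original request): with accumulation, BatchNorm sees each micro-batch separately, so its statistics differ from those of one large batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 3)

bn_full = nn.BatchNorm1d(3)
bn_accum = nn.BatchNorm1d(3)

# One large batch: normalization statistics computed over all 8 samples at once.
bn_full(x)

# Gradient accumulation: the same 8 samples, but split into two micro-batches of 4,
# so statistics are computed (and running stats updated) per micro-batch.
bn_accum(x[:4])
bn_accum(x[4:])

# The running means differ, showing why accumulation is not equivalent to a
# larger batch once BatchNorm is involved.
print(bn_full.running_mean, bn_accum.running_mean)
```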
-
**Describe the bug**
I reviewed the initialization of self.gradient_accumulation_steps in the DeepSpeedConfig module when only train_batch and micro_batch are set (deepspeed Version: 0.13.1):
```p…