-
The following code
```
#include <cstddef>
#include "clad/Differentiator/Differentiator.h"

double f(double const* const x, std::size_t const n)
{
    double acc{};
    for (std::size_t i = 0; i < n; ++i)
        acc += x[i] * x[i];
    return acc;
}

int main()
{
    auto g = clad:…
```
-
**Describe the feature and the current behavior/state:**
Gradient accumulation is extremely useful when working with large images/volumetric data, using low-end hardware, or training on multiple GP…
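For reference, the standard accumulation loop in plain PyTorch looks like the sketch below; the model, data, and `accum_steps` value are placeholders, not part of this request. The key points are scaling each micro-batch loss by `1/accum_steps` and stepping the optimizer only every `accum_steps` iterations, which gives the memory profile of a small batch with the gradient of a large one.
```
import torch

model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

# Placeholder data: 16 micro-batches of 8 samples each.
loader = [(torch.randn(8, 128), torch.randint(0, 10, (8,))) for _ in range(16)]
accum_steps = 4  # effective batch size = 8 * 4 = 32

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(loader):
    # Scale so the accumulated gradient matches one full-batch gradient.
    loss = loss_fn(model(inputs), targets) / accum_steps
    loss.backward()  # gradients sum into .grad across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```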
-
Hello,
I have an issue with multi-GPU performance.
- When I run the recipe `lora_finetune_single_device` with the config `mini_lora_single_device.yaml` on a 6000 Ada, I get ~5 it/s
- When I run the recipe `lo…
-
### Describe the bug
When I run the script train_dreambooth_lora_flux.py, it raises `ValueError: unexpected save model: `. Is this a bug in save_model_hook?
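For reference, below is a stripped-down, hypothetical version of the hook pattern these scripts register with accelerate; `DummyTransformer` is a stand-in, not the script's actual type. The reported error comes from a branch like the `else` below, which fires when a tracked model fails the expected `isinstance` check.
```
import torch

class DummyTransformer(torch.nn.Module):
    """Stand-in for the model type the script expects to save."""

def save_model_hook(models, weights, output_dir):
    for model in models:
        if isinstance(model, DummyTransformer):
            pass  # the real script saves this model's LoRA layers here
        else:
            # A tracked model matched none of the expected types:
            raise ValueError(f"unexpected save model: {model.__class__}")
        weights.pop()  # keep accelerate from re-saving the full weights

# An untracked module type reaching the hook reproduces the error:
save_model_hook([torch.nn.Linear(2, 2)], [object()], "out")
```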
### Reproducti…
-
Evaluations are being run, _but no validation loss is logged or sent to WandB_.
The console shows that eval is running, but displays a table along the lines of:
| eval loss | validation loss |
|…
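If the number shows up in the console but never reaches WandB, one thing worth checking is whether the eval path actually calls `wandb.log` with that key; `wandb.log` and its `step` argument are the standard API, while the key name and variables below are placeholders.
```
import wandb

run = wandb.init(project="my-project")  # placeholder project name

val_loss = 0.42    # placeholder: the loss computed by the eval loop
global_step = 100  # placeholder: the current training step

# Log explicitly, keyed by the global step so the point lines up
# with the training curves in the WandB UI.
wandb.log({"validation_loss": val_loss}, step=global_step)
```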
-
I'm bringing my own PyTorch training script, and I'm interested in using SM Debugger to profile function calls in my training jobs. The [API Glossary](https://github.com/awslabs/sagemaker-debugger/blo…
-
1. `gradient_accumulation_steps` configuration is not documented at all - it's only mentioned in the context of pipeline
2. There are no instructions on how to integrate it with the existing trainer …
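Regarding point 2, integration with an existing loop usually follows Accelerate's documented pattern, sketched below; `Accelerator(gradient_accumulation_steps=...)` and `accelerator.accumulate(model)` are real Accelerate APIs, while the model and data are placeholders.
```
import torch
from accelerate import Accelerator

accelerator = Accelerator(gradient_accumulation_steps=4)

model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()
loader = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(8)]

model, optimizer = accelerator.prepare(model, optimizer)

for inputs, targets in loader:
    # accumulate() tracks the accumulation window; optimizer.step() and
    # zero_grad() only take effect on the window's last micro-batch.
    with accelerator.accumulate(model):
        loss = loss_fn(model(inputs), targets)
        accelerator.backward(loss)  # also rescales the loss by 1/steps
        optimizer.step()
        optimizer.zero_grad()
```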
-
# accelerate_config with num_processes == 3
```
compute_environment: LOCAL_MACHINE
debug: true
deepspeed_config:
  gradient_accumulation_steps: 2
  gradient_clipping: 1.0
  offload_optimizer_devi…
```
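For reference, with `num_processes == 3` and `gradient_accumulation_steps: 2`, each optimizer step consumes 3 × 2 = 6 micro-batches, so the effective batch size is 6 times the per-device micro-batch size. Such a file is typically passed as `accelerate launch --config_file accelerate_config.yaml train.py`, where the file and script names here are placeholders.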
-
Thank you very much for this work; it is very helpful for those of us with limited training resources. I am a newcomer to the field of NLP and am not very familiar with training frame…
-
Hi, based on the following lines, it seems gradient accumulation is not properly implemented:
https://github.com/mahmoodlab/HIPT/blob/a9b5bb8d159684fc4c2c497d68950ab915caeb7e/2-Weakly-Supervised-Su…
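One way to pin down what "properly implemented" means: with each micro-batch loss scaled by `1/k`, the gradients accumulated over `k` backward passes should match a single full-batch backward pass. Below is a minimal self-contained check in plain PyTorch, independent of HIPT's code.
```
import torch

def test_grad_accumulation_matches_full_batch():
    torch.manual_seed(0)
    model_a = torch.nn.Linear(4, 1)
    model_b = torch.nn.Linear(4, 1)
    model_b.load_state_dict(model_a.state_dict())

    x = torch.randn(8, 4)
    y = torch.randn(8, 1)
    loss_fn = torch.nn.MSELoss()

    # One full-batch backward pass.
    loss_fn(model_a(x), y).backward()

    # Two accumulated micro-batches, each loss scaled by 1/2.
    for xb, yb in zip(x.chunk(2), y.chunk(2)):
        (loss_fn(model_b(xb), yb) / 2).backward()

    for pa, pb in zip(model_a.parameters(), model_b.parameters()):
        assert torch.allclose(pa.grad, pb.grad, atol=1e-6)

test_grad_accumulation_matches_full_batch()
```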