-
Hi,
I hit a bug in save_file when using DeepSpeed.
How should I fix it?
```
override steps. steps for 1 epochs is / number of steps up to the specified epoch: 18165
[2023-07-09 14:35:23,072] [INFO] [logging.py:96:log_…
```
-
### Describe the bug
When I was training with the DreamBooth LoRA SDXL script on the dag dataset, it produced the following error:
ValueError: Attempting to unscale FP16 gradients.
### Reproduction
export MO…
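For context (not part of the original report): this ValueError is raised by torch.cuda.amp.GradScaler when the gradients it is asked to unscale are stored in fp16. A minimal sketch of a commonly suggested workaround, assuming the trainable LoRA parameters can be kept in fp32 while the frozen base weights stay in fp16 (the helper name `upcast_trainable_params` is hypothetical):

```python
import torch

def upcast_trainable_params(model: torch.nn.Module) -> None:
    # Keep only the trainable (e.g. LoRA) parameters in fp32 so their
    # gradients are fp32 as well; GradScaler.unscale_ rejects fp16 gradients.
    for param in model.parameters():
        if param.requires_grad and param.dtype == torch.float16:
            param.data = param.data.to(torch.float32)
```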
-
### Checklist
- [ ] The issue exists after disabling all extensions
- [X] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a …
-
@BarakKatzir has been developing the [`types-scipy-sparse`](https://github.com/BarakKatzir/types-scipy-sparse) stub package for `scipy.sparse`. A large portion appears to be more complete than the `scipy…
-
I've been using DeepSpeed successfully with my large model training jobs. But [this](https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/) blog…
-
We often need to move model results to the CPU (or inputs to the GPU). Once the data structures get a bit complicated, dicts and lists appear frequently in model results. We often have to roll a little utilit…
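A minimal sketch of the kind of utility meant here, assuming plain PyTorch and nested containers of dicts, lists, and tensors (the name `move_to_device` is only illustrative):

```python
import torch

def move_to_device(obj, device):
    # Recursively move tensors inside (possibly nested) dicts, lists and
    # tuples to the target device; anything else is returned unchanged.
    if torch.is_tensor(obj):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(v, device) for v in obj)
    return obj

# Example: outputs_on_cpu = move_to_device(model_outputs, "cpu")
```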
-
Starting a new topic since there is a concrete example:
You'll need to install coffea from this branch: `https://github.com/CoffeaTeam/coffea/tree/awkward2_dev` (pip install -e '.[dev]')
You'll ne…
-
## ❓ Questions and Help
### Before asking:
1. search the issues.
2. search the docs.
#### What is your question?
I want to fine-tune the BART summarization model.
The machine spec I'…
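Not from the original question, but as a minimal sketch of what fine-tuning a BART summarization model can look like, assuming the Hugging Face `facebook/bart-large-cnn` checkpoint is an acceptable starting point (the original post does not state the framework or checkpoint in use):

```python
from transformers import BartForConditionalGeneration, BartTokenizer

model_name = "facebook/bart-large-cnn"  # hypothetical choice of checkpoint
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Tokenize one (document, summary) pair and take a single seq2seq training step.
doc = "A long article to be summarized ..."
summary = "A short reference summary."
inputs = tokenizer(doc, max_length=1024, truncation=True, return_tensors="pt")
labels = tokenizer(text_target=summary, max_length=128, truncation=True, return_tensors="pt")
loss = model(**inputs, labels=labels["input_ids"]).loss
loss.backward()
```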
-
### Bug description
I was able to fine-tune an 8B LLM using the Hugging Face training framework with PEFT + DeepSpeed stage 2 under fp16 precision (mixed precision training). Recently I wanted to change…
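A minimal sketch of the setup described above, assuming the Hugging Face Trainer integration where DeepSpeed ZeRO stage 2 and fp16 are enabled through a config dict (the exact values are assumptions, not taken from the report):

```python
from transformers import TrainingArguments

# DeepSpeed ZeRO stage 2 with fp16 mixed precision; "auto" lets the Hugging Face
# integration fill in values that must match the TrainingArguments.
ds_config = {
    "zero_optimization": {"stage": 2},
    "fp16": {"enabled": True},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

args = TrainingArguments(
    output_dir="out",
    fp16=True,            # must agree with the fp16 setting in ds_config
    deepspeed=ds_config,  # accepts a dict or a path to a DeepSpeed JSON file
)
```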
-
The attachments contain details of a warning that I encountered while working on a dataset. Kindly review it and, if it is a bug, fix it.
WARNING:tensorflow:Entity could not be transformed and wi…
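Not part of the original report, but for this class of AutoGraph warning a common first diagnostic step is to raise AutoGraph's verbosity so the underlying conversion failure is printed; a minimal sketch:

```python
import tensorflow as tf

# Print detailed AutoGraph conversion logs (level 10) to stdout so the cause
# of the "Entity ... could not be transformed" warning becomes visible.
tf.autograph.set_verbosity(10, alsologtostdout=True)
```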