-
## 🐛 Bug
As the original paper (https://arxiv.org/pdf/1711.05101.pdf, green boxes) shows, the formula for applying weight decay to Adam should be
`\theta_t = (1 - \lambda) * \theta_{t - 1}` …
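For context, here is a minimal sketch (not the paper's or PyTorch's code; the helper name and plain-tensor interface are illustrative) of the decoupled update from the paper's Algorithm 2, where the decay shrinks the weights directly instead of being folded into the gradient:

```python
import torch

def adamw_step(param, grad, exp_avg, exp_avg_sq, step,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=1e-2):
    """One decoupled-weight-decay (AdamW) step in the style of Loshchilov & Hutter.

    The decay term is applied to the parameter itself, multiplicatively,
    independently of the Adam moment estimates.
    """
    # Decoupled decay: theta <- theta * (1 - lr * weight_decay).
    param.mul_(1 - lr * weight_decay)
    # Standard Adam moment updates on the undecayed gradient.
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
    exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
    bias_c1 = 1 - beta1 ** step
    bias_c2 = 1 - beta2 ** step
    denom = (exp_avg_sq / bias_c2).sqrt_().add_(eps)
    # theta <- theta - lr * m_hat / (sqrt(v_hat) + eps).
    param.addcdiv_(exp_avg / bias_c1, denom, value=-lr)
    return param
```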
-
When I set the `max_steps` property in `TrainingArguments` to a number N, I see in the training logs that it iterates until 2*N, which was not the case when doing `trainer.train()`. I will look further if th…
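A self-contained repro sketch under an assumed minimal setup (the toy model and dataset below are placeholders, not the reporter's code):

```python
import torch
from torch.utils.data import Dataset
from transformers import Trainer, TrainingArguments

class ToyDataset(Dataset):
    """Tiny synthetic dataset so the script runs standalone."""
    def __len__(self):
        return 1024
    def __getitem__(self, idx):
        x = torch.randn(8)
        return {"x": x, "labels": x.sum().unsqueeze(0)}

class ToyModel(torch.nn.Module):
    """Minimal regression model returning a dict with a loss."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 1)
    def forward(self, x, labels=None):
        out = self.linear(x)
        loss = torch.nn.functional.mse_loss(out, labels)
        return {"loss": loss, "logits": out}

args = TrainingArguments(
    output_dir="out",
    max_steps=100,      # expectation: training stops at global step 100
    logging_steps=10,
    report_to=[],
)
trainer = Trainer(model=ToyModel(), args=args, train_dataset=ToyDataset())
trainer.train()         # the report says the logs run to 2 * max_steps instead
```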
-
### System Info
- `transformers` version: 4.42.3
- Platform: Linux-5.15.0-107-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.…
-
Multipart support was never added to the stub: https://github.com/softwaremill/tapir/blob/abeb5d72e4e7a16c4da3830a59eb58862dfda69b/server/sttp-stub-server/src/main/scala/sttp/tapir/server/stub/SttpReq…
-
omegaconf.errors.ConfigAttributeError: Missing key AdamW
    full_key: AdamW
    object_type=dict
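For context, a minimal sketch (not from the issue; the config contents are illustrative) of how OmegaConf raises this error when the node being accessed lacks the key, plus one way to guard against it:

```python
from omegaconf import OmegaConf

# Illustrative config: the optimizer section defines "Adam" but not "AdamW".
cfg = OmegaConf.create({"optimizer": {"Adam": {"lr": 1e-3}}})

# Attribute-style access to a missing key raises ConfigAttributeError
# ("Missing key AdamW", with full_key reflecting the access path).
try:
    _ = cfg.optimizer.AdamW
except Exception as e:
    print(type(e).__name__, e)

# Safer: use .get() with a fallback instead of bare attribute access.
opt_cfg = cfg.optimizer.get("AdamW", cfg.optimizer.Adam)
print(OmegaConf.to_yaml(opt_cfg))
```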
-
### 🚀 The feature, motivation and pitch
I would like to benefit from the speed advantages of fused AdamW while doing CPU-only training, but this is not supported. It currently throws an error indicat…
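For reference, a sketch (not from the request) of the underlying PyTorch flag involved: `torch.optim.AdamW(..., fused=True)` is the fused kernel, and on builds where the fused path is CUDA-only it raises an error for CPU tensors rather than falling back to the unfused implementation; exact behavior depends on the PyTorch version:

```python
import torch

model = torch.nn.Linear(16, 16)  # CPU model, no CUDA involved

# fused=True requests the fused AdamW kernel; whether CPU tensors are
# accepted depends on the PyTorch build, so both outcomes are handled.
try:
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3, fused=True)
    loss = model(torch.randn(4, 16)).sum()
    loss.backward()
    opt.step()
    print("fused AdamW ran on CPU")
except RuntimeError as e:
    print("fused AdamW rejected on CPU:", e)
```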
-
We used our own data to fine-tune the pretrained DTU model; our data uses the same format as DTU.
However, we get an error during training:
-- Process 0 terminated with the following error:
Traceback (mos…
-
`RuntimeError: params, grads, exp_avgs, and exp_avg_sqs must have same dtype, device, and layout`
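This message comes from PyTorch's multi-tensor (foreach/fused) AdamW path, which requires parameters, gradients, and both moment buffers to share dtype, device, and layout. Below is a small diagnostic sketch (the helper name is illustrative, not from the issue) that walks the optimizer state and reports any mismatch before stepping; a typical cause is casting or moving the model after the optimizer state was created or loaded:

```python
import torch

def find_state_mismatches(opt: torch.optim.Optimizer):
    """Report params whose AdamW state buffers disagree with them in
    dtype or device."""
    mismatches = []
    for group in opt.param_groups:
        for p in group["params"]:
            state = opt.state.get(p, {})
            for name in ("exp_avg", "exp_avg_sq"):
                buf = state.get(name)
                if buf is not None and (buf.dtype != p.dtype or buf.device != p.device):
                    mismatches.append((name, p.dtype, p.device, buf.dtype, buf.device))
    return mismatches

# Example: state built in fp32, model later cast to fp16 -> mismatch.
model = torch.nn.Linear(8, 8)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
model(torch.randn(2, 8)).sum().backward()
opt.step()          # creates fp32 exp_avg / exp_avg_sq buffers
model.half()        # params become fp16, state buffers stay fp32
print(find_state_mismatches(opt))
```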
-
## Issue
Encountered a deadlock while running a JAX-based LLM training script on a TPU-v4-32 pod. I SSH'd into worker 0 and ran the script there directly, instead of using `--worker all --command "..."`…
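That launch pattern alone can explain the hang: on a multi-host pod every worker must run the same program, because cross-host collectives block until all hosts join. A minimal sketch (not the reporter's script) of the multi-host handshake that would wait forever if launched on only one worker:

```python
import jax

# On a TPU pod slice, jax.distributed.initialize() discovers the pod
# topology; the collectives below block until all hosts have joined,
# so running this on worker 0 alone deadlocks.
jax.distributed.initialize()

print(f"process {jax.process_index()} of {jax.process_count()}, "
      f"{jax.local_device_count()} local / {jax.device_count()} global devices")

# A simple cross-host reduction: hangs unless every worker executes it.
x = jax.numpy.ones((jax.local_device_count(),))
total = jax.pmap(lambda v: jax.lax.psum(v, "i"), axis_name="i")(x)
print(total)
```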
-
I tried to pass `--optim`, but nothing happens. How can I use optimizers such as adamw_8bit or adafactor in LISA?
They are not in `custom_optimizers` either.
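For comparison, a sketch of how these optimizers are selected through plain `transformers` (whether LISA's training loop honors this argument is exactly what is in question here; the identifiers below are the standard `transformers` optim names, and `adamw_bnb_8bit` additionally requires `bitsandbytes`):

```python
from transformers import TrainingArguments

# Standard HF route: the optim field picks the optimizer implementation.
args_adafactor = TrainingArguments(output_dir="out", optim="adafactor")
args_adamw8bit = TrainingArguments(output_dir="out", optim="adamw_bnb_8bit")
```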