-
### Describe the issue
Unexpectedly, training with the SGD optimizer is slower than training with the AdamW optimizer. By profiling with Nsight Systems, I found that the SGD optimizer copies appr…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Feature Description
The work I propose involves:
Implementing key optimizers such as Stochastic Gradient De…
-
I've been working on MRI segmentation tasks using nnU-Net, and I've noticed that the standard configurations often utilize SGD as the optimizer. While I understand that the choice of optimizer and n…
-
I guess the optimizer choice should be either LARS or LAMB, depending on the encoder type (convolutional or transformer). Is that right?
-
**File:** train.py
**Function:** train_one_epoch(...)
#### Description:
Assuming the reduction method is set to `'mean'` (_default_) for the loss function and `drop_last` is set to `False` (_default_) for `DataL…
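The description above is cut off, so the following is only a sketch of one effect that commonly arises under these two defaults, not necessarily the exact problem reported in `train_one_epoch(...)`. With `'mean'` reduction and a smaller final batch, averaging the per-batch losses over the number of batches differs from the true per-sample mean. The loss values and the `batches` helper are hypothetical, and plain Python lists stand in for a `DataLoader`.

```python
def batches(items, batch_size):
    """Yield consecutive chunks; the last one may be smaller (like drop_last=False)."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical per-sample losses: 10 samples, so batch_size=4 leaves a final batch of 2.
per_sample_losses = [1.0] * 8 + [5.0] * 2
batch_size = 4

naive_total, n_batches = 0.0, 0
weighted_total, n_samples = 0.0, 0
for batch in batches(per_sample_losses, batch_size):
    batch_mean = sum(batch) / len(batch)       # what a 'mean'-reduced loss returns
    naive_total += batch_mean                  # every batch counted equally
    n_batches += 1
    weighted_total += batch_mean * len(batch)  # weight each batch by its size
    n_samples += len(batch)

naive_epoch_loss = naive_total / n_batches        # over-weights the small last batch
weighted_epoch_loss = weighted_total / n_samples  # equals the true per-sample mean

print(f"mean of batch means:  {naive_epoch_loss:.4f}")
print(f"sample-weighted mean: {weighted_epoch_loss:.4f}")
```

Weighting each batch loss by its batch size before dividing by the total sample count recovers the per-sample mean regardless of how the last batch falls.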
-
-
I've looked through `schedule_1x.py` for learning rate schedule.
It looks very similar to one of MMEngine's default suggestions (https://mmengine.readthedocs.io/en/latest/tutorials/param_scheduler.htm…
-
Hi,
I am trying to run the time series prediction SGD code, but I got the error "optimizer got an empty parameter list". I googled it and found that we need to register some parameters. Do you have an updated…
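For anyone hitting the same message, here is a minimal sketch of what "registering parameters" usually means in PyTorch. It is not the code from this repository: the `Unregistered` and `Registered` classes are hypothetical and only illustrate the general cause. Tensors and layers stored as plain attributes or in Python lists are not returned by `model.parameters()`, so the optimizer receives nothing.

```python
import torch
from torch import nn


class Unregistered(nn.Module):
    """Hypothetical module whose weights are invisible to .parameters()."""

    def __init__(self):
        super().__init__()
        # A plain tensor is not an nn.Parameter, so it is not registered.
        self.weight = torch.randn(4, 1, requires_grad=True)
        # Sub-modules inside a plain Python list are not registered either.
        self.layers = [nn.Linear(4, 4)]


class Registered(nn.Module):
    """Same layout, but using nn.Parameter and nn.ModuleList."""

    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(4, 1))
        self.layers = nn.ModuleList([nn.Linear(4, 4)])


try:
    torch.optim.SGD(Unregistered().parameters(), lr=0.01)
    error_message = None
except ValueError as exc:
    error_message = str(exc)  # the "empty parameter list" error

model = Registered()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
num_registered = len(list(model.parameters()))  # weight + Linear weight + Linear bias

print(error_message)
print(num_registered)
```

Printing `list(model.named_parameters())` before building the optimizer is a quick way to check whether anything was registered.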
-
## Describe the bug
Currently, FedAvg does not perform well even in homogeneous settings for models with batch norm layers such as resnet18. The accuracy-over-epochs curve is highly erratic and does not reac…
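For context, a minimal sketch of uniform FedAvg over `state_dict`s, showing where batch norm state enters the average. It does not diagnose this bug and is not this project's implementation: the `fedavg` and `make_client` helpers, the tiny model, and the equal client weights are all assumptions made for illustration. The point is that `running_mean`, `running_var`, and `num_batches_tracked` are buffers rather than parameters, so whether and how they are averaged is a separate decision from averaging the weights.

```python
import copy

import torch
from torch import nn


def fedavg(client_states):
    """Uniformly average a list of state_dicts (equal client weights assumed)."""
    averaged = copy.deepcopy(client_states[0])
    for key in averaged:
        # Cast to float so integer buffers (num_batches_tracked) can be averaged.
        stacked = torch.stack([state[key].float() for state in client_states])
        averaged[key] = stacked.mean(dim=0).to(client_states[0][key].dtype)
    return averaged


def make_client():
    # Hypothetical tiny model standing in for something like resnet18.
    return nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())


clients = [make_client() for _ in range(3)]
for client in clients:
    client.train()
    client(torch.randn(4, 3, 16, 16))  # one forward pass updates the BN running stats

global_model = make_client()
global_model.load_state_dict(fedavg([c.state_dict() for c in clients]))

# Buffers are part of state_dict() but not of parameters().
buffer_names = [name for name, _ in global_model.named_buffers()]
print(buffer_names)
```

Filtering the keys by `named_buffers()` makes it easy to try variants that keep batch norm statistics local to each client instead of averaging them.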
-
Platforms: linux
This test was disabled because it is failing in CI. See [recent examples](https://hud.pytorch.org/flakytest?name=test_grad_scaling_autocast_fused_optimizers_SGD_cuda_float32&suite=Te…