sail-sg Adan issues - Githubissues

sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Apache License 2.0

744 stars 63 forks

issues

How to set the Adan learning rate

#48 theFoxofSky closed 2 months ago
3
Install Error

#47 xv994 closed 5 months ago
4
RuntimeError: The detected CUDA version (12.2) mismatches the version that was used to compile PyTorch (11.8).

#46 trungpx closed 5 months ago
2
Fix CUDAExtension bug in setup.py

#45 AlexwellChen closed 6 months ago
0
About the pre-trained model

#44 casiatao closed 2 months ago
1
Settings for instruction-tuning

#43 KaiLv69 opened 8 months ago
2
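In my CNN model, with lr=0.01 the mAP improves quickly during epochs 20-30 but later becomes NaN. With lr=0.001 it does not become NaN, but the results are poor. What problem does this behavior indicate? Thanks!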

#42 liiicon closed 2 months ago
4
How to implement Adan optimizer in Yolov7?

#41 karan16mehta closed 9 months ago
1
Concrete weight decay configuration for GPT-2 pretraining

#40 DesperateExplorer closed 2 months ago
1
Adan stays ahead of SGD for the first 74 epochs, but convergence slows down afterwards. How should I adjust lr and other hyperparameters?

#39 liiicon closed 9 months ago
2
Handle empty parameter list

#38 janEbert closed 1 year ago
0
Restarting strategy

#37 janEbert closed 1 year ago
4
Deepspeed Integration

#36 pUmpKin-Co closed 1 year ago
4
Gradient clipping option in DeepSpeed

#35 DesperateExplorer closed 1 year ago
1
module 'fused_adan' has no attribute 'adan_multi_tensor'

#34 76586 closed 1 year ago
1
processing data for BERT experiment

#33 kenoharada closed 2 months ago
4
GPU type and GPU nums and total training time on Transformer-XL, GPT-2

#32 kenoharada closed 1 year ago
2
Update unfused install

#31 AlexwellChen closed 1 year ago
0
Allow unconditional CUDA build

#30 janEbert closed 1 year ago
0
Some questions about learning rate.

#29 stella-von closed 1 year ago
7
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

#28 MichaelMonashev closed 1 year ago
12
[Feature] Fused kernel for Adan optimizer

#27 AlexwellChen closed 1 year ago
0
HumanEval shall not be used for training.

#23 aviczhl2 closed 1 year ago
1
Add `setup.py`

#22 janEbert closed 1 year ago
3
Suggestions for applying to visual dense prediction tasks.

#21 pUmpKin-Co closed 1 year ago
6
NaN appears when training a YOLOv5 model

#20 xialuxi closed 1 year ago
6
ValueError: not enough values to unpack (expected 3, got 2)

#19 myseverus closed 1 year ago
2
Can it be applied to training GANs?

#18 SHNsunhenan opened 1 year ago
2
Some questions in step function

#17 RookieXwc closed 1 year ago
3
About the convergence trend comparison with AdamW in ViT-H

#16 haihai-00 opened 1 year ago
3
Typo in the paper

#15 Tomarchelone closed 1 year ago
1
Why is there no SGD-style implementation?

#14 brisker closed 1 year ago
8
Embedding tensors/weight update unsupported

#13 DenisVorotyntsev closed 1 year ago
5
`no_prox` Flag

#12 Zach-ER closed 1 year ago
5
Add closure function in step for compatibility

#11 CookieLau closed 1 year ago
1
add multi_tensor

#10 bonlime closed 1 year ago
3
The BERT finetuning get_data file error?

#9 NoahDDavis closed 1 year ago
4
\epsilon not implemented as in the paper

#8 Zach-ER closed 1 year ago
1
block: [0,0,0], thread: [96,0,0] Assertion `input_val >= zero && input_val <= one` failed.

#7 lucasjinreal closed 2 years ago
3
Is there a TensorFlow/Keras implementation?

#6 cmsflash closed 1 year ago
7
remove redundant update calculation, and unused import

#5 lessw2020 closed 2 years ago
2
Beta values are not same

#4 JaheimLee closed 2 years ago
1
Step 2 of Usage

#3 richinex closed 2 years ago
1
fix minor typo

#2 xk-huang closed 2 years ago
0
`torch._foreach...` implementation

#1 bonlime closed 1 year ago
2