gradient-centralization Search Results

58 results
for gradient-centralization


jettify/pytorch-optimizer #244

Wrong paper references for Ranger optimizer variants

The README lists [Calibrating the Adaptive Learning Rate to Improve Convergence of ADAM](https://arxiv.org/abs/1908.00700v2) by Tong, Liang, and Bi (2019) as the source paper accompanying the `Ranger`…

jwuphysics updated 3 years ago
1
varunranga/zorb-numpy #1

questions

Hi, I'm not an expert but I have a few questions: While ZORB is impressive performance-wise, how much can the accuracy gap vs Adam vary? Extensive testing is needed. Can we port transformers suc…

LifeIsStrange updated 3 years ago
1
lessw2020/Ranger-Deep-Learning-Optimizer #32

Gradient centralization was updated

https://github.com/Yonghongwei/Gradient-Centralization/commit/d46e4c54ae47b730d0805694849f106c41828e97

hadaev8 updated 4 years ago
4
juntang-zhuang/Adabelief-Optimizer #24

Epsilon is important to Adaptive Optimizer

Hi~ https://github.com/juntang-zhuang/Adabelief-Optimizer/issues/18#issue-729329117 Since I asked you question last time, I've done a series of experiments. I think both methods of determining the …

yuanwei2019 updated 3 years ago
1
official-stockfish/nnue-pytorch #87

Training on a system with no GPU

Hello, Thank you for creating a nice project for nnue training in pytorch! I am trying to use your project to create a network for Igel. I wanted to ask you if it is possible to have trainer in …

vshcherbyna updated 3 years ago
6
LiyuanLucasLiu/RAdam #54

RAdam Instability vs AdamW / Adam

Late to the party, but once again good work to you all @LiyuanLucasLiu ! So I was testing RAdam vs AdamW on simple linear models [ie Logistic Regression / Linear Regression]. Obviously for these sm…

danielhanchen updated 4 years ago
8
mgrankin/over9000 #20

Update benchmarks now that Ranger supports gradient central…

https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer

LifeIsStrange updated 4 years ago
1
Yonghongwei/Gradient-Centralization #5

Should I use PyTorch gradient clipping with gradient centra…

hadaev8 updated 4 years ago
2
official-stockfish/nnue-pytorch #36

RuntimeError: Pinned memory requires CUDA

Hello, Thanks for very interesting project and contributing to NNUE training. I am trying to use the trainer for Igel and when running the test command: ``` python train.py total_3m_d14.bin …

vshcherbyna updated 3 years ago
10
tensorflow/models #7843

Relation to the original XLNet implementation?

Hi @saberkun, @zihangdai, @graykode, @bzantium The original [zihangdai/XLNet](https://github.com/zihangdai/xlnet) repository doesn't get any update recently. Should we assume that the XLNet impleme…

vochicong updated 4 years ago
2
