Open siddharth9820 opened 1 year ago
https://github.com/jettify/pytorch-optimizer/blob/910b414565427f0a66e20040475e7e4385e066a5/torch_optimizer/shampoo.py#L130 Shouldn't the second argument be `-0.5/order`? For example, with order 2 (a matrix-shaped parameter), the authors raise the preconditioner matrices to the power -1/4.
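
To make the arithmetic concrete, here is a small sanity check in plain Python. It is only a sketch under a simplifying assumption: every dimension has the same positive scalar statistic `g * g`, so the per-dimension factors simply multiply. The helper names `shampoo_exponent` and `combined_scaling` are hypothetical and are not part of `torch_optimizer`.

```python
def shampoo_exponent(order: int) -> float:
    """Exponent proposed in this issue: -0.5 / order, i.e. -1 / (2 * order)."""
    if order < 1:
        raise ValueError("order must be a positive integer")
    return -0.5 / order


def combined_scaling(stat: float, order: int, exponent: float) -> float:
    """Toy scalar case: apply stat ** exponent once per dimension.

    Assumes the same positive statistic `stat` for every dimension,
    so the `order` per-dimension factors multiply together.
    """
    return (stat ** exponent) ** order


g = 3.0          # a single gradient value (illustrative)
stat = g * g     # its accumulated second-moment statistic
order = 2        # matrix-shaped parameter

# With -0.5 / order each factor is stat ** -0.25; two of them give 1 / |g|.
proposed = combined_scaling(stat, order, shampoo_exponent(order))

# With -1 / order each factor is stat ** -0.5; two of them give 1 / g**2.
alternative = combined_scaling(stat, order, -1.0 / order)

print(shampoo_exponent(order), proposed, alternative)
```

In this toy case the `-0.5/order` exponent yields an overall scaling of `1/|g|`, the same scaling AdaGrad applies, whereas `-1/order` yields `1/g**2`.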