Open zc12345 opened 1 year ago
inspiration:
def train(weight, gradient, momentum, lr):
  update = interp(gradient, momentum, β1)
  update = sign(update)
  momentum = interp(gradient, momentum, β2)
  weight_decay = weight * λ
  update = update + weight_decay
  update = update * lr
  return update, momentum
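
For anyone who wants to run the pseudocode above, here is a minimal pure-Python sketch. Several things are assumptions on my part and not stated in the snippet: `interp(x, y, a)` is taken to be the linear interpolation `(1 - a) * x + a * y`, the defaults `beta1=0.9` and `beta2=0.99` are the values I recall being commonly used with Lion, the name `lion_step` is hypothetical, and the final step subtracts the returned update from the weights (the pseudocode only returns the update).

```python
def interp(x, y, a):
    # Assumed definition: linear interpolation between x and y.
    return (1.0 - a) * x + a * y


def sign(x):
    # Returns -1, 0, or 1.
    return (x > 0) - (x < 0)


def lion_step(weight, gradient, momentum, lr,
              beta1=0.9, beta2=0.99, weight_decay=0.0):
    """One Lion-style step over lists of floats (illustrative sketch only)."""
    new_weight, new_momentum = [], []
    for w, g, m in zip(weight, gradient, momentum):
        # Direction comes from the sign of the beta1-interpolated value.
        update = sign(interp(g, m, beta1))
        # Momentum is tracked separately with beta2.
        m_next = interp(g, m, beta2)
        # Decoupled weight decay, then scale by the learning rate.
        update = (update + w * weight_decay) * lr
        # Assumption: the caller applies the update by subtraction.
        new_weight.append(w - update)
        new_momentum.append(m_next)
    return new_weight, new_momentum


weights, momentum = lion_step([1.0, -2.0], [0.5, -0.3], [0.0, 0.0], lr=0.1)
print(weights, momentum)
```

Because of the `sign`, every coordinate moves by exactly `lr` (plus the weight decay term), regardless of the gradient magnitude.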
ImageNet SOTA
inspiration: BASIC, Model soups, Lion Optimizer