Closed hallvardnmbu closed 2 months ago
Look into the optimizer steps (and other occurrences) and combine them into a single operation. Currently the looping is "duplicated", with one pass over the data per call:
X.mul_scalar_inplace(self.beta1); X.add_inplace(&gradients.mul_scalar(1.0 - self.beta1));
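A minimal sketch of what a combined operation could look like. The `Tensor` struct here is a hypothetical stand-in (a flat `Vec<f32>`), not the library's actual type, and `scale_add_inplace` is an assumed name for illustration. It computes `alpha * self + beta * other` in one loop, so the temporary produced by `gradients.mul_scalar(...)` and the second pass over `X` are both avoided.

```rust
/// Hypothetical stand-in for the library's tensor type: a flat buffer of f32.
#[derive(Debug, Clone, PartialEq)]
pub struct Tensor {
    pub data: Vec<f32>,
}

impl Tensor {
    /// Fused update (assumed name): self = alpha * self + beta * other.
    /// Runs a single loop and allocates no intermediate tensor.
    pub fn scale_add_inplace(&mut self, alpha: f32, other: &Tensor, beta: f32) {
        assert_eq!(self.data.len(), other.data.len(), "shape mismatch");
        for (x, g) in self.data.iter_mut().zip(other.data.iter()) {
            *x = alpha * *x + beta * *g;
        }
    }
}
```

With such a method, the two calls above would collapse to something like `X.scale_add_inplace(self.beta1, &gradients, 1.0 - self.beta1);`, assuming the real tensor type can expose its elements for a single pass in the same way.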