kostaleonard / great-model-theory

A deep learning library for Scala.
MIT License

Consider ways to improve the numerical stability of DifferentiableFunctions like Reciprocal #76

Open kostaleonard opened 1 year ago

kostaleonard commented 1 year ago

As a machine learning engineer, I want my neural network training to be numerically stable so that I'm not surprised by errors and other unexpected behavior at prediction time.

A cursory look at autograd suggests that the automatic differentiation mechanism itself is not responsible for numerical stability. I'm not even sure TensorFlow does anything to guarantee it; stability may simply be the user's responsibility when writing custom layers. In practice, this isn't too big of a problem here, because the only layer that uses Reciprocal is Sigmoid, whose denominator 1 + e^(-x) is always greater than 1, so the reciprocal never operates near zero and the function has no discontinuities.
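For illustration, here is a minimal Scala sketch of the two kinds of user-side fixes alluded to above: an epsilon-stabilized reciprocal, and a sigmoid rewritten so its exponential never overflows. Note that `StableOps`, `stableReciprocal`, `stableSigmoid`, and the epsilon value are all hypothetical names for this sketch, not part of great-model-theory's actual API.

```scala
// Hypothetical sketch of common stabilization tricks; none of these names
// come from great-model-theory's actual API.
object StableOps {
  // A small constant that bounds how close a denominator may get to zero.
  val Epsilon: Float = 1e-7f

  // Epsilon-stabilized reciprocal: shift x away from zero in the direction
  // of its sign, so 1 / x stays finite at the cost of a small bias.
  def stableReciprocal(x: Float): Float = {
    val shifted = if (x >= 0.0f) x + Epsilon else x - Epsilon
    1.0f / shifted
  }

  // Sigmoid written so exp never overflows: for negative x, use the
  // algebraically equivalent form exp(x) / (1 + exp(x)), whose exponent
  // is never positive.
  def stableSigmoid(x: Float): Float =
    if (x >= 0.0f) 1.0f / (1.0f + math.exp(-x).toFloat)
    else {
      val e = math.exp(x).toFloat
      e / (1.0f + e)
    }
}
```

The first function is the kind of per-layer guard a user would add around Reciprocal; the second shows that the better fix is sometimes an algebraic rewrite of the layer rather than a guard inside the differentiation machinery.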