Present a simple module that can learn arithmetic functions such as add, subtract, multiply, and divide, and that generalizes well to unseen data and unseen inference schemes.
DNNs with Non-linearities Struggle to Learn Identity Function
Train an autoencoder to reconstruct inputs in the range [-5, 5].
All autoencoders share the same parameterization (3 hidden layers of size 8) and differ only in their non-linearities.
Trained with MSE loss.
Tested on [-20, 20], the error increases severely both below and above the range of numbers seen during training.
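This failure can be sketched with a toy model. Below, a one-hidden-unit tanh "autoencoder" y = w2 * tanh(w1 * x) (an assumption for illustration, not the paper's 3x8 architecture) is fit to the identity on [-5, 5] by plain gradient descent; because tanh saturates, the reconstruction error explodes at ±20:

```python
import numpy as np

x_train = np.linspace(-5.0, 5.0, 101)

# Toy model y = w2 * tanh(w1 * x); parameters trained with MSE loss.
w1, w2 = 0.1, 1.0
lr = 0.001
for _ in range(20000):
    t = np.tanh(w1 * x_train)
    err = w2 * t - x_train
    grad_w2 = np.mean(2 * err * t)
    grad_w1 = np.mean(2 * err * w2 * (1 - t**2) * x_train)
    w2 -= lr * grad_w2
    w1 -= lr * grad_w1

# In-range reconstruction is decent; outside the range, tanh saturates,
# so the output is bounded by |w2| and cannot track x = ±20.
mse_in = np.mean((w2 * np.tanh(w1 * x_train) - x_train) ** 2)
x_out = np.array([-20.0, 20.0])
mse_out = np.mean((w2 * np.tanh(w1 * x_out) - x_out) ** 2)
```

The underlying reason is structural: any network whose output passes through a bounded non-linearity is itself bounded, so identity can only be approximated within the training range.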
The Neural Accumulator (NAC) & Neural Arithmetic Logic Unit (NALU)
NAC: A special case of a linear layer, whose weight matrix W is biased towards values in {-1, 0, 1}, defined as:
W = tanh(\hat{W}) ⊙ σ(\hat{M}), where ⊙ is elementwise multiplication
The elements of W are guaranteed to lie in [-1, 1] and are biased towards {-1, 0, 1} during learning, since {-1, 0, 1} correspond to the saturation points of tanh(·) and σ(·).
Its outputs are therefore additions and subtractions of elements of the input vector, with no arbitrary rescaling.
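A minimal NAC forward pass can be sketched as follows. The saturated parameter values are hand-set for illustration (in practice \hat{W} and \hat{M} are learned):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nac(x, W_hat, M_hat):
    # Effective weights are squashed towards {-1, 0, 1}.
    W = np.tanh(W_hat) * sigmoid(M_hat)   # elementwise product
    return x @ W.T

# Hand-set saturated parameters: row 0 computes a + b, row 1 computes a - b.
W_hat = np.array([[20.0, 20.0],
                  [20.0, -20.0]])
M_hat = np.full((2, 2), 20.0)             # sigmoid(20) ≈ 1: both inputs "on"
x = np.array([3.0, 2.0])
out = nac(x, W_hat, M_hat)                # ≈ [5.0, 1.0]
```

With weights pinned near {-1, 0, 1}, each output is an exact-looking sum or difference of inputs, which is what lets the NAC extrapolate beyond the training range.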
NALU: Learns a weighted sum between two sub-cells:
One is the original NAC, capable of learning to add and subtract.
The other one operates in log space, capable of multiplication and division, e.g., log(XY) = log X + log Y; log(X/Y) = log X - log Y; exp(log(X)) = X.
Altogether, NALU can learn to perform general arithmetic operations.
Can handle either add/subtract or mult/div operations but not a combination of both.
For mult/div operations, it cannot handle negative targets, as the mult/div sub-cell's output is the result of an exponentiation operation, which always yields positive results.
Power operations are only possible when the exponent is in the range of [0, 1].
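Both limitations follow directly from the form of the log-space path, m = exp(w · log(|x| + eps)). A small illustration (eps omitted for clarity; values are hypothetical):

```python
import math

def log_space_cell(x, w):
    # One unit of the multiplicative sub-cell: exp(w * log|x|).
    return math.exp(w * math.log(abs(x)))

# A fractional weight acts as a power: w = 0.5 behaves like a square root.
sqrt_9 = log_space_cell(9.0, 0.5)            # ≈ 3.0

# The sign of the input is discarded by |x|, and exp(...) is always positive,
# so a negative target like -12 can never be produced.
positive_only = log_space_cell(-12.0, 1.0)   # ≈ 12.0, not -12.0
```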
Metadata