Update the autograd idioms to work with JVPs, which are more efficient (though the current tanh implementation is pretty terrible, as it needlessly computes a full jacobian).
Working with the JVP is also conceptually more straightforward: compute the jacobian, then matrix-multiply it with the incoming jacobian/gradient/scalar. A sketch of both the naive and optimised versions follows below.
Optimising this is up to you; it's optional.
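A minimal NumPy sketch of the idea (the function names here are illustrative, not the repo's actual API): the naive version materialises the full jacobian and matrix-multiplies with the incoming tangent, mirroring the "pretty terrible" tanh implementation described above; the optimised version exploits the fact that tanh is elementwise, so its jacobian is diagonal and the JVP collapses to an elementwise multiply.

```python
import numpy as np

def tanh_jvp_naive(x, tangent):
    """Naive JVP: build the full jacobian, then matrix-multiply.

    For an elementwise op like tanh the jacobian is diagonal, so
    materialising the full n-by-n matrix is wasteful.
    """
    y = np.tanh(x)
    jac = np.diag(1.0 - y**2)   # full n x n jacobian of tanh
    return y, jac @ tangent     # matrix multiply with the incoming tangent

def tanh_jvp(x, tangent):
    """Optimised JVP: exploit the diagonal structure.

    Since d tanh(x)/dx = 1 - tanh(x)^2 elementwise, the
    jacobian-vector product reduces to an elementwise multiply,
    avoiding the O(n^2) matrix entirely.
    """
    y = np.tanh(x)
    return y, (1.0 - y**2) * tangent

# Both versions agree; only the cost differs.
x = np.array([0.0, 0.5, -1.0])
v = np.array([1.0, 1.0, 1.0])   # incoming tangent/gradient
_, naive = tanh_jvp_naive(x, v)
_, fast = tanh_jvp(x, v)
assert np.allclose(naive, fast)
```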