Update the autograd idioms to work with JVPs, which are more efficient (though the current tanh implementation is pretty terrible, as it needlessly computes a full jacobian).
Working with the JVP is also conceptually more straightforward: compute the jacobian, then matrix-multiply it with the incoming jacobian/gradient/scalar. A sketch of both the naive and optimised versions follows below.
Optimising this is up to you; it's optional.
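A minimal NumPy sketch of the idea (the function names here are illustrative, not the repo's actual API): the naive version materialises the full jacobian and matrix-multiplies with the incoming tangent, mirroring the "pretty terrible" tanh implementation described above; the optimised version exploits the fact that tanh is elementwise, so its jacobian is diagonal and the JVP collapses to an elementwise multiply.

```python
import numpy as np

def tanh_jvp_naive(x, tangent):
    """Naive JVP: build the full jacobian, then matrix-multiply.

    For an elementwise op like tanh the jacobian is diagonal, so
    materialising the full n-by-n matrix is wasteful.
    """
    y = np.tanh(x)
    jac = np.diag(1.0 - y**2)   # full n x n jacobian of tanh
    return y, jac @ tangent     # matrix multiply with the incoming tangent

def tanh_jvp(x, tangent):
    """Optimised JVP: exploit the diagonal structure.

    Since d tanh(x)/dx = 1 - tanh(x)^2 elementwise, the
    jacobian-vector product reduces to an elementwise multiply,
    avoiding the O(n^2) matrix entirely.
    """
    y = np.tanh(x)
    return y, (1.0 - y**2) * tangent

# Both versions agree; only the cost differs.
x = np.array([0.0, 0.5, -1.0])
v = np.array([1.0, 1.0, 1.0])   # incoming tangent/gradient
_, naive = tanh_jvp_naive(x, v)
_, fast = tanh_jvp(x, v)
assert np.allclose(naive, fast)
```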