google / tf-quant-finance

High-performance TensorFlow library for quantitative finance.
Apache License 2.0

American options greeks with PDE is slow #67

Closed arthurpham closed 2 years ago

arthurpham commented 2 years ago

I haven't found an example of how to properly calculate the greeks with the PDE pricer or the MC pricer. Here is my attempt: https://colab.research.google.com/github/arthurpham/google_colab/blob/1f737238f1ba71c8c84c47bb24f55e3a97688d1f/AmericanOption_PDE_Greeks_TQF.ipynb I checked that the greeks from the PDE pricer are close to the Black-Scholes greeks (closed-form formula and AD). But the performance of the PDE greeks is very low, and I'm not sure why. I also get memory exhaustion on my Google Colab when I try to run a batch larger than 10 options.

What am I missing? Thank you.

TQF GPU, 10 options: wall time 26.640211820602417 s, options per second 0.3753723907054831

cyrilchim commented 2 years ago

Hi Arthur,

For PDE, AD is expected to take longer than finite differences because it involves solving a tridiagonal system of equations (see, e.g., the gradient of a matrix inverse). So heuristically, I'd expect the gradient to take at least 3x the pricing time in this case (I remember getting roughly this result). If you want to compute sensitivities for all grid points, AD definitely does it better than finite differences, since it can do so in one pass. There is extensive research comparing the performance of AD versus finite differences.
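To make the AD-vs-bump comparison concrete, here is a rough, self-contained sketch (it does not use the library's PDE pricer; bs_price below is just a stand-in for your price_fn): one backward pass through AD gives the delta for the whole batch, while finite differences need extra pricing calls per bump.

import tensorflow as tf

def bs_price(spot, strike=100.0, vol=0.2, rate=0.03, expiry=1.0):
  # Stand-in European call price; standard normal CDF via erfc.
  cdf = lambda x: 0.5 * tf.math.erfc(-x / tf.sqrt(2.0))
  d1 = (tf.math.log(spot / strike)
        + (rate + 0.5 * vol ** 2) * expiry) / (vol * tf.sqrt(expiry))
  d2 = d1 - vol * tf.sqrt(expiry)
  return spot * cdf(d1) - strike * tf.exp(-rate * expiry) * cdf(d2)

spot = tf.constant([100.0, 105.0])

# AD: one backward pass gives the delta for every option in the batch.
with tf.GradientTape() as tape:
  tape.watch(spot)
  price = bs_price(spot)
ad_delta = tape.gradient(price, spot)

# Finite difference: two extra pricing calls per bumped input.
eps = 1e-3
fd_delta = (bs_price(spot + eps) - bs_price(spot - eps)) / (2.0 * eps)
print(ad_delta.numpy(), fd_delta.numpy())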

Now, coming back to your colab. The issue here is that you are not using tf.function for the gradient calculation, which makes it run in eager mode. You compute the gradient as

 tff.math.fwd_gradient(lambda x: price_fn(spot=x), tf_volatility)

Try instead

 @tf.function
 def fn(tf_volatility):
  return tff.math.fwd_gradient(lambda x: price_fn(volatility=x), tf_volatility)

Keep in mind that the first function call compiles the function, so it will be slower.
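For example, a rough way to time it that excludes the tracing cost (fn and tf_volatility are the objects from the snippet above):

import time

_ = fn(tf_volatility)  # warm-up call: triggers tracing/compilation

start = time.time()
n_runs = 10
for _ in range(n_runs):
  _ = fn(tf_volatility)
print('avg seconds per call:', (time.time() - start) / n_runs)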

Also, keep in mind that you are running fwd_gradient, which is not native to TF (because of the while_loop). It is better to stick with backward gradients where possible; in your case it should give the same result. We have a notebook explaining the difference between the two.

@tf.function
def g(tf_volatility):
  return tff.math.gradients(lambda x: price_fn(volatility=x), tf_volatility)

I tried it quickly, and it seems that g computes 2x faster than fn.

As for the memory issues, it simply means there are tensors of large dimensions. You'd need to inspect the graph with TensorBoard or directly via

g.get_concrete_function(tf_volatility).graph.as_graph_def()

to understand what is going on. I can see tensors of shape [10, 3, 1022] that are not present in the graph of price_fn, which I think appear in the gradient calculation of the tridiagonal op.
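If you'd rather not read the raw GraphDef, a rough sketch along these lines can scan the concrete function's graph for large intermediate tensors (the size threshold is arbitrary):

import math

concrete = g.get_concrete_function(tf_volatility)
for op in concrete.graph.get_operations():
  for out in op.outputs:
    dims = out.shape.as_list() if out.shape.rank is not None else None
    if dims and all(d is not None for d in dims) and math.prod(dims) > 10_000:
      print(op.name, op.type, dims)  # ops producing large tensors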

Hope this helps.