google / tf-quant-finance

High-performance TensorFlow library for quantitative finance.
Apache License 2.0

American options greeks with PDE is slow #67

Closed arthurpham closed 2 years ago

arthurpham commented 2 years ago

I haven't found an example of how to properly calculate the greeks with the PDE pricer or the MC pricer. Here is my attempt: https://colab.research.google.com/github/arthurpham/google_colab/blob/1f737238f1ba71c8c84c47bb24f55e3a97688d1f/AmericanOption_PDE_Greeks_TQF.ipynb I checked that the greeks from the PDE pricer are close to the Black-Scholes greeks (closed-form formula and AD). But the performance of the PDE greeks is very low, and I'm not sure why. I also get memory exhaustion on my Google Colab when I try to run a batch larger than 10 options.

What am I missing? Thank you.

TQF GPU, 10 options: wall time 26.640211820602417 s, options per second 0.3753723907054831

cyrilchim commented 2 years ago

Hi Arthur,

For PDE, AD is expected to take longer than finite differences because it involves solving a tridiagonal system of equations (see, e.g., the gradient of a matrix inverse). So heuristically, I'd expect the gradient to take at least 3x the pricing time in this case (I remember getting roughly this result). If you want to compute sensitivities for all grid points, AD definitely does it better than finite differences, since it can do so in one pass. There is extensive research comparing the performance of AD versus finite differences.
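To make the AD-vs-bump comparison concrete, here is a rough, self-contained sketch (it does not use the library's PDE pricer; bs_price below is just a stand-in for your price_fn): one backward pass through AD gives the delta for the whole batch, while finite differences need extra pricing calls per bump.

import tensorflow as tf

def bs_price(spot, strike=100.0, vol=0.2, rate=0.03, expiry=1.0):
  # Stand-in European call price; standard normal CDF via erfc.
  cdf = lambda x: 0.5 * tf.math.erfc(-x / tf.sqrt(2.0))
  d1 = (tf.math.log(spot / strike)
        + (rate + 0.5 * vol ** 2) * expiry) / (vol * tf.sqrt(expiry))
  d2 = d1 - vol * tf.sqrt(expiry)
  return spot * cdf(d1) - strike * tf.exp(-rate * expiry) * cdf(d2)

spot = tf.constant([100.0, 105.0])

# AD: one backward pass gives the delta for every option in the batch.
with tf.GradientTape() as tape:
  tape.watch(spot)
  price = bs_price(spot)
ad_delta = tape.gradient(price, spot)

# Finite difference: two extra pricing calls per bumped input.
eps = 1e-3
fd_delta = (bs_price(spot + eps) - bs_price(spot - eps)) / (2.0 * eps)
print(ad_delta.numpy(), fd_delta.numpy())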

Now, coming back to your colab. The issue here is that you are not using tf.function for the gradient calculation, which makes it run in eager mode. You compute the gradient as

 tff.math.fwd_gradient(lambda x: price_fn(spot=x), tf_volatility)

Try instead

 @tf.function
 def fn(tf_volatility):
  return tff.math.fwd_gradient(lambda x: price_fn(volatility=x), tf_volatility)

Keep in mind that the first function call compiles the function, so it will be slower.
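For example, a rough way to time it that excludes the tracing cost (fn and tf_volatility are the objects from the snippet above):

import time

_ = fn(tf_volatility)  # warm-up call: triggers tracing/compilation

start = time.time()
n_runs = 10
for _ in range(n_runs):
  _ = fn(tf_volatility)
print('avg seconds per call:', (time.time() - start) / n_runs)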

Also, keep in mind that you are running fwd_gradient, which is not native to TF (because of the while_loop). It is better to stick with backward gradients where possible; in your case it should give the same result. We have a notebook explaining the difference between the two.

@tf.function
def g(tf_volatility):
  return tff.math.gradients(lambda x: price_fn(volatility=x), tf_volatility)

I tried it quickly, and it seems that g computes 2x faster than fn.

As for the memory issues, it simply means there are tensors of large dimensions. You'd need to inspect the graph with TensorBoard or directly via

g.get_concrete_function(tf_volatility).graph.as_graph_def()

to understand what is going on. I can see tensors of shape [10, 3, 1022] that are not present in the graph of price_fn, which I think appear in the gradient calculation of the tridiagonal op.
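If you'd rather not read the raw GraphDef, a rough sketch along these lines can scan the concrete function's graph for large intermediate tensors (the size threshold is arbitrary):

import math

concrete = g.get_concrete_function(tf_volatility)
for op in concrete.graph.get_operations():
  for out in op.outputs:
    dims = out.shape.as_list() if out.shape.rank is not None else None
    if dims and all(d is not None for d in dims) and math.prod(dims) > 10_000:
      print(op.name, op.type, dims)  # ops producing large tensors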

Hope this helps.