Discovering the Black-Scholes PDE using `pysindy`

QBatista commented 9 months ago

Hi, I am trying to use pysindy to discover the Black-Scholes PDE from simulated data.

I can identify the PDE successfully if I proceed as follows:

Define a spatial grid S and a temporal grid t
Compute the price of an option for each point on the cartesian product of the spatial and temporal grid.
Reparametrize the state space to eliminate dependence on S (as in here )
Fit a SINDy model following examples in the documentation

Code

```python import numpy as np from scipy.stats import norm import pysindy as ps # Parameters μ = 0.1 M = 1 n = 1000 T = 1 S_0 = 100 σ = 0.3 K = 100 r = 0.01 dt = T / n τ = (T - np.linspace(0, T - dt, num=n)).reshape((1, -1)) S = np.linspace(50, 150, num=1000).reshape((-1, 1)) # Black-Scholes model Φ = norm.cdf def bsm(S, K, τ, r, σ): d1 = (np.log(S/K) + (r + σ**2/2)*τ) / (σ*np.sqrt(τ)) d2 = d1 - σ * np.sqrt(τ) return S * Φ(d1) - K * np.exp(-r*τ)* Φ(d2) # Simulate option price V_t = bsm(S, K, τ, r, σ) # Re-parametrize log_S = np.log(S) t = σ ** 2 / 2 * (T - τ) # Define PDE library library_functions = [lambda x: x] library_function_names = [lambda x: x] pde_lib = ps.PDELibrary( library_functions=library_functions, function_names=library_function_names, derivative_order=3, spatial_grid=log_S.ravel(), include_interaction=False ) # Define optimizer optimizer = ps.STLSQ(threshold=0) # Define model model = ps.SINDy(feature_library=pde_lib, optimizer=optimizer, feature_names=["V"]) # Fit model model.fit(V_t[:, :, np.newaxis], t=t.ravel()) model.print() ```

The problem with this approach is that I would like to use a simulated path for the spatial variable (from a geometric Brownian motion) instead of a grid. Is this possible using pysindy?

Also, is it possible to identify a PDE with a dependence on the spatial variable (i.e. skip step 3)?

Jacob-Stevens-Haas commented 9 months ago

is it possible to identify a PDE with a dependence on the spatial variable?

Yes - use it as a control variable, e.g. model.fit(..., u=var, ...) for any variable which you don't want to fit an equation for, but is used in other differential equations.

Reparametrize the state space to eliminate dependence on S

Yes. If I understand the linked tutorial correctly, the measured values V=v and it's simply a matter of passing the different spatial grid (x) and time points (tau) to SINDy. It looks like that's what you've attempted in the code with spatial_grid=log_S.ravel(), and

τ = (T - np.linspace(0, T - dt, num=n)).reshape((1, -1))
t = σ ** 2 / 2 * (T - τ)
model.fit(V_t[:, :, np.newaxis], t=t.ravel())

I'm not 100% sure here, but there might be a bug in how you compute tau and t here. The values of τ range from T to 0, and you use that to simulate V_t, but then reverse the order again in t = σ ** 2 / 2 * (T - τ) (which I assume is meant to actually be the τ from the link).

QBatista commented 9 months ago

@Jacob-Stevens-Haas Thanks for the reply, it's very helpful.

I'm not 100% sure here, but there might be a bug in how you compute tau and t here. The values of τ range from T to 0, and you use that to simulate V_t, but then reverse the order again in t = σ ** 2 / 2 * (T - τ) (which I assume is meant to actually be the τ from the link).

The sign of the time derivative is flipped due to the way bsm is specified, but the functional form and coefficients are correctly recovered. The following code is perhaps easier to follow:

Code

```python import numpy as np from scipy.stats import norm import pysindy as ps # Parameters n = 1000 T = 1 σ = 0.3 K = 100 r = 0.01 dt = T / n τ = np.linspace(σ ** 2 * dt / 2 , σ ** 2 * (T - dt) / 2, num=n).reshape((1, -1)) x = np.linspace(1, 5, num=n).reshape((-1, 1)) # Black-Scholes model Φ = norm.cdf def bsm(S, K, t, r, σ): τ = T - t d1 = (np.log(S / K) + (r + σ ** 2 / 2) * τ) / (σ * np.sqrt(τ)) d2 = d1 - σ * np.sqrt(τ) return S * Φ(d1) - K * np.exp(-r * τ)* Φ(d2) def v(x, K, τ, r, σ): return bsm(np.exp(x), K, T - 2 * τ / σ ** 2, r, σ) # Simulate option price v_τ = v(x, K, τ, r, σ) # Define PDE library library_functions = [lambda x: x] library_function_names = [lambda x: x] pde_lib = ps.PDELibrary( library_functions=library_functions, function_names=library_function_names, derivative_order=3, spatial_grid=x.ravel(), include_interaction=False ) # Define optimizer optimizer = ps.STLSQ(threshold=0) # Define model model = ps.SINDy(feature_library=pde_lib, optimizer=optimizer, feature_names=["v"]) # Fit model model.fit(v_τ[:, :, np.newaxis], t=τ.ravel()) model.print() ```

Now suppose that instead of

τ = np.linspace(σ ** 2 * dt / 2 , σ ** 2 * (T - dt) / 2, num=n).reshape((1, -1))
x = np.linspace(1, 5, num=n).reshape((-1, 1))
v_τ = v(x, K, τ, r, σ)

I only have something like

τ = np.linspace(σ ** 2 * dt / 2 , σ ** 2 * (T - dt) / 2, num=n)
x_τ = np.log(100) + np.cumsum((μ - σ ** 2 / 2) * dt + σ * np.random.normal(0, np.sqrt(dt), size=(n)))
v_τ = v(x_τ, K, τ, r, σ)

so that v_τ is one dimensional instead of two-dimensional and the spatial variable varies with time. Is it possible to use SINDy in this case?

Jacob-Stevens-Haas commented 9 months ago

I don't think it is, at least not easily.

For ODE modeling, pysindy will assume the second to last axis is the time axis and calculate derivatives for the lhs of the regression. If you have a spatiotemporal random walk like that, then those datapoint-to-datapoint derivatives are actually directional derivatives.

OTOH if you want PDE terms, you would need to provide v for all points in the spatiotemporal grid - not just along a random walk.

dynamicslab / pysindy

Discovering the Black-Scholes PDE using `pysindy` #411