pfnet-research / pfhedge

PyTorch-based framework for Deep Hedging
http://pfnet-research.github.io/pfhedge/
MIT License

BUG: Invalid Price time series from Heston Process #182

Closed masanorihirano closed 3 years ago

masanorihirano commented 3 years ago

The current implementation of the Heston process can generate price time series containing non-positive prices because of the discrete approximation.

https://github.com/pfnet-research/pfhedge/blob/main/pfhedge/stochastic/heston.py#L103

For example, we can face this issue with this code:

import torch
from pfhedge.instruments import BrownianStock
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron
torch.autograd.set_detect_anomaly(True)

def main():
    torch.manual_seed(2)
    # stock = BrownianStock(cost=1e-4, volatility=0.2)
    stock = HestonStock()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(
        model, inputs=["log_moneyness", "expiry_time", "prev_hedge"])

    hedger.fit(derivative, n_epochs=200)

    price = hedger.price(derivative)

    print(price)

if __name__ == '__main__':
    main()

In this code:

  1. the Heston process generates a non-positive value;
  2. the non-positive spot price makes the log moneyness -inf;
  3. the MLP returns nan because of the invalid input;
  4. backpropagation works incorrectly.
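
A minimal sketch of this failure chain, using a hypothetical zero spot and a stand-in sigmoid in place of the actual MLP:

```py
import torch

spot = torch.tensor(0.0, requires_grad=True)  # 1. non-positive spot from the simulation
log_moneyness = (spot / 1.0).log()            # 2. log(0.0) = -inf (strike assumed to be 1.0)
y = torch.sigmoid(log_moneyness)              # 3. forward pass: sigmoid(-inf) = 0, still finite
y.backward()                                  # 4. chain rule multiplies 0 by 1/spot = inf
print(log_moneyness)  # tensor(-inf, grad_fn=<LogBackward0>)
print(spot.grad)      # tensor(nan)
```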

The current implementation uses a discrete approximation of the following equations:

$$dS(t) = S(t)\sqrt{V(t)}\,dW_1(t)$$

$$dV(t) = \kappa(\theta - V(t))\,dt + \sigma\sqrt{V(t)}\,dW_2(t)$$

However, as the following calculation shows, the first equation can be put in a different form.
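
Applying Itô's lemma to $\log S(t)$ gives

$$d\log S(t) = \frac{dS(t)}{S(t)} - \frac{1}{2}\frac{(dS(t))^2}{S(t)^2} = -\frac{V(t)}{2}\,dt + \sqrt{V(t)}\,dW_1(t)$$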

This transformation enables us to implement the Heston process geometrically and avoid the current issue.

masanorihirano commented 3 years ago

The fundamental problem was different.

At first, I had missed the details of this discrete approximation, but now I have read the reference and confirmed that it is indeed implemented geometrically.

The actual problem is that the price became so small that it was rounded to 0.0 as a result of the computation.

In the code I wrote above, exp(-104.7081) was computed as 0.0 in the line at the following link, so the subsequent log produced -inf. https://github.com/pfnet-research/pfhedge/blob/main/pfhedge/stochastic/heston.py#L118
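
A quick sketch of the underflow: exp(-104.7081) is about 3e-46, below float32's smallest positive subnormal (about 1.4e-45), so the spot rounds to exactly 0.0 and its log is -inf.

```py
import torch

log_spot = torch.tensor(-104.7081, dtype=torch.float32)
spot = log_spot.exp()   # underflows to exactly 0.0 in float32
print(spot)             # tensor(0.)
print(spot.log())       # tensor(-inf), which later poisons log_moneyness
```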

Now I am a little unsure about how to handle this error. Practically, should I change the initial price (which is not a fundamental fix), fix pfhedge, or something else?

ghost commented 3 years ago

The cause of the too-small spot seems to be an extremely large variance generated by generate_cir. I'll inspect the cause of the large variance. https://github.com/pfnet-research/pfhedge/blob/9434de4c75d44c2c646e09f24a029b0f1687f3cb/pfhedge/stochastic/heston.py#L85

These plots show the spot and variance of the paths for which log(spot) becomes nan (figures: s.png and v.png, produced by the script below).

As shown in the plot, v1 = variance[:, i_step + 1] is extremely large at i_step=8, so k2 * v1 dominates the following expression for log_spot[:, i_step=9]. Since k2 = -0.35 < 0 is negative, log_spot[:, i_step=9] becomes largely negative and so the spot becomes extremely small.

https://github.com/pfnet-research/pfhedge/blob/9434de4c75d44c2c646e09f24a029b0f1687f3cb/pfhedge/stochastic/heston.py#L102-L116

code to plot

```py
import torch
import numpy as np
from pfhedge.instruments import BrownianStock
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron
from pfhedge.nn import ExpectedShortfall

torch.autograd.set_detect_anomaly(True)


def main():
    torch.manual_seed(2)
    stock = HestonStock()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(
        model,
        inputs=["log_moneyness", "expiry_time", "prev_hedge"],
        criterion=ExpectedShortfall(),
    )

    try:
        hedger.fit(derivative, n_epochs=200)
    except RuntimeError:
        import matplotlib.pyplot as plt

        s = derivative.underlier.spot
        v = derivative.underlier.variance
        i = s.log().isfinite().logical_not().any(1)
        s = s[i]
        v = v[i]

        plt.figure()
        plt.plot(s.T)
        plt.locator_params(axis="x", integer=True, min_n_ticks=20)
        plt.title("spot")
        plt.savefig("s.png")

        plt.figure()
        plt.plot(v.T)
        plt.locator_params(axis="x", integer=True, min_n_ticks=20)
        plt.title("variance")
        plt.savefig("v.png")

    kappa: float = 1.0
    theta: float = 0.04
    sigma: float = 2.0
    rho: float = -0.7
    dt: float = 1 / 250
    GAMMA1 = 1 / 2
    GAMMA2 = 1 / 2

    k0 = -rho * kappa * theta * dt / sigma
    k1 = GAMMA1 * dt * (kappa * rho / sigma - 0.5) - rho / sigma
    k2 = GAMMA2 * dt * (kappa * rho / sigma - 0.5) + rho / sigma
    k3 = GAMMA1 * dt * (1 - rho ** 2)
    k4 = GAMMA2 * dt * (1 - rho ** 2)
    print("k0", k0)
    print("k1", k1)
    print("k2", k2)
    print("k3", k3)
    print("k4", k4)


if __name__ == "__main__":
    main()
```

ghost commented 3 years ago

Just an FYI: with torch.set_default_dtype(torch.float64), the code ran without a RuntimeError. (It does not essentially fix the issue, though...)

update: retracted

masanorihirano commented 3 years ago

Even if we use torch.set_default_dtype(torch.float64), the same issue happens:

import torch
from pfhedge.instruments import BrownianStock
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron
torch.autograd.set_detect_anomaly(True)
torch.set_default_dtype(torch.float64)

def main():
    torch.manual_seed(0)
    # stock = BrownianStock(cost=1e-4, volatility=0.2)
    stock = HestonStock()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(
        model, inputs=["log_moneyness", "expiry_time", "prev_hedge"])

    hedger.fit(derivative, n_epochs=200)

    price = hedger.price(derivative)

    print(price)

if __name__ == '__main__':
    main()

masanorihirano commented 3 years ago

The fundamental issue occurred here: https://github.com/pfnet-research/pfhedge/blob/9434de4c75d44c2c646e09f24a029b0f1687f3cb/pfhedge/stochastic/cir.py#L97-L102

According to the reference, these lines seem correct. However, when u is very close to 1, 1 - u becomes almost 0. next_1 then takes the logarithm of this near-zero value, which produces rather large numbers. Because u is drawn from a uniform distribution, I think this issue cannot be avoided.
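
A small numeric sketch of this concern, with hypothetical values of p and beta (not taken from a real simulation):

```py
import torch

EPSILON = 1e-8
p = torch.tensor(0.95)      # hypothetical tail probability
beta = torch.tensor(10.0)   # hypothetical (1 - p) / m
for u in (0.96, 0.9999, 1.0 - 1e-7):
    u = torch.tensor(u)
    pinv = ((1 - p) / (1 - u + EPSILON)).log() / beta
    print(f"u={float(u):.7f}  pinv={float(pinv):.3f}")
# pinv grows like log(1 / (1 - u)) as u -> 1, but EPSILON caps the log
# at about log((1 - p) / EPSILON) ~ 15.4 here.
```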

ghost commented 3 years ago

There's a bug here.

https://github.com/pfnet-research/pfhedge/blob/9434de4c75d44c2c646e09f24a029b0f1687f3cb/pfhedge/stochastic/cir.py#L86-L89

Here the first tensor has size (n_paths,) while the second is a scalar of size (). I wrongly assumed it was equivalent to the following expression,

s2 = (
    v * (tensor_sigma ** 2) * exp * (1 - exp) / tensor_kappa
    + tensor_theta * (tensor_sigma ** 2) * ((1 - exp) ** 2) / (2 * tensor_kappa)
)

But the results are different: Python's built-in sum(iterable, start) iterates over its first argument and adds each element to start, so the first tensor is unintentionally summed over the paths.

import torch

a = torch.full((5,), 3)
b = torch.tensor(2)
result = sum(a, b)  # built-in sum: 3 + 3 + 3 + 3 + 3 + 2, reduced to a scalar
expect = a + b      # intended element-wise addition with broadcasting

print(result)
# tensor(17)
print(expect)
# tensor([5, 5, 5, 5, 5])

Consequently, s^2 is O(n_paths) times greater than the correct value, and eventually pinv ~ 1 / beta is O(n_paths) times greater.
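
A rough numeric sketch of that scaling, with hypothetical m and s2, and with u taken at the same quantile of the tail in both cases:

```py
import torch

m = torch.tensor(0.04)          # hypothetical per-step mean of the CIR process
for inflate in (1, 1000):       # correct s2 vs. s2 wrongly summed over 1000 paths
    s2 = torch.tensor(0.01) * inflate
    psi = s2 / m ** 2
    p = (psi - 1) / (psi + 1)
    beta = (1 - p) / m
    u = (1 + p) / 2             # a draw in the exponential tail (u > p)
    pinv = ((1 - p) / (1 - u)).log() / beta
    print(f"inflate={inflate:>4}  psi={float(psi):.1f}  pinv={float(pinv):.2f}")
# Inflating s2 by 1000 pushes p toward 1 and shrinks beta, blowing pinv
# up from about 0.10 to about 87.
```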

I don't think ((1 - p) / (1 - u + EPSILON)).log() is the cause, because EPSILON keeps the log at most log((1 - 0.95) / EPSILON) ~ 15, which is not that seriously large (and is cancelled by the small 1/beta anyway).

If I substitute sum(...) with the correct expression ... + ..., the issue is resolved (see "fix" below). May I patch this to pfhedge (it will have to be after tomorrow, though), or would you like to?

fix

```py
import torch
from torch import Tensor
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron
from typing import Optional
from typing import Tuple

torch.autograd.set_detect_anomaly(True)


def generate_cir(
    n_paths: int,
    n_steps: int,
    init_value: float = 0.04,
    kappa: float = 1.0,
    theta: float = 0.04,
    sigma: float = 2.0,
    dt: float = 1 / 250,
    dtype: torch.dtype = None,
    device: torch.device = None,
) -> Tensor:
    # PSI_CRIT in [1.0, 2.0]. See section 3.2.3
    PSI_CRIT = 1.5
    # Prevent zero division
    EPSILON = 1e-8

    output = torch.empty((n_paths, n_steps), dtype=dtype, device=device)
    output[:, 0] = init_value

    randn = torch.randn_like(output)
    rand = torch.rand_like(output)

    tensor_kappa = torch.tensor(kappa, dtype=dtype, device=device)
    tensor_theta = torch.tensor(theta, dtype=dtype, device=device)
    tensor_sigma = torch.tensor(sigma, dtype=dtype, device=device)
    tensor_dt = torch.tensor(dt, dtype=dtype, device=device)

    for i_step in range(n_steps - 1):
        v = output[:, i_step]
        # Compute m, s, psi: Eq(17,18)
        exp = (-tensor_kappa * tensor_dt).exp()
        m = tensor_theta + (v - tensor_theta) * exp
        # s2 = sum(
        #     v * (tensor_sigma ** 2) * exp * (1 - exp) / tensor_kappa,
        #     tensor_theta * (tensor_sigma ** 2) * ((1 - exp) ** 2) / (2 * tensor_kappa),
        # )
        s2 = v * (tensor_sigma ** 2) * exp * (1 - exp) / tensor_kappa + tensor_theta * (
            tensor_sigma ** 2
        ) * ((1 - exp) ** 2) / (2 * tensor_kappa)
        psi = s2 / (m ** 2 + EPSILON)

        # Compute V(t + dt) where psi <= PSI_CRIT: Eq(23, 27, 28)
        b = ((2 / psi) - 1 + (2 / psi).sqrt() * (2 / psi - 1).sqrt()).sqrt()
        a = m / (1 + b ** 2)
        next_0 = a * (b + randn[:, i_step]) ** 2

        # Compute V(t + dt) where psi > PSI_CRIT: Eq(25)
        u = rand[:, i_step]
        p = (psi - 1) / (psi + 1)
        beta = (1 - p) / (m + EPSILON)
        pinv = ((1 - p) / (1 - u + EPSILON)).log() / beta
        next_1 = torch.where(u > p, pinv, torch.zeros_like(u))

        output[:, i_step + 1] = torch.where(psi <= PSI_CRIT, next_0, next_1)

    return output


def generate_heston(
    n_paths: int,
    n_steps: int,
    init_state: Tuple[float, float] = (1.0, 0.04),
    kappa: float = 1.0,
    theta: float = 0.04,
    sigma: float = 2.0,
    rho: float = -0.7,
    dt: float = 1 / 250,
    dtype: Optional[torch.dtype] = None,
    device: Optional[torch.device] = None,
) -> Tuple[Tensor, Tensor]:
    GAMMA1 = 0.5
    GAMMA2 = 0.5

    init_spot, init_var = init_state

    variance = generate_cir(
        n_paths=n_paths,
        n_steps=n_steps,
        init_value=init_var,
        kappa=kappa,
        theta=theta,
        sigma=sigma,
        dt=dt,
        dtype=dtype,
        device=device,
    )

    log_spot = torch.empty_like(variance)
    log_spot[:, 0] = torch.tensor(init_spot).log()
    randn = torch.randn_like(variance)

    for i_step in range(n_steps - 1):
        # Compute log S(t + 1): Eq(33)
        k0 = -rho * kappa * theta * dt / sigma
        k1 = GAMMA1 * dt * (kappa * rho / sigma - 0.5) - rho / sigma
        k2 = GAMMA2 * dt * (kappa * rho / sigma - 0.5) + rho / sigma
        k3 = GAMMA1 * dt * (1 - rho ** 2)
        k4 = GAMMA2 * dt * (1 - rho ** 2)
        v0 = variance[:, i_step]
        v1 = variance[:, i_step + 1]
        log_spot[:, i_step + 1] = sum(
            (
                log_spot[:, i_step],
                k0 + k1 * v0 + k2 * v1,
                (k3 * v0 + k4 * v1).sqrt() * randn[:, i_step],
            )
        )

    return (log_spot.exp(), variance)


class HestonStockFixed(HestonStock):
    def simulate(
        self,
        n_paths: int = 1,
        time_horizon: float = 20 / 250,
        init_state: Optional[tuple] = None,
    ) -> None:
        if init_state is None:
            init_state = self.default_init_state
        spot, variance = generate_heston(
            n_paths=n_paths,
            n_steps=int(time_horizon / self.dt),
            init_state=init_state,
            kappa=self.kappa,
            theta=self.theta,
            sigma=self.sigma,
            rho=self.rho,
            dt=self.dt,
            dtype=self.dtype,
            device=self.device,
        )
        self.register_buffer("spot", spot)
        self.register_buffer("variance", variance)


def main():
    torch.manual_seed(2)
    # stock = BrownianStock(cost=1e-4, volatility=0.2)
    # stock = HestonStock()
    stock = HestonStockFixed()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(model, inputs=["log_moneyness", "expiry_time", "prev_hedge"])

    hedger.fit(derivative, n_epochs=200)

    price = hedger.price(derivative)

    print(price)


if __name__ == "__main__":
    main()
```

update: typo corrected

masanorihirano commented 3 years ago

I also found the same issue as #184. Thus, I'll make a PR including the fix.

masanorihirano commented 3 years ago

I mean, the same issue as #184 in CIR.

ghost commented 3 years ago

Thank you so much, @masanorihirano !!

ghost commented 3 years ago

Resolved by #186

ghost commented 3 years ago

@masanorihirano Resolved with the latest release, 0.7.5. Again, thank you so much for the really quick and pertinent fix!!