KindXiaoming / pykan

Kolmogorov Arnold Networks
MIT License

Fitting a seemingly simple smooth function #97

Closed tvatter closed 5 months ago

tvatter commented 5 months ago

I'm trying to fit a vanilla Black-Scholes model as follows:

import numpy as np
import pandas as pd
import plotnine as pn
import torch

from kan import create_dataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def black_scholes_price(k, T, r=0.0, sigma=0.2):
    """Calculate Black Scholes option price."""
    m = torch.distributions.Normal(0, 1)
    d1 = (k + (r + sigma**2 / 2) * T) / (sigma * torch.sqrt(T))
    d2 = d1 - sigma * torch.sqrt(T)
    price = m.cdf(d1) - torch.exp(-k - r * T) * m.cdf(d2)

    return price

f = lambda x: black_scholes_price(x[:, 0], x[:, 1])
dataset = create_dataset(f, n_var=2, ranges=np.array([[-1, 1], [1 / 52, 2]]))

(
    pn.ggplot(
        pd.DataFrame(
            torch.concatenate(
                [dataset["test_input"], dataset["test_label"].reshape(-1, 1)], dim=1
            )
            .detach()
            .numpy(),
            columns=["k", "T", "price"],
        ),
        pn.aes(x="k", y="price", color="T"),
    )
    + pn.geom_point()
    + pn.theme_minimal()
)

[Image: scatter plot of price against k, colored by T]
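For reference, the quantity computed by black_scholes_price above can be written out as follows, with \Phi the standard normal CDF and the defaults r = 0 and \sigma = 0.2. Reading k as log-moneyness \ln(S/K) and the output as the call price divided by spot is an interpretation of the code, not something stated in the thread.

```latex
\begin{aligned}
d_1 &= \frac{k + \left(r + \sigma^2/2\right) T}{\sigma\sqrt{T}}, \\
d_2 &= d_1 - \sigma\sqrt{T}, \\
\mathrm{price}(k, T) &= \Phi(d_1) - e^{-k - rT}\,\Phi(d_2).
\end{aligned}
```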

However, when trying a simple KAN model as follows, the loss immediately plateaus and nothing much happens:


from kan import KAN

model = KAN(
    width=[2, 1],
    grid=20,
    k=3,
    device=device,
)
model.to(device)
model.train(dataset, opt="LBFGS", steps=5, device=device, lamb=0.0)
description:   0%|                                                            | 0/5 [00:00<?, ?it/s]
train loss: 2.22e-01 | test loss: 2.20e-01 | reg: 1.83e+00 : 100%|████| 5/5 [00:00<00:00,  5.56it/s]
{'train_loss': [array(0.22165689, dtype=float32),
  array(0.22164175, dtype=float32),
  array(0.22164172, dtype=float32),
  array(0.22164172, dtype=float32),
  array(0.22164172, dtype=float32)],
 'test_loss': [array(0.21969095, dtype=float32),
  array(0.21968047, dtype=float32),
  array(0.21967997, dtype=float32),
  array(0.21967997, dtype=float32),
  array(0.21967997, dtype=float32)],
 'reg': [array(1.8225945, dtype=float32),
  array(1.8297039, dtype=float32),
  array(1.8298446, dtype=float32),
  array(1.8298446, dtype=float32),
  array(1.8298446, dtype=float32)]}

Note that I've tried various configurations of width, grid, k, lambda, etc., and I'm not sure what I am doing wrong.

KindXiaoming commented 5 months ago

Hi, could you please make a 2D heat plot to show what the target function looks like?

tvatter commented 5 months ago

import matplotlib.pyplot as plt
from matplotlib import cm

x, y = (
    dataset["test_input"][:, 0].detach().cpu().numpy(),
    dataset["test_input"][:, 1].detach().cpu().numpy(),
)
z = dataset["test_label"].detach().cpu().numpy()
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.view_init(elev=30, azim=110, roll=0)
surf = ax.plot_trisurf(x, y, z, cmap=cm.jet, linewidth=0.1)
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()

[Image: 3D surface plot of price over (k, T)]

But I find it easier to see on a 1d plot as shown above :)

KindXiaoming commented 5 months ago

How about changing

f = lambda x: black_scholes_price(x[:, 0], x[:, 1])

to

f = lambda x: black_scholes_price(x[:, [0]], x[:, [1]])
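A minimal sketch of why this change can matter, using plain PyTorch rather than pykan. It assumes the training loss subtracts the label tensor from a model output of shape (N, 1); that detail is not confirmed in this thread. The tensors x and pred are made-up stand-ins, and the sum x0 + x1 replaces the Black-Scholes function purely for illustration.

```python
import torch

torch.manual_seed(0)
x = torch.rand(5, 2)      # 5 samples, 2 input variables (stand-in data)
pred = torch.rand(5, 1)   # stand-in for a model output of shape (N, 1)

# Integer indexing drops the column axis; list indexing keeps it.
label_1d = x[:, 0] + x[:, 1]        # shape (5,)
label_2d = x[:, [0]] + x[:, [1]]    # shape (5, 1)

# Subtracting a (5,) label from a (5, 1) prediction broadcasts to (5, 5),
# so every prediction is compared against every label.
diff_1d = pred - label_1d
# With a (5, 1) label the difference stays elementwise.
diff_2d = pred - label_2d

print(tuple(label_1d.shape), tuple(diff_1d.shape))
print(tuple(label_2d.shape), tuple(diff_2d.shape))
```

Under that assumption, a mean squared error over the (N, N) difference is minimized by predicting roughly the average label, which would be consistent with a loss that plateaus immediately.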
Fakasht10 commented 5 months ago

I was gonna suggest:

import numpy as np
import pandas as pd
import torch
import matplotlib.pyplot as plt

# Set the default tensor type to double precision
torch.set_default_dtype(torch.float64)

# Import the necessary module for creating the dataset
from kan import create_dataset

# Set the device to use for computation
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def black_scholes_price(k, T, r=0.0, sigma=0.2):
    """Calculate Black Scholes option price using the given parameters."""
    m = torch.distributions.Normal(0, 1)
    d1 = (k + (r + sigma**2 / 2) * T) / (sigma * torch.sqrt(T))
    d2 = d1 - sigma * torch.sqrt(T)
    price = m.cdf(d1) - torch.exp(-k - r * T) * m.cdf(d2)
    return price

# Define the function using lambda that takes a tensor of variables x
f = lambda x: black_scholes_price(x[:, [0]], x[:, [1]])

# Create the dataset with defined ranges for the variables
dataset = create_dataset(f, n_var=2, ranges=np.array([[-1, 1], [1 / 52, 2]]))

# Ensure the label tensor is 2D
if dataset['train_label'].ndim == 1:
    dataset['train_label'] = dataset['train_label'].unsqueeze(1)

# Print shapes to confirm
print("Input shape:", dataset['train_input'].shape)
print("Label shape:", dataset['train_label'].shape)

# Convert dataset to numpy for plotting
data_np = torch.cat([dataset['train_input'], dataset['train_label']], dim=1).detach().numpy()
df = pd.DataFrame(data_np, columns=['k', 'T', 'price'])

# Plotting
plt.figure(figsize=(10, 6))
scatter = plt.scatter(df['k'], df['price'], c=df['T'], cmap='viridis')
plt.colorbar(scatter, label='Time to Maturity (T)')
plt.xlabel('Strike Price (k)')
plt.ylabel('Option Price')
plt.title('Black Scholes Option Pricing Visualization')
plt.show()

This worked, but I didn't get machine precision when training, pruning, and substituting analytic functions...

KindXiaoming commented 5 months ago

Hi, to get machine precision you need both (1) enough data and (2) a large enough grid. For (2), please try grid extension for better accuracy, e.g., this example.

tvatter commented 5 months ago

f = lambda x: black_scholes_price(x[:,[0]], x[:,[1]]) solved it, thanks a lot!