pymc-devs / pymc-resources

PyMC educational resources
MIT License
1.94k stars 741 forks source link

Sampling performance b-splines model chapter 04 #233

Open hannes-tw opened 9 months ago

hannes-tw commented 9 months ago

Hello.

I am running into some performance issues when specifying and sampling from a linear model with b-splines in PYMC (chapter 4, R Code reference 4.76). Everything works fast as long as I keep the number of knots and/or the degree low enough (below 10 knots and 2 degrees). As soon as I increase one of the parameters above the mentioned values, the model no longer samples within ~30 seconds, but takes 20 minutes+.

Has anybody else run into similar issues? I am currently puzzled by the rapid performance drop... Any help or pointers would be appreciated.

Code

import numpy as np
import pymc as pm
from patsy import dmatrix

# data: cherry_blossoms

n_knots = 10
b_spline_degree = 2
years = cherry_blossoms["year"].values
day_of_blossom = cherry_blossoms["doy"].values

knots = np.quantile(years, np.linspace(0, 1, n_knots))
b_spline_basis_dm = dmatrix(
    "bs(year, knots=knots, degree=b_spline_degree, include_intercept=True) - 1",
    {
        "year": years,
        "knots": knots[1:-1],
        "b_spline_degree": b_spline_degree,
    },
)

b_spline_basis = np.asarray(b_spline_basis_dm)

with pm.Model() as b_spline_model:
    alpha = pm.Normal("alpha", mu=100, sigma=10)
    beta = pm.Normal("beta", mu=0, sigma=10, shape=b_spline_basis.shape[1])
    sigma = pm.Exponential("sigma", 1)
    mu = pm.Deterministic("mu", alpha + pm.math.dot(b_spline_basis, beta.T))
    likelihood = pm.Normal(
        "likelihood", mu=mu, sigma=sigma, observed=day_of_blossom
    )

    trace = pm.sample(draws=1000, tune=1000, chains=4)

Versions: name : pymc version : 5.9.1

name : numpy version : 1.25.2

Best htw

ajaya0 commented 8 months ago

can you explain more ??