dswah / pyGAM

[HELP REQUESTED] Generalized Additive Models in Python
https://pygam.readthedocs.io
Apache License 2.0
862 stars 159 forks source link

What's needed to add prediction intervals for GammaGAM? #229

Open AlexEngelhardt opened 5 years ago

AlexEngelhardt commented 5 years ago

I'm using a GammaGAM for forecasting, and I could really use prediction_intervals like they are available in LinearGAM. I understand it's a bit more complicated, and I could give it a shot, but I don't know what's necessary to implement them.

It seems there was an issue #75 already that talked about prediction intervals for the Poisson distribution, but that has been closed (unsolved, right?)

Thanks in advance for any hints!

dswah commented 5 years ago

hey @AlexEngelhardt ! sorry for the slowwww response.

the way to do this for the general case it to use the empirical sampling approach (there's some discussion here)

but here's a quick recap:

the idea is to fit a few models on bootstrapped replicates of the dataset, then generate new data from the fitted models, and compute the percentiles of those data.


import numpy as np
from matplotlib import pyplot as plt

from pygam import PoissonGAM, s, te
from pygam.datasets import chicago

X, y = chicago()
gam = PoissonGAM(s(0, n_splines=200) + te(3, 1) + s(2)).fit(X, y)

# sample 500 new predictions by fitting 5 models on bootstraps of the data
samples = gam.sample(X, y, quantity='y', n_draws=500, sample_at_X=X, n_bootstraps=5)

# now compute percentiles of the sampled data
q = [2.5, 97.5]
percentiles = np.percentile(samples, q=q, axis=0).T

# and plot
plt.plot(y[:500])
plt.plot(percentiles[:500])

image

please let me know how this works for you...

dswah commented 5 years ago

@AlexEngelhardt did this work for you?

AlexEngelhardt commented 5 years ago

Hi Daniel,

sorry from me too - I took a long time to respond :)

Your code worked perfectly. Do you consider this just a workaround or do you think this would fit into pyGAM as well? If yes, I can put it in and submit a pull request.

dswah commented 5 years ago

@alexengelhardt im very glad it worked for you.

this is the correct way to compute prediction intervals for non-gaussian distributions.

it definitely belongs in the prediction_intervals method for non-gaussian models.

i would love to get a pull request for this!!

-dani

dswah commented 5 years ago

it should be easy to close https://github.com/dswah/pyGAM/issues/156 with this method.