dswah / pyGAM

[HELP REQUESTED] Generalized Additive Models in Python
https://pygam.readthedocs.io
Apache License 2.0
875 stars 160 forks

Order of spline #207

Closed dibyoghosh closed 6 years ago

dibyoghosh commented 6 years ago

In earlier versions of pyGAM I was able to set the spline_order parameter, which specifies the order of the spline to fit. That way I could choose a linear spline, a cubic spline, or a higher order.

  1. Is there still a provision to specify the order of the spline?

  2. Does the model output (summary or otherwise) provide the information on the order of the spline which is used to fit the model?

daventero commented 6 years ago

You can still set spline_order: pass it as an argument to s() when building each term.

I have tested it and it works perfectly for me.

from pygam import LinearGAM, s, f
from pygam.datasets import wage

X, y = wage()
gam = LinearGAM(s(0, spline_order=5) + s(1, spline_order=5) + f(2)).fit(X, y)

As for 2, the answer is yes and no. As far as I know, you cannot currently read a term's spline order (where it applies) from the summary, but you could do something like this:

for term_i in gam.get_params()['terms']:
    print(term_i, "\t", term_i.spline_order if hasattr(term_i, "spline_order") else None)

# spline_term            5
# spline_term            5
# factor_term            0
# intercept_term     None

dibyoghosh commented 6 years ago

Exactly what I was looking for. This helps. Thanks.

dswah commented 6 years ago

@dibyoghosh how might we document the spline_order api so that it is easier for users to discover?

also, @daventero, thanks for your great answer.

thanks -dani

dibyoghosh commented 6 years ago

For my usage, I would only need it exposed as part of the summary. Maybe a column next to Rank that gives the order of the spline for each term. I observed that the spline_order of each term can be different, just like n_splines. So in the final output, it would help if each term in gam.summary is listed with both the Rank (n_splines) and the Order (spline_order).
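Until something like that exists in the summary, the per-term listing described here could be approximated with a small helper. This is only a sketch: term_orders, format_term_orders and DemoTerm are hypothetical names, not part of pyGAM, and the demo uses stand-in objects so it runs without pyGAM installed. It assumes only that each term may or may not carry n_splines and spline_order attributes, as in the loop earlier in this thread.

```python
def term_orders(terms):
    """Return one (label, n_splines, spline_order) row per term.

    Attributes that a term does not define are reported as None.
    """
    return [
        (str(term),
         getattr(term, "n_splines", None),
         getattr(term, "spline_order", None))
        for term in terms
    ]


def format_term_orders(rows):
    """Render the rows as an aligned plain-text table."""
    header = ("Term", "Rank (n_splines)", "Order (spline_order)")
    table = [header] + [tuple(str(value) for value in row) for row in rows]
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    )


class DemoTerm:
    """Hypothetical stand-in for a pyGAM term, used only for this demo."""

    def __init__(self, label, **attrs):
        self.label = label
        self.__dict__.update(attrs)

    def __str__(self):
        return self.label


demo_terms = [
    DemoTerm("s(0)", n_splines=20, spline_order=5),
    DemoTerm("s(1)", n_splines=20, spline_order=5),
    DemoTerm("intercept"),  # no spline attributes at all
]

print(format_term_orders(term_orders(demo_terms)))
```

With a fitted model, the same helper could presumably be fed the terms from the earlier snippet, i.e. term_orders(gam.get_params()['terms']); which attributes each term type actually exposes may differ between pyGAM versions.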