dswah / pyGAM

[HELP REQUESTED] Generalized Additive Models in Python
https://pygam.readthedocs.io
Apache License 2.0

Help writing a programmatic TermList #258

Closed amw5g closed 4 years ago

amw5g commented 4 years ago

Howdy!
I'm iterating through competing GAM models and need some help writing more flexible TermLists. I have a list of candidate effects:

eff = ['x1', 'x2', 'x3']

And for each I have a known term:

terms = ['LinearTerm', 'SplineTerm(n_splines=15)', "FactorTerm(coding='dummy')"]

Right now, I have to construct the terms explicitly; e.g.,

gam = LinearGAM(l(0) + s(1, n_splines=15) + f(2, coding='dummy'))

When I want to permute the terms to, say, keep only terms 0 and 2, I have to rewrite the expression by hand. I'd love to construct an iterator, or better, a dictionary of combinations, and then pass each combination to a GAM() call.

I've looked at pygam.terms.TermList.build_from_info(), but can't figure out the right syntax.
Any help would be greatly appreciated!

amw5g commented 4 years ago

I've made some progress. The trick was that I had been treating a TermList like, well, a list, and doing something like this:

term_list = TermList()
term_list.append(term1)

TermList has no attribute append, so that failed.

What I've got working is to build an ordered dictionary:

from collections import OrderedDict
vars_dict = OrderedDict({
    'x1': {'term': 's', 'n_splines': 15},
    'x2': {'term': 's', 'n_splines': 15},
    'x3': {'term': 'f'},
    'x4': {'term': 'l'},
    'x5': {'term': 'l'}
})

and then enumerate over the dictionary keys to build the term list using the + operator:

from pygam.terms import TermList, SplineTerm, LinearTerm, FactorTerm

term_list = TermList()
for i, v in enumerate(vars_dict.keys()):
    if vars_dict[v]['term'] == 's':
        term = SplineTerm(i, n_splines=vars_dict[v].get('n_splines', 14))  # fall back on 14 if no count was passed
    elif vars_dict[v]['term'] == 'l':
        term = LinearTerm(i)
    elif vars_dict[v]['term'] == 'f':
        term = FactorTerm(i, coding='dummy')  # I only ever dummy code, so I can do this
    else:
        raise ValueError(f"unknown term type for {v}: {vars_dict[v]['term']}")
    term_list += term  # +, not append() or extend()
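
The if/elif chain above could also be written as a lookup table, which makes it easier to add term types later. The following is only a sketch: build_term_list, constructors and defaults are hypothetical names, and the function deliberately does not import pygam, so it relies only on each constructor accepting a column index plus keyword arguments and on the results supporting the + operator.

```python
from functools import reduce
import operator


def build_term_list(specs, constructors, defaults=None):
    """Build one term per variable and combine them with `+`.

    specs        : {name: {'term': code, ...extra keyword arguments}}
    constructors : {code: callable(column_index, **kwargs)}  (assumed interface)
    defaults     : {code: keyword arguments used unless the spec overrides them}
    """
    defaults = defaults or {}
    terms = []
    # The column index is the position of the variable within `specs`.
    for i, spec in enumerate(specs.values()):
        code = spec['term']
        if code not in constructors:
            raise ValueError(f"unknown term type: {code!r}")
        # Everything in the spec other than 'term' is passed through as a keyword argument.
        overrides = {k: v for k, v in spec.items() if k != 'term'}
        kwargs = {**defaults.get(code, {}), **overrides}
        terms.append(constructors[code](i, **kwargs))
    # Fold the terms together pairwise with `+`, left to right.
    return reduce(operator.add, terms)
```

With pygam, constructors would presumably be {'s': SplineTerm, 'l': LinearTerm, 'f': FactorTerm} and defaults something like {'s': {'n_splines': 14}, 'f': {'coding': 'dummy'}}, which reproduces the fallbacks used in the loop.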

Now I can use term_list in a GAM call, e.g.,

from pygam import LinearGAM

gam1 = LinearGAM(
    term_list,  # look at my boy shine!
    verbose=True,
    max_iter=200,
).gridsearch(X.loc[:, list(vars_dict.keys())].values, y)
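
For the original goal of iterating over competing models, one possible approach is to enumerate subsets of the spec dictionary with itertools.combinations. This is a rough sketch and not something pyGAM provides: candidate_subsets and min_size are hypothetical names, and the spec table simply mirrors vars_dict from above (a plain dict keeps insertion order on Python 3.7+).

```python
from itertools import combinations

# Spec table mirroring vars_dict from the post.
vars_dict = {
    'x1': {'term': 's', 'n_splines': 15},
    'x2': {'term': 's', 'n_splines': 15},
    'x3': {'term': 'f'},
    'x4': {'term': 'l'},
    'x5': {'term': 'l'},
}


def candidate_subsets(specs, min_size=1):
    """Yield every subset of the variables as a dict, keeping the original order."""
    names = list(specs)
    for size in range(min_size, len(names) + 1):
        # combinations() emits names in the order they appear in `names`.
        for combo in combinations(names, size):
            yield {name: specs[name] for name in combo}


subsets = list(candidate_subsets(vars_dict))
print(len(subsets))  # 31 non-empty subsets of 5 variables
```

Each subset can then go through the same term-building loop and be fitted against X.loc[:, list(subset)].values. Because the loop numbers columns with enumerate, the term indices line up with the sliced columns. The number of candidates doubles with every added variable, so min_size or some other filter is likely worth having.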