dswah / pyGAM

[HELP REQUESTED] Generalized Additive Models in Python
https://pygam.readthedocs.io
Apache License 2.0
857 stars, 157 forks

Understanding predict_mu #259

Closed amw5g closed 3 years ago

amw5g commented 4 years ago

I'm trying to replicate the predict_mu function for a LinearGAM outside of Python. Specifically, I have the values from coef_ and a corresponding vector of candidate input values. I would expect to multiply the two, sum them, and get the predicted value, but something is going wrong. My coef_ looks like this (rearranged so that each row represents a feature, and multiple values in a row are that feature's spline coefficients):

| feature | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | Col7 | Col8 | Col9 | Col10 | Col11 | Col12 | Col13 | Col14 | Col15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.663752 | 0.208064 | 0.101053 | 0.405164 | 0.521302 | 0.38334 | 0.145932 | 0.527738 | 0.374967 | -0.48859 | -0.30235 | 0.16712 | -0.05632 | -0.07733 | 0.164027 |
| 1 | 0.42112 | 0.596672 | 0.772241 | 0.947837 |
| 2 | -0.10411 | 0.074421 | 0.778614 | 1.98895 |
| 3 | 0.63753 | 0.672153 | 0.702262 | 0.725925 |
| 4 | 0.356707 | 0.465944 | 0.74609 | 1.169129 |
| 5 | 0.321534 |
| 6 | -0.55888 |
| 7 | 0.51603 | 0.628271 | 0.740579 | 0.852991 |
| 8 | 0.01015 |
| 9 | 0.835706 | 0.677995 | 0.602635 | 0.621535 |
| 10 | 0.535754 | 0.634896 | 0.734039 | 0.833181 |
| 11 | 0.768789 | 0.729764 | 0.664955 | 0.574363 |
| 12 | 0.716694 | 0.695241 | 0.673739 | 0.65219 |
| 13 | 1.030868 | 0.799866 | 0.568991 | 0.338147 |
| 14 | 0.681974 | 0.681629 | 0.684924 | 0.689344 |
| 15 | 0.593 | 0.651409 | 0.713672 | 0.779789 |
| 16 | 0.625485 | 0.664807 | 0.704128 | 0.74345 |
| 17 | 0.243205 | 0.529593 | 0.823353 | 1.14172 |
| intercept | 2.73787 |

And my vector of inputs looks like this:

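| feature | input |
|---|---|
| 0 | 52 |
| 1 | 161 |
| 2 | 109856 |
| 3 | 6.058527 |
| 4 | 127 |
| 5 | 1 |
| 6 | 0 |
| 7 | 222.5667 |
| 8 | 12.42475 |
| 9 | 233 |
| 10 | 0 |
| 11 | 0 |
| 12 | 1 |
| 13 | 2 |
| 14 | 1 |
| 15 | 1 |
| 16 | 0 |
| 17 | 2 |
| intercept | 1 |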

Multiplying each input by its coefficients and summing yields 302988, whereas predict_mu yields 12.97.

What step am I missing?

shyamcody commented 4 years ago

@amw5g Look into the `_linear_predictor` method in pygam. You are not creating the model matrix as per the code. The coefficients apply to the model matrix, not to the raw inputs: first build the model matrix from the B-spline (or whatever basis you used to fit the coefficients) for each feature, then take the dot product of the model matrix with the coefficients.
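To make that concrete, here is a rough pure-Python sketch of the idea for a single spline feature. It is not pyGAM's actual code: the helper names `uniform_knots` and `bspline_basis` are made up for illustration, the knots are assumed to be uniformly spaced, and the domain edges `LO` and `HI` are hypothetical (the real ones come from the range of the training data and are not given in this thread). The coefficients are the feature 1 row from the table above.

```python
def uniform_knots(lo, hi, n_splines, degree):
    """Uniformly spaced knot vector, extended past both edges (assumed layout)."""
    step = (hi - lo) / (n_splines - degree)
    return [lo + (j - degree) * step for j in range(n_splines + degree + 1)]


def bspline_basis(x, knots, degree):
    """All B-spline basis values at scalar x via the Cox-de Boor recursion."""
    # Degree 0: indicator of the half-open knot interval that contains x.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        raised = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            raised.append(left + right)
        basis = raised
    return basis


# Feature 1 from the issue: four spline coefficients, raw input 161.
coefs = [0.42112, 0.596672, 0.772241, 0.947837]
x = 161.0
LO, HI = 0.0, 200.0  # hypothetical domain edges, NOT taken from the issue
DEGREE = 3           # cubic splines (assumed)

knots = uniform_knots(LO, HI, n_splines=len(coefs), degree=DEGREE)
basis = bspline_basis(x, knots, DEGREE)  # one row block of the model matrix

# This feature's share of the linear predictor: basis values dot coefficients.
contribution = sum(b * c for b, c in zip(basis, coefs))

# What the question computed instead: raw input times every coefficient.
naive = sum(x * c for c in coefs)

print("basis row:", basis)
print("contribution:", contribution, "naive:", naive)
```

The basis values are non-negative and sum to 1 inside the domain, so a spline feature's contribution is a weighted average of its coefficients and can never be in the hundreds of thousands. Summing such contributions over all features, plus the intercept, gives the linear predictor; for a LinearGAM the link is the identity, so that sum is the predicted mean. To reproduce pyGAM's numbers exactly you would also need the same knot placement and domain edges that the fitted model used.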