adolgert / Glissa.jl

Spline functions for unequal coordinates in pure Julia
MIT License
0 stars 0 forks source link

penalized splines? #7

Open slwu89 opened 1 year ago

slwu89 commented 1 year ago

hi @adolgert!

How hard would it be to introduce penalty matrices (penalizing, for example second derivatives) to these splines? Basically, if $X$ is the model matrix of the spline, $y$ the data, $\beta$ the spline coefficients, $\lambda$ a scalar parameter controlling the amount of smoothing, and $S$ a penalty matrix, we'd want to solve a penalized least squares problem of the form:

$$ \lVert y - X\beta \rVert^{2} + \lambda \beta^{\top} S \beta $$

I have an example here (https://gist.github.com/slwu89/50f81e6bbc296261e29851d3585ae554) of using JuMP to solve this kind of penalized least squares problem using a penalized cubic spline where the $X$ and $S$ matrices are produced by the "mgcv" R package, but it would be nice for the whole thing to be in Julia.

adolgert commented 1 year ago

I think we could make an API to pull out the model matrix and spline coefficients. That should certainly be possible. This repository follows Schumaker's book on Spline Functions: Basic Theory, 3rd ed, and I think that has descriptions of penalized splines. I'll take a look. If you have more you'd like to see, please add on. I read this weekend how to make autodiff work, so I'm working on that part now.

slwu89 commented 1 year ago

Based off section 5.3.1 of Simon Wood's GAM book and peeking at the mgcv source I implemented the code to make the penalized cubic splines.

The code is here: https://gist.github.com/slwu89/62729fe92e26e3bebf22f0fa46dbc575

The function penalty_crgenerates the $S$ penalty matrix for a cubic spline using as input the $B$, $D$ matrices defined in Table 5.1 of that book. Function basis_cr makes matrices $B,D,F$ using as input xk, a vector of knot locations. Function model_matrix_cr makes a model matrix $X$ using xk knot points, $F$ basis matrix, and optionally x a set of points to evaluate (if nothing, just use xk).

The matrices produced are the same as produced by mgcv. However I haven't figured out how to use JuMP correctly to fit the penalized spline yet, see the results in the gist, which differ from what happens when using the pcls (https://stat.ethz.ch/R-manual/R-devel/library/mgcv/html/pcls.html) function in mgcv, which is used to do the actual fitting. I need to find out why, or if you have some hints on how to use JuMP better to do that penalized constrained least square problem, I'd appreciate it!

slwu89 commented 1 year ago

Update, still haven't figured out why I can't get JuMP to do what I want it to do. Instead by reparameterizing the penalized least squares problem as an unpenalized least squares problem and solving with OLS, I can get the same results in Julis as with pcls in R (gold standard). We still need to figure out how to set this up as a JuMP problem for the "full feature set" where various linear constraints are needed, which cannot be simply incorporated into a reparameterization.

Let the square root symbol denote a matrix square root such that $\sqrt{S}^{\top}\sqrt{S} = S$

$$ \lVert \begin{bmatrix} y \ 0 \end{bmatrix} - \begin{bmatrix} X \ \sqrt{\lambda} \sqrt{S} \end{bmatrix} \beta \rVert^{2} = \lVert y - X\beta \rVert^{2} + \lambda \beta^{\top} S \beta $$

EDIT github latex not rendering bmatrix correctly, the equation is https://latex.codecogs.com/gif.image?\dpi{110}\lVert&space;\begin{bmatrix}&space;y&space;\\&space;0&space;\end{bmatrix}&space;-&space;\begin{bmatrix}&space;X&space;\\&space;\sqrt{\lambda}&space;\sqrt{S}&space;\end{bmatrix}&space;\beta&space;\rVert^{2}&space;=&space;\lVert&space;y&space;-&space;X\beta&space;\rVert^{2}&space;+&space;\lambda&space;\beta^{\top}&space;S&space;\beta

# estimation as a unpenalized least squares problem
using GLM
DD = sqrt(S)
yy = [y; repeat([0], size(DD,1))]
XX = vcat(X, sqrt(λ)*DD)
beta_lm = coef(lm(XX, yy))

Gives the same result as pcls now (red is the real function, the 2 wiggly ones are right on top of each other): plot_9

slwu89 commented 1 year ago

@adolgert the reason I couldn't get the same results using JuMP was completely silly. My $W$ diagonal matrix of weights was the wrong dimension. Now everything works. I think this is a good proof of concept now of setting up a penalized cubic spline in Julia and fitting with JuMP.

adolgert commented 1 year ago

Fantastic. I can get to a copy of that book through the library. I need to figure out how to get the model matrix correctly, and we should be on our way.

slwu89 commented 1 year ago

Great! I can run you though what I have so far on Friday.