This PR adds support for fitting intercepts for the three solvers we are currently pursuing. It's a work in progress because:
1) I haven't dealt with the Anderson updates for PGD. How does this work for the intercept, @mathurinm? Do we just treat it as any other coefficient?
2) I'm not sure it's most practical to keep the intercept as a separate scalar for the PGD solvers once we merge the line search PR (#15). Maybe it would be better to store it at the front or end of the vector of coefficients instead. What do you think?
3) Computing the Lipschitz constant is a little awkward right now because I am prepending a dense column of ones to the X matrix, which I guess creates a copy of the entire design matrix and could be somewhat inefficient if the matrix is very sparse (and n is reasonably large).
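
Regarding 3), one possible way to avoid the copy is sketched below. It is not what the PR currently does: it assumes the Lipschitz constant is the squared spectral norm of `[1, X]` (divide by `n` if the loss is averaged over samples), and the function name `lipschitz_with_intercept` is made up for illustration. The idea is to expose the products with `[1, X]` and its transpose through a SciPy `LinearOperator`, so the augmented matrix is never formed.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, svds


def lipschitz_with_intercept(X):
    """Squared spectral norm of [1, X], computed without forming [1, X].

    Hypothetical helper: assumes an unscaled least-squares loss.
    """
    n, p = X.shape
    ones = np.ones(n)

    def matvec(v):
        # [1, X] @ v = v[0] * 1 + X @ v[1:]
        v = np.asarray(v).ravel()
        return v[0] * ones + X @ v[1:]

    def rmatvec(u):
        # [1, X].T @ u = [sum(u), X.T @ u]
        u = np.asarray(u).ravel()
        return np.concatenate(([u.sum()], X.T @ u))

    A = LinearOperator((n, p + 1), matvec=matvec, rmatvec=rmatvec, dtype=float)
    # Largest singular value only; singular vectors are not needed.
    sigma_max = svds(A, k=1, return_singular_vectors=False)[0]
    return sigma_max**2


# Small sparse example; X itself is never copied or densified by the helper.
X = sparse.random(200, 50, density=0.05, format="csr", random_state=0)
L = lipschitz_with_intercept(X)
```

This only needs matrix-vector products with `X` and `X.T`, so the cost per iteration stays proportional to the number of nonzeros in `X` plus `n`.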
Anyway, it seems to work okay: