This PR adds a line search procedure for the PGD step in the hybrid solver. There are two reasons to prefer this over estimating L directly. First, we rarely use L, so computing it may not be worthwhile when it is expensive to do so. Second, L is awkward to estimate when X is sparse and the model includes an intercept. As shown in #15, line search is also typically more effective on its own, although the results here do not always agree with that finding.
I think we should use line search exclusively and drop the code that computes L directly, but I'm not sure everyone else would agree.
I initially considered fancier alternatives such as the Barzilai-Borwein (BB) rule, but I don't think they make sense here.
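
For reference, here is a minimal sketch of what a backtracking line search for a proximal gradient step can look like. It is not the code in this PR: it assumes a least-squares loss with an L1 penalty, treats L as the Lipschitz constant of the gradient, and every name (`soft_threshold`, `pgd_step_backtracking`, `lam`, `shrink`) is hypothetical. The step size is shrunk until the usual quadratic upper bound holds, so L never has to be computed.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (illustrative penalty choice)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def pgd_step_backtracking(X, y, beta, lam, step=1.0, shrink=0.5, max_backtracks=50):
    """One proximal gradient step with backtracking on the step size.

    Assumes f(beta) = 0.5 * ||X @ beta - y||^2 and an L1 penalty; these
    choices and all names are hypothetical, not taken from the PR.
    """
    residual = X @ beta - y
    f_old = 0.5 * residual @ residual
    grad = X.T @ residual

    for _ in range(max_backtracks):
        # Gradient step on the smooth part, then the prox of the penalty.
        beta_new = soft_threshold(beta - step * grad, step * lam)
        diff = beta_new - beta
        residual_new = X @ beta_new - y
        f_new = 0.5 * residual_new @ residual_new
        # Accept once the quadratic upper bound with curvature 1/step holds.
        if f_new <= f_old + grad @ diff + (diff @ diff) / (2.0 * step):
            break
        step *= shrink

    return beta_new, step
```

Returning the accepted step lets the caller warm-start the next search from it, which keeps the number of backtracks small when PGD steps are taken only occasionally.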