Implement constrained optimisation

elvijs commented 3 years ago

Extend HierarchicalLasso (from https://github.com/elvijs/hierarchical_lasso/issues/2) to support constraints.

The objective is as in section 3.1 of https://arxiv.org/abs/2001.07778. The implementation is expected to differ.

Blocked by #2 and having my hands on Hugo's R implementation - this is key for checking the correctness of the new implementation.

Outcomes:

HierarchicalLasso that fully implements the strong version of https://arxiv.org/abs/2001.07778 as discussed in section 3.1.
A notebook convincing us that the new implementation is equivalent to Hugo's R implementation.

elvijs commented 3 years ago

Current thinking on the implementation:

Sample randomly from the unit sphere.
If the sampled point is feasible, do constrained gradient descent from there. Use Scipy's constrained optimisation algorithms. Make them pluggable, so that it's easy for us to test different optimisers against each other. They should be implemented as part of https://github.com/elvijs/hierarchical_lasso/issues/2.
Stop after N feasible starting points attempted.

Note:

We should monitor where each gradient descent ends up. Points in the same feasible region should end up in the same local optimum. This will give us an idea of how many 0-disconnected feasible subsets there are.
We might want to explore ways of checking if a new candidate point is in a feasible region that has already been explored. Perhaps draw a straight line between the points, pick k equidistant points on it and check if each of them is feasible in its own right? This makes use of the "polyhedron cone" shape of the feasible region.

HugoMaruri commented 3 years ago

Hi Elvijs. Nice work, a couple of comments. Depending on constraints, random sampling from sphere may take a while before it produces a point in the feasible region. Also, how to move between regions is an important point.

HugoMaruri commented 3 years ago

(Also I may need some guidance until I can find my way around github ;) )

elvijs commented 3 years ago

Hey, @HugoMaruri welcome to GitHub! Let me know if you'd like a tour over its key features :)

Yes, good point on the random sampling potentially taking time. I'd still like to start with that as it's super easy to implement and we can then improve as needed. However, shout if you disagree!

Note that I've read your PDF on the R code and am writing up a response. Expect an email today or tomorrow :)

Separately, I actually have a conjecture that the feasible region consists of precisely two 0-disconnected sets. This is because each row in the constraint matrix defines a "polyhedron cone" with some "free dimensions" (that is, some directions are ignored). It feels like intersecting two such cones cannot introduce "holes" or additional disjoints. That is, of course, not a proof.

I'm thinking of trying to disprove this hypothesis via the random sampling of starting points for the gradient descent. The set of distinct termination points for the different gradient descents is equal to the number of disjoint feasible regions. If we consistently see that there are 2 distinct termination points regardless of the number of randomly drawn feasible starting points, then we might want to look into whether we can prove this.

elvijs commented 3 years ago

Update: as @HugoMaruri points out, the conjecture above is false. One way of seeing it is via the visualisation in https://chart-studio.plotly.com/~elvijs/17/#/plot.

elvijs / hierarchical_lasso

Implement constrained optimisation #1