openopt / copt

A Python library for mathematical optimization
http://openo.pt/copt
Other
135 stars 35 forks source link

(H)omotopy CGM #95

Open gideonite opened 3 years ago

gideonite commented 3 years ago

This is a first step towards implementing Homotopy CGMs into copt.

Based closely on the existing Frank-Wolfe implementation, this incorporates additional linear constraints into the picture via homotopy smoothing.

In this commit there are element-wise and row-wise constraints.

The next step is to provide randomized versions of this algorithm.

gideonite commented 3 years ago

I should add some unit tests for this as well, but I thought that now might be a good time to get feedback. I can convert this to a github draft if you like.

coveralls commented 3 years ago

Coverage Status

Coverage increased (+0.03%) to 88.76% when pulling 583846a0e37f06331d1a08d2cf9f88659f595e52 on homotopy_pr into e5cb995065e26acd543ef13a330351eb1c381bdd on master.

gideonite commented 3 years ago

Note to self: are there places where I could make the code faster? Specifically, but using numba / utils.njit? There aren't any tight loops where I am going over an array. The basic Frank-Wolfe implementation doesn't have numba stuff so maybe we can't either, but I can help thinking that there might be a way to accelerated the main for-loop of the algorithm.

gideonite commented 3 years ago

Super, thanks you! I'll do these things and update.

gideonite commented 3 years ago

Rather than this clumsy use_eigs option that was there before, I replaced it with a new lmo, TraceSpectrahedron, which optimizes over the psd matrices with bounded trace norm. This is actually what I originally wanted but I was a bit confused.

Other than that, I added the comments, changed the names, etc. as you requested. @fabianp

Still to do:

  1. [unit tests] I don't need feedback on this (yet). I just need to do it.
  2. [figure out how dataset downloading should work] I'm not sure how to proceed or whether you have an opinion. I can add the dataset to the google storage but I need access. Or, I download directly from the source. The task is clustering so I'm not sure how much of the _load_dataset function should be reused. @fabianp
  3. Minor point. I was a bit optimistic and thought that I could make a clear single commit PR. I'm not sure if you care about the history, but let me know and I'll squash and open a new PR with a single commit. Or, perhaps there's a better way?
fabianp commented 3 years ago

No worries about the history, it's fine if there are multiple commits

fabianp commented 3 years ago

another option for the dataset is to use something more standard that already includes mnist like tf-datasets, https://www.tensorflow.org/datasets/overview

gideonite commented 3 years ago

The only problem with that is that the dataset that we are using are the outputs of a softmax. We could train the simple model dynamically when the dataset is downloaded or download directly what was used in [MVW16]. For now, I decided to download it directly.

Ok, I think it's finally ready for merge! What do you think? @fabianp

[MVW16] D. G. Mixon, S. Villar, and R. Ward, “Clustering subgaussian mixtures by semidefinite programming,” May 2016. http://arxiv.org/abs/1602.06612, GitHub Repo: https://github.com/solevillar/kmeans_sdp

gideonite commented 3 years ago

Fixed a small bug and added a unit test to test for it next time. Apparently, this unit test also improves the code coverage above the threshold for coveralls (finally).

fabianp commented 3 years ago

thanks @gideonite l it all looks good to me.

@GeoffNN do you mind taking a look a this ? Thanks

GeoffNN commented 3 years ago

I'll look at it ASAP!

gideonite commented 3 years ago

Does anyone else have feedback? This has been in purgatory for a while and I'd like to get it merged. @GeoffNN @fabianp

GeoffNN commented 2 years ago

@gideonite quick ping on this, if you want to wrap it up!

gideonite commented 2 years ago

Hi @GeoffNN, sorry for the delay. Been caught up with other things and this has fallen by the wayside. I will need some more time to get back to this but if there's still interest in getting this merged, I can probably easily address most of these issues.

GeoffNN commented 2 years ago

The remaining issues should be straightforward / it would also be fine to leave TODO items for a subsequent push :)