theochem / procrustes

Python library for finding the optimal transformation(s) that makes two matrices as close as possible to each other.
https://procrustes.qcdevs.org/
GNU General Public License v3.0
114 stars 21 forks source link

L-1 procrustes #115

Open choltz95 opened 3 years ago

choltz95 commented 3 years ago

Hi, Thanks for the project! I'm interested in solving procrustes using a more robust metric of dispersion than F norm.

I saw a couple papers online that solve e.g. smooth surrogates of l1, but wondering if you have any pointers in particular.

FanwangM commented 3 years ago

Thanks for the nice words!

We don't have any implementation to support l1 norm-based Procrustes (I am not familiar with that either). But we may add this as a feature in the future depending on other project progress and time avaliability. I found some interesting papers,

  1. https://link.springer.com/article/10.1007%2Fs10851-008-0077-2
  2. https://www.sciencedirect.com/science/article/pii/S0167739X03000438
  3. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8378175

Are you looking for the implementation of l1 norm-based Procrustes ? @choltz95

choltz95 commented 3 years ago

Thank you so much for the resources! I really appreciate it.

I guess that second paper is the most general, but doesn't guarantee optimality.

I am happy to make a pull request some time if you think this would be relevant to your project.

Main pieces for projected gradient are in section 3.3 & Eq. 14 of this paper (same author as your #2.)

FanwangM commented 3 years ago

Wow! This is going to be a nice improvement!! Thanks. Please let me know when you need a code review or any other problems. I will tag this issue as an enhancement.

PaulWAyers commented 2 years ago

@choltz95 were you going to tackle this issue?

choltz95 commented 2 years ago

Hi yes - sorry for the delayed response. Still potentially interested. I ended up doing iteratively reweighted least squares to solve the l1 problem for my application since I did not really need a high quality solution.

Let me take another look at the papers originally posted in this thread and get back to you.

choltz95 commented 2 years ago

Hey - just started working on a prototype. I implemented a preliminary version of PGD for the smoothened l1 problem (huber) from the Trendafilov & Watson paper in Jax here.

I saw that the convergence is mediocre and needs many iterations on even some very simple testcases. I tried some simple things like gridding over geometric step lengths and early stopping. Did you find any other papers that discuss nicer solutions to this problem? There could also be some bugs.

When I initialize using solution to l-2 procrustes, result looks a little more reasonable.

I am thinking of trying a projected admm-type algorithm to solve the problem directly very quickly. I had some success on a similar problem. I feel that neither is ideal since the other methods in your library all have nice solutions, but it seems there is no way to get around iterating or potentially bad solutions for this one.

FanwangM commented 2 years ago

Thank you for your efforts! I have tried your Colab codes, but it seems that the implementation is giving the strange behavior with missing value in foc and all the error values are nan. Can you check why is that?

Screenshot from 2022-08-28 18-24-55

I think for the implementation, maybe a good check is to reproduce the results listed in the paper you mentioned. I will also do more literature reading to find if we can get better algorithms on this problem.

@choltz95

choltz95 commented 2 years ago

Sounds good. Yeah- those plots don't look right to me. Here is the result from my run:

image

Likely there is some issue with the implementation of the descent direction since the foc (eq 15, left plot) should ideally go to zero monotonically. I'll get back to you when I have some time to debug. I agree next step should be reproduce their result.

choltz95 commented 2 years ago

Hi, sorry for the late response. Quick follow-up. Took another look at my implementation. I think it may be possible that the implementation is okay - I reproduced the concrete numerical experiments in Example 2 of the paper to get similar error, even though the foc is nonzero. I think the issue may simply be that projected gradient method may not be ideal for this kind of problem with orthogonality constraint.

I am thinking this: since there is already some api support for weighted least squares procrustes (with weights on the rows of A / the "samples"), it is probably easier to do something naive like iterative reweighting: https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares. This way we remain within the procrustes api and require minimal effort while supporting some notion of robust / l1-procrustes. I think there could be some problems like convergence is usually pretty slow, solution won't change much from least squares, but maybe it's better then projected gradient.

Let me know if you would like me to proceed!

FanwangM commented 2 years ago

Looks like a good feature to have. Please go ahead and make a pull request. Can you tag me when you are done? Thank you for making our package grow! @choltz95