theochem / procrustes

Python library for finding the optimal transformation(s) that makes two matrices as close as possible to each other.
https://procrustes.qcdevs.org/
GNU General Public License v3.0
109 stars 20 forks source link

PSDP: Woodgate algorithm #169

Closed banrovegrie closed 2 years ago

banrovegrie commented 2 years ago

This is wrt #98.

I am working within a subfolder, once done I will delete the folder and make everything streamlined. Right now, the code within PSDP isn't inheriting or using anything outside of the subfolder.

banrovegrie commented 2 years ago

The added code shows the rough structure of the algorithm. I finally understood the proof and why steepest descent doesn't work.

What's left?

banrovegrie commented 2 years ago

@FanwangM, I think we can meet on Tuesday this week and the implementation will be complete by then 😛

FanwangM commented 2 years ago

Thanks for the hard work on this. @banrovegrie

A few comments on this.

  1. Using your master branch to add new features is not a good practice as we will lose track of the upstream and the git history will not align with the upstream. Therefore, we often create a new branch for adding new codes and we will update the master branch once the upstream is updated.
  2. You can of course do experiments with a folder, which is exactly the way of understanding the algorithm. But we don't put these into the pull request, as this makes the git history dirty. You can find examples of codes in https://github.com/theochem/procrustes/blob/master/procrustes/symmetric.py
  3. I would guess that input.py functions as a test, but we don't put test files like this. An example can be found at https://github.com/theochem/procrustes/blob/master/procrustes/test/test_symmetric.py and you should create a new test file named test_psdp.py in the test folder.
  4. There are some coding style complaints at https://github.com/theochem/procrustes/actions/runs/2561364968 and they should be fixed. Maybe using PyCharm for coding is a good way.

I think you can close this PR and create a new one. You can need to

Please refer to the code review for other comments.

FanwangM commented 2 years ago

@FanwangM, I think we can meet on Tuesday this week and the implementation will be complete by then 😛

Can you send us an email every Monday morning following the email thread we have been using to inform us if you will need to meet with professors or just GSoC mentors? This can help the professors arrange the time in a more timely manner. Thank you.

Ali-Tehrani commented 2 years ago

Hi Arjo, Thanks for the pull-request and work. I won't be able to come to the meeting today, but I have some suggestions for the implementation.

To solve Eqn 26, I would think to convert via vectorizationD_i L(E) to the following image and then solve it as a system of linear equations using numpy. They already set vectorization for you in Eqn 27, and talked a bit on vectorization preceding and on Eqn 9. Note they define vectorization via stacking of rows whereas wikipedia defines it as stacking of columns and so it is something to be aware, see Notes section in wikipedia.

For solving for omega, a good place to start would be to use scipy optimize, where you use the bounds attribute to make sure the weights are all positive.

Finally, for tests I would recommend to start with small matrices and letting F=I be the identitfy matrix and G be a positive definite matrix. To generate a positive definite matrix, you could just start with a diagonal matrix with positive entries. The solution P would be a diagonal matrix with the 1/x of the positive entries.

FanwangM commented 2 years ago

Hi Arjo. Can you upload your new codes to GitHub so that we can see the latest changes? Thanks. @banrovegrie

banrovegrie commented 2 years ago

Hey, I will close this PR and open a new one. The algorithm is complete. Solving eq 26 and 27 were much easier than I thought. Also, as @Ali-Tehrani mentioned I used scipy.optimize to find the minimizer for $f(E_i - \omega_i D_i)$.