szquchen / MSA-2.0

Fitting potential energy surface using monomial symmetrization approach

Future plans #4

Open TiborGY opened 2 years ago

TiborGY commented 2 years ago

I am writing this to inform you about future PRs that I am planning to make, and to make sure my proposed changes do not clash with the project, before I start writing code. Comments are welcome.

We have private forks of the old MSA project (https://github.com/Kee-Wang/PES-Fitting-MSA), where @vtajti has made some changes to code generation, making energy evaluation more computationally efficient, and we also have a variant that generates C++ code instead of Fortran. Note that these are purely code generation changes, i.e. msa.cpp is unchanged.

I am aiming to upstream these improvements and extensions into this repo while preserving the existing algorithm and functionality, especially since our new code cannot yet compute analytical gradients or use them during fitting.

My current idea on how to do that:

I plan to work on this as a side project, so no firm timeline. Comments are welcome.

szquchen commented 2 years ago

Thank you for all the contributions to the MSA project.

Your plan sounds good. Just a few comments

We also have some news about our recent development of a faster approach to evaluating the gradient (this will not affect the fitting of the PES, but predicting gradients with the PES will be much more efficient). Currently we rely on Mathematica to help generate the Fortran code for the efficient gradient prediction, but we would eventually add this to msa.py. Our changes are mainly about prediction, so I don't think they will conflict with your plan.

TiborGY commented 2 years ago

Brief update: beyond the changes you can see in PR #5, I have not spent much time on msa.py yet. Instead, I have mostly been experimenting with different ways to modify the code generator, to make energy evaluation even more efficient.

These are very preliminary results, but I ran some tests on a 5th-order fit of an X6Y2Z1 system. The Fortran code produced by the generator currently available in this repo takes ~414 µs per emsav call, whereas my current best C++ code takes only ~63 µs per call, which is faster by a factor of ~6.5. The new code also takes much less time to compile and runs much better on a wider range of CPUs; for example, the speedups are even larger on AMD CPUs.

Note that this speedup is achieved without any pruning or fragmentation: none of the monomials or polynomials are removed, and the results are identical to within numerical precision.
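One way to check a claim like "identical to within numerical precision" is to evaluate both code paths on the same geometries and compare the energies. The following is only a minimal sketch: the function names `max_rel_diff` and `agree` and the default tolerance of 1e-12 are assumptions made for illustration, not part of MSA or of the generated code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest deviation between two energy lists.
// Each difference is divided by max(|ref|, |test|, 1.0), so values near
// zero are compared absolutely and large values relatively.
double max_rel_diff(const std::vector<double>& ref,
                    const std::vector<double>& test) {
    double worst = 0.0;
    const std::size_t n = std::min(ref.size(), test.size());
    for (std::size_t i = 0; i < n; ++i) {
        const double scale =
            std::max({std::fabs(ref[i]), std::fabs(test[i]), 1.0});
        worst = std::max(worst, std::fabs(ref[i] - test[i]) / scale);
    }
    return worst;
}

// Hypothetical helper: true if both lists have the same length and every
// entry agrees to within tol (the tolerance is an assumed value).
bool agree(const std::vector<double>& ref,
           const std::vector<double>& test,
           double tol = 1e-12) {
    assert(tol >= 0.0);
    return ref.size() == test.size() && max_rel_diff(ref, test) <= tol;
}
```

Here `ref` would hold energies from the existing Fortran evaluation and `test` those from the new C++ code, computed for the same set of geometries. An appropriate tolerance depends on the size of the fit and the energy units.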

I am still looking for ways to improve performance, but after I am done, I will try to continue the work on msa.py, and upload the new code generator, plus the utility I have written to measure execution speed.
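For anyone who wants to collect per-call timings of this kind in the meantime, a rough sketch of a timing loop is shown below. It is not the utility mentioned above. `emsav_stub` is a hypothetical placeholder that only does some arithmetic, and `mean_us_per_call` is an assumed helper name. In real use the stub would be replaced by a call into the generated energy routine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a generated energy routine; NOT the real emsav.
// It exists only so that the timer has some work to measure.
// Assumes x is non-empty.
double emsav_stub(const std::vector<double>& x,
                  const std::vector<double>& coeff) {
    double e = 0.0;
    for (std::size_t i = 0; i < coeff.size(); ++i) {
        e += coeff[i] * x[i % x.size()];
    }
    return e;
}

// Mean wall-clock time per call of f, in microseconds, over n_calls calls.
template <typename F>
double mean_us_per_call(F&& f, std::size_t n_calls) {
    assert(n_calls > 0);
    volatile double sink = 0.0;  // stops the compiler from dropping the calls
    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n_calls; ++i) {
        sink = sink + f();
    }
    const auto t1 = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = t1 - t0;
    return elapsed.count() / static_cast<double>(n_calls);
}
```

A call such as `mean_us_per_call([&] { return emsav_stub(x, coeff); }, 100000)` returns the average time per call. Averaging over many calls and discarding a few warm-up runs helps reduce noise from caches and CPU frequency scaling.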