MRChemSoft / mrchem

MultiResolution Chemistry
GNU Lesser General Public License v3.0
30 stars 21 forks source link

Feature/relax #451

Closed moritzgubler closed 9 months ago

moritzgubler commented 1 year ago

This is a draft of a geometry optimization implementation using the SQNM method. Variable cell shape optimization would also possible.

At the moment the geometry optimization input parameters are not parsed. Can someone help me with that?

I changed the main executable to always do a geometry optimization for testing purposes. To test the program, just compile it and run mrchem. A few parameters need to be set manualy:

world_prec = 1.0e-6
Properties {
  geometric_derivative = true          # Compute geometric derivative
}
SCF {
  path_orbitals = initial_guess              # Path to orbital files
  write_orbitals = true                # Save converged orbitals to file
  orbital_thrs = 1.0e-6
}

A tighter orbital_thrs reduces the noise in the forces significantly. The "path_orbitals" and "write_orbitals" need to be set like that because after the first geometry optimization iteration, I recycle the previous wave function as the initial guess for the next scf cycle to speed up convergence.

TODOLIST:

stigrj commented 1 year ago

Thanks a lot @moritzgubler, this looks really interesting! I will have a closer look later this week.

moritzgubler commented 1 year ago

That sounds great @stigrj.

stigrj commented 1 year ago

@moritzgubler I made a pull request on your fork with an updated input parser.

moritzgubler commented 1 year ago

Hey @stigrj I just acceptet your pull request and added some sentences to the documentation. There are some things that I can think of that we should do:

Is there something I forgot?

codecov[bot] commented 1 year ago

Codecov Report

Attention: 284 lines in your changes are missing coverage. Please review.

Comparison is base (c1a5213) 71.64% compared to head (0cafe48) 70.31%. Report is 1 commits behind head on master.

:exclamation: Current head 0cafe48 differs from pull request most recent head 0c1ddde. Consider uploading reports for the commit 0c1ddde to get more accurate results

Files Patch % Lines
src/vc_sqnm/mrchem_optimizer.hpp 0.00% 133 Missing :warning:
src/vc_sqnm/sqnm.hpp 0.00% 102 Missing :warning:
src/vc_sqnm/periodic_optimizer.hpp 0.00% 27 Missing :warning:
src/vc_sqnm/historylist.hpp 0.00% 20 Missing :warning:
src/mrchem.cpp 88.23% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #451 +/- ## ========================================== - Coverage 71.64% 70.31% -1.33% ========================================== Files 186 190 +4 Lines 15022 15308 +286 ========================================== + Hits 10762 10764 +2 - Misses 4260 4544 +284 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

stigrj commented 1 year ago

I will try to figure out a smart way to handle the output asap

moritzgubler commented 1 year ago

Do you have time to look at it soon?

ilfreddy commented 1 year ago

Vacation time right now in Norway, but I guess Stig will be able to have a look at it once he is back, or at least delegate it to some of us! 😃

moritzgubler commented 1 year ago

I implemented all your suggestions except for the printing and tested the code.

I have a question about the units. The parameters that will be set in my code are in Hartree/ Bohr and my code expects Hartree/ Bohr.

Some input parameters would change if the units were angstrom. For example the initial step size. Do you think it makes sense to change them?

robertodr commented 1 year ago

I implemented all your suggestions except for the printing and tested the code.

Thank you very much Moritz, especially for your patience! I can have a second review round in the weekend/Monday at the latest.

I have a question about the units. The parameters that will be set in my code are in Hartree/ Bohr and my code expects Hartree/ Bohr.

Some input parameters would change if the units were angstrom. For example the initial step size. Do you think it makes sense to change them?

I'm pretty sure that, at the level where you've interfaced your code, everything is guaranteed to be in atomic units already.

robertodr commented 1 year ago

I see that your branch is outdated with master and has accumulated some conflicts. Can you bring the branch up to date and fix the conflicts?

moritzgubler commented 1 year ago

The branch is up to date with master again.

moritzgubler commented 1 year ago

Hi @robertodr I just brought the branch up to date with master again. Do you have time for another round of review in the next days?

moritzgubler commented 1 year ago

Do you think of unit tests or integration tests? An integration test would be quite easy to set up but it would probably need a couple of minutes of runtime. A good test system there would be H_2O. Molecules with only two atoms are less optimal for testing because of all the symmetries (its just a one dimensional optimization problem).

moritzgubler commented 1 year ago

This converges in 20 minutes and 4 geometry optimization iterations on my machine. The tight world_prec is needed for accurate forces. To check if the geometry optimization worked, we could check the energy, the two bond lengths and the angle of the molecule. Would something like this be useful for testing?

world_unit = angstrom

WaveFunction {
  method = lda
}

Properties {
  geometric_derivative = true          # Compute geometric derivative
}

GeometryOptimizer{
        run = true
        use_previous_guess = True
        max_force_component = 1e-3
}

Molecule {
$coords
O          0.00000        0.00000        0.11779
H          0.00000        0.75545       -0.47116
H          0.00000       -0.75545       -0.47116
$end
}
robertodr commented 1 year ago

That is an expensive test to run 😞 In some of the integration tests, we don't run calculations to full precision, exactly to make sure that they run in a reasonable amount of time: for example, we only do one SCF cycle. Would it be possible to only do one optimization cycle and check the result of that? Would it be a reliable test?

Also, can you add a short page in the documentation showing how to run a geometry optimization? A new page is a restructuredtext (rst) plain-text document here: https://github.com/MRChemSoft/mrchem/tree/master/doc/users that you then reference here: https://github.com/MRChemSoft/mrchem/blob/master/doc/users/manual.rst

robertodr commented 1 year ago

Also, do you need to have:

Properties {
  geometric_derivative = true          # Compute geometric derivative
}

in the input? From reading the parser it seems you set it to true whenever GeometryOptimizer is active?

moritzgubler commented 1 year ago

This test is not as good, but it should also be ok:

world_prec = 1.0e-4
world_unit = angstrom

WaveFunction {
  method = lda
}

GeometryOptimizer{
        run = true
        use_previous_guess = True
        max_force_component = 1e-3
}

Molecule {
$coords
H 0.0 0.0 0.0
H 0.9 0.0 0.0
$end
}

This only takes 20 seconds. Is that still too long? When only doing 1 scf cycle, the forces will be really bad. I don't think, it really makes sense to test this. Only testing the first geometry optimization step is also not ideal, since the first iteration is much simpler than the second one. Only in the second step, I start estimating curvatures.

We can also think of using a force field just for testing the geometry optimizer.

moritzgubler commented 1 year ago

I have written the rst doc file, do you think it is OK like that?

robertodr commented 1 year ago

This test is not as good, but it should also be ok:

world_prec = 1.0e-4
world_unit = angstrom

WaveFunction {
  method = lda
}

GeometryOptimizer{
        run = true
        use_previous_guess = True
        max_force_component = 1e-3
}

Molecule {
$coords
H 0.0 0.0 0.0
H 0.9 0.0 0.0
$end
}

This only takes 20 seconds. Is that still too long? When only doing 1 scf cycle, the forces will be really bad. I don't think, it really makes sense to test this. Only testing the first geometry optimization step is also not ideal, since the first iteration is much simpler than the second one. Only in the second step, I start estimating curvatures.

We can also think of using a force field just for testing the geometry optimizer.

this would be acceptable :) Can you try with beryllium hydride (BeH2) not in a linear geometry? It's 6 electrons, so more expensive, but possibly still reasonable? The documentation looks good, you can check it here: https://mrchem--451.org.readthedocs.build/en/451/users/geometry_optimization.html (the link to the generated documentation for each PR appears in the list of checks) I'll edit it a bit because the TOC is picking up all section headers.

moritzgubler commented 1 year ago

This converges in 3 iterations and one minute. The ground state geometry seems to be linear. Do you think this is fine?

world_prec = 1.0e-4

WaveFunction {
  method = lda
}

GeometryOptimizer{
        run = true
        max_force_component = 1e-2
}

Molecule {
$coords
 Be       0.047274        0.032162       -0.275106
  H      -0.000377        2.549743       -0.704835
  H      -0.000807       -2.442007       -0.685253
$end
}
robertodr commented 1 year ago

Very fine! 🤩 The longest test we have, H2_polarizability_CUBE, takes ~120 seconds on GitHub, so something that runs in 1 minutes is quite good!

moritzgubler commented 1 year ago

Will you implement the test or should I do it?

robertodr commented 1 year ago

I can do it. Ping me Monday if it hasn't happened though :)

robertodr commented 1 year ago

I have fleshed out the test locally, but I can't seem to push to your repo...

Apart from this, I would like @stigrj to chime in when it comes to the output (both JSON and not) At least for the JSON, the format is significantly different with or without geometry optimization and we need to figure out a way to not break backwards compatibility.

moritzgubler commented 1 year ago

@stigrj @robertodr One possibility is that the json from the last SCF iteration is returned which will have the traditional scf format and write the json I return at the moment into a file to make sure that the information of the optimization trajectory is not lost. What do you think of that?

stigrj commented 10 months ago

@moritzgubler sorry for my ridiculous response time on this. I actually thought there would be much more for me to consider here, but it turns out you have sorted out most things already :slightly_smiling_face: Great work!

I think your last suggestion with a separate JSON file for the geometry iterations is quite acceptable, so if you can just make that final change we can get this PR merged. Then the regular JSON output file should contain the original input but the output of the final geometry iteration. In this way the file should be backwards compatible.

I have resolved the conflicts in the input parser with a rebase and added some minor changes to the printed output. I pushed the fixes to the feature/relax branch on my fork : https://github.com/stigrj/mrchem/tree/feature/relax, so you can continue from there if you want.

moritzgubler commented 10 months ago

@stigrj No worries:)

I may have an even better idea: I adjust the mrchem output json the way you mentioned to make sure it is backwards compatible. The geometry optimization trajectory is written as a .xyz file. That way it is is easy to visualize the converged geometry. If you also agree with this, I will implement it like that.

stigrj commented 10 months ago

Is there any standardized format for such an xyz-file that other programs use? Will it contain only the molecular geometry? I think it will be good to keep a full record of all the properties (energy contributions, gradients, etc), as is currently done in the non-backward-compatible json we have now, only dump it into a separate file. But if there is a standard xyz-format that allows for immediate visualization of the trajectory, you can of course add that as well :slightly_smiling_face:

moritzgubler commented 10 months ago

The .xyz format is pretty standardized and supported by standard visualization software. In that case, I will do both the json summary and the xyz output file.

moritzgubler commented 10 months ago

Hi @stigrj I polished the xyz output, removed the debug prints, renamed the xyz file and the json summary file such that they share the base name of the original output.

moritzgubler commented 9 months ago

@stigrj I think I adapted all the changes we recently talked about. I want to test one final thing regarding the xyz file output. I will contact you when I think the branch is ready to be merged.

moritzgubler commented 9 months ago

@stigrj The xyz writer is now also tested well and works. I have done and tested everything I wanted. If you also agree, we can merge it now.

stigrj commented 9 months ago

Great! Going in once the CI checks pass