stigrj commented 2 years ago

[x] Incorporate solvent effects into the computed initial energy
[x] Update test references to match the new solvent initial energy
[x] Define new user input structure for PCM parameters
[x] Adapt mrchem.x backend to accept new program input structure
[x] Implement $spheres parser that mimic G16 logic

stigrj commented 2 years ago

Current input parameters

Molecule {
$coords
Li 0.0 0.0 0.0
$end
}

Environment {
  run_environment = true/false
  kain = 5 
  max_iter = 100
  algorithm = scrf
  density_type = total/nuclear/electronic
  extrapolate_Vr = true/false
  convergence_criterion = static/dynamic
  Permittivity {
    epsilon_in = 1.0
    epsilon_out = 2.0
    formulation = exponential
  }
  Cavity {
    cavity_width = 0.5
$spheres
0.0 0.0 0.0 4.0
$end
  }
}

stigrj commented 2 years ago

Suggestion 1: Nested sections

Add keyword environment = none/PCM to Molecule section (could also be added to WaveFunction section alongside the new relativity = none/ZORA/nZORA keyword, but I think this is more like a definition of our system, i.e. molecule + environment, rather than a specification of the method, like DFT + ZORA)
Rename section Environment -> PCM
Remove keyword run_environment, replaced by Molecule.environment = PCM
Remove keyword algorithm = scrf, can be easily reintroduced when/if we ever implement an alternative to SCRF
Add section PCM.SCRF for optimization parameters
Rename keyword extrapolate_Vr = true/false -> optimizer = density/potential (not sure about the naming here, but the old name was not very descriptive)
Rename keyword convergence_criterion = static/dynamic -> dynamic_thrs = true/false
Rename keyword cavity_width -> width

Molecule {
  environment = pcm/none
$coords
Li 0.0 0.0 0.0
$end
}

PCM {
  SCRF {
    kain = 5 
    max_iter = 100
    optimizer = potential/density
    density_type = total/nuclear/electronic
    dynamic_thrs = true/false
  }
  Cavity {
    width = 0.5
$spheres
0.0 0.0 0.0 4.0
$end
  }
  Permittivity {
    epsilon_in = 1.0
    epsilon_out = 2.0
    formulation = exponential
  }
}

stigrj commented 2 years ago

Suggestion 2: Flat sections

Add keyword environment = none/PCM to Molecule section (could also be added to WaveFunction section alongside the new relativity = none/ZORA/nZORA keyword, but I think this is more like a definition of our system, i.e. molecule + environment, rather than a specification of the method, like DFT + ZORA)
Rename section Environment -> PCM
Remove keyword run_environment, replaced by Molecule.environment = PCM
Remove keyword algorithm = scrf, can be easily reintroduced when/if we ever implement an alternative to SCRF
Rename keyword extrapolate_Vr = true/false -> optimizer = density/potential (not sure about the naming here, but the old name was not very descriptive)
Rename keyword convergence_criterion = static/dynamic -> dynamic_thrs = true/false
Remove section Cavity
Rename keyword spheres -> cavity_spheres
Remove section Permittivity

Molecule {
  environment = pcm/none
$coords
Li 0.0 0.0 0.0
$end
}

PCM {
  kain = 5 
  max_iter = 100
  optimizer = potential/density
  density_type = total/nuclear/electronic
  dynamic_thrs = true
  epsilon_in = 1.0
  epsilon_out = 2.0
  formulation = exponential
  cavity_width = 0.5
$cavity_spheres
0.0 0.0 0.0 4.0
$end
}

stigrj commented 2 years ago

Comments/suggestions very welcome

robertodr commented 2 years ago

I like your suggestion 1!

environment should be under Wavefunction, as it defines a new term in the Hamiltonian.
Under cavity, we should add keywords for the alpha and beta parameters mentioned in the paper: $$R{i} = \alpha{i}R^{\mathrm{vdW}}_{i} + \beta \sigma$$
- The width should be renamed sigma
- The scaling of the van der Waals radii can be a vector, where each atom gets its own scaling. I am not sure the parser can accommodate a keyword that is either float or List[float]. If not, let's just make it a scalar: one can always set the radii explicitly and set alpha=1.0.

stigrj commented 2 years ago

How is $\alpha$ and $\beta$ handled now? Is it manually incorporated into $R_i$ and $\sigma$?

robertodr commented 2 years ago

I believe so.

stigrj commented 2 years ago

Should perhaps the SCRF threshold be set explicitly as a float, with negative value triggering the dynamic behavior?

stigrj commented 2 years ago

In the case of explicitly given $spheres, should the radii still be mapped through the $R_i \leftarrow \alpha R_i + \beta \sigma$ formula, or should they be used as is?

robertodr commented 2 years ago

Should perhaps the SCRF threshold be set explicitly as a float, with negative value triggering the dynamic behavior?

Not a big fan of sentinel values, but as long as this is documented, it's OK. What I am more worried about is if it makes sense at all to have a static threshold different from the global one.

robertodr commented 2 years ago

In the case of explicitly given $spheres, should the radii still be mapped through the Ri←αRi+βσ formula, or should they be used as is?

I have to say that I don't remember. The PCMSolver way of doing things is:

Implicit mode: centers are taken from the molecular geometry, radii from the built-in table of van der Waals radii. $\alpha$, $\beta$ and $\sigma$ are applied.
Atoms mode: centers are taken from the molecular geometry, radii from the built-in table of van der Waals radii. Two lists of atom numbers and radii are given to substitute the default radius of atom $n$ with the given value. $\alpha$, $\beta$ and $\sigma$ are applied.
Explicit mode: centers and radii are taken from $spheres. $\alpha$, $\beta$ and $\sigma$ are not applied: it's up to the user to fully specify the radii.

stigrj commented 2 years ago

I have to say that I don't remember. The PCMSolver way of doing things is:

* Implicit mode: centers are taken from the molecular geometry, radii from the built-in table of van der Waals radii. α, β and σ are applied.

* Atoms mode: centers are taken from the molecular geometry, radii from the built-in table of van der Waals radii. Two lists of atom numbers and radii are given to substitute the default radius of atom n with the given value. α, β and σ are applied.

* Explicit mode: centers and radii are taken from `$spheres`. α, β and σ are not applied: it's up to the user to fully specify the radii.

Then I think an ok compromise is to combine the latter two and allow to explicitly list all spheres with a radius, and then always apply $\alpha$, $\beta$ and $\sigma$. It's easy enough to set $\alpha=1$ and $\beta=0$ to obtain "explicit mode" behavior, albeit a bit cumbersome if you only want to change a single radius from default and then are forced to set all of them...

robertodr commented 2 years ago

Indeed, that's exactly why we added "Atoms" and "Explicit" mode in PCMSolver 😄

stigrj commented 2 years ago

How is "atoms mode" used? If you give the two lists [6] [1.0], will the 6th atom in the molecular geometry be set to R=1.0, or will all carbon atoms in the molecule be changed?

robertodr commented 2 years ago

The former: the ordinal number of the atom to substitute radius for (List[int]) together with the radii (List[float]).

stigrj commented 2 years ago

How about this?

- name: Cavity
  docstring: |
    Define the interlocking spheres cavity.
  keywords:
    - name: mode
      type: str
      default: atoms
      predicates:
        - value.lower() in ['atoms', 'explicit']
      docstring: |
        Determines how to set up the interlocking spheres cavity.
        ``atoms``: centers are taken from the molecular geometry, radii taken from tabulated data (van der Waals
        radius), and rescaled using the parameters ``alpha``, ``beta`` and ``sigma`` (R_i <- alpha*R_i + beta*sigma).
        Exceptions to default radii can be given through the ``atoms_index`` and ``atoms_radius`` lists.
        ``explicit``: centers and radii given explicitly in the ``explicit_spheres` block, no rescaling applied.
    - name: atoms_index
      type: List[int]
      default: []
      docstring: |
        List of atoms with non-default (vdW) radius. New radius given by the corresponding entry in ``atoms_radius``.
        Index corresponds to position in the molecular geometry, e.g. with [2,3] you intend to change the radius only
        for the 2nd and 3rd atom in the molecule.
    - name: atoms_radius
      type: List[float]
      default: []
      predicates:
        - len(value) == len(user['PCM']['Cavity']['atoms_index'])
      docstring: |
        List of non-default (vdW) radii. Maps to the corresponding entry in ``atoms_index`.
        The units used are the same as specified with the ``world_unit`` keyword.
        Note that the radii are rescaled before use (R_i <- alpha*R_i + beta*sigma),
        so if you want to explicitly set the final radius to a certain value, use
        alpha=1 and beta=0.
    - name: explicit_spheres
      type: str
      default: ''
      docstring: |
        Coordinates and radii of the spheres in ``explicit`` mode, given as
        $explicit_spheres
        x_0    y_0    z_0    R_0
        ...
        x_N    y_N    z_N    R_N
        $end
        The units used are the same as specified with the ``world_unit`` keyword. Note that these
        radii are *not* rescaled before use (R_i <- alpha*R_i + beta*sigma), but are used as is.
    - name: alpha
      type: float
      default: 1.1
      docstring: |
        Scaling factor on the radius term for the cavity rescaling (R_i <- alpha*R_i + beta*sigma).
        Only used in `atoms` mode, not if ``explicit_spheres`` are given.
    - name: beta
      type: float
      default: 0.5
      docstring: |
        Scaling factor on the boundary width term for the cavity rescaling (R_i <- alpha*R_i + beta*sigma).
        Only used in `atoms` mode, not if ``explicit_spheres`` are given.
    - name: sigma
      type: float
      default: 0.2
      docstring: |
        Width of cavity boundary, smaller value means sharper boundary.

robertodr commented 2 years ago

Yes, that looks good to me.

I've looked again at how G16 does things and maybe we can use something similar (https://gaussian.com/scrf/ click on "Additional Input" tab)

Default: take centers from geometry, radii from vdW table. $\alpha$, $\beta$, $\sigma$ have default values, but can be blanket-changed from input.
G16's ModifySph is Atoms mode, but if we set it as a $-$end section, then we gain the ability to have per-atom $\alpha$/$\beta$/$\sigma$
G16's ExtraSph works like default, but gives the possibility off adding spheres not centered on atoms.
G16's NSph works like Explicit mode.

2022-07-13_15-46

I'm guessing the following is a first approximation of this behavior:

- name: Cavity
  docstring: |
    Define the interlocking spheres cavity.
  keywords:
    - name: mode
      type: str
      default: atoms
      predicates:
        - value.lower() in ['atoms', 'explicit']
      docstring: |
        Determines how to set up the interlocking spheres cavity.
        ``atoms``: centers are taken from the molecular geometry, radii taken from tabulated data (van der Waals
        radius), and rescaled using the parameters ``alpha``, ``beta`` and ``sigma`` (R_i <- alpha*R_i + beta*sigma).
        Extra spheres can be added using the `$spheres` section with ``x y z R [alpha] [beta] [sigma]`` syntax.
        ``explicit``: centers and radii given explicitly in the ``spheres`` block, no rescaling applied.
    - name: spheres
      type: str
      default: ''
      docstring: |
        This input parameter affects the list of spheres used to generate the cavity.
        In ``atoms`` mode, you can give 
        $spheres
        i R [alpha] [beta] [sigma]
        $end
        to specify that the ``i`` atom in the molecule should use radius ``R`` instead of the pre-tabulated vdW radius. 
        ``alpha``, ``beta``, and ``sigma`` can also be modified, but it's not mandatory to specify them in the list.
        In ``atoms`` it's also valid to give the list of spheres as:
        $spheres
        x y z R [alpha] [beta] [sigma]
        $end
        to add a sphere to the list.
        In ``explicit`` mode the same syntax for this list: 
        $spheres
        x y z R [alpha] [beta] [sigma]
        $end
        will explicitly create the list of spheres.
        The units used are the same as specified with the ``world_unit`` keyword. Note that these
        radii are *not* rescaled before use (R_i <- alpha*R_i + beta*sigma), but are used as is.
    - name: alpha
      type: float
      default: 1.1
      docstring: |
        Scaling factor on the radius term for the cavity rescaling (R_i <- alpha*R_i + beta*sigma).
        Only used in `atoms` mode, not if ``explicit_spheres`` are given.
    - name: beta
      type: float
      default: 0.5
      docstring: |
        Scaling factor on the boundary width term for the cavity rescaling (R_i <- alpha*R_i + beta*sigma).
        Only used in `atoms` mode, not if ``explicit_spheres`` are given.
    - name: sigma
      type: float
      default: 0.2
      docstring: |
        Width of cavity boundary, smaller value means sharper boundary.

It will be a bit hairy to parse a $-$end section with optional values. Also making sure that the "blanket" application of $\alpha$, $\beta$ and $\sigma$ does not interfere when they're explicitly given will be a bit of a pain.

stigrj commented 2 years ago

Ok, this is nice, but then I think the rescaling of the radius should always be applied to make it a bit more consistent and predictable. Since alpha, beta and sigma can potentially be given in the $spheres block, then it should also be rescaled by the default amount if not given explicitly.

stigrj commented 2 years ago

In G16 the default is alpha=1.1 in all cases, no?

robertodr commented 2 years ago

In G16 the default is alpha=1.1 in all cases, no?

Yes, it is, but it's not applied to the spheres given with ModifySph, ExtraSph or NSph.

robertodr commented 2 years ago

I think I need to make some modifications to parselglossy for reading $-$end sections. Namely, it should report how many lines and how many elements in each line.

stigrj commented 2 years ago

This should now be ready to accept the new logic for the $spheres parser. What needs to be done is to replace the current get_cavity_in_program_syntax() in validators.py. This should produce the final parameters for the cavity spheres in the following format:

"cavity": {
  "spheres": array[                      # Array of cavity spheres
    {                                    # (one entry per sphere)
      "center": array[float],            # Cartesian coordinate of sphere center
      "radius": float                    # Radius of cavity sphere
    }
  ],
  "sigma": float                         # Width of cavity boundary
}

Here the radii should already have been transformed as $R_i \leftarrow \alpha R_i + \beta \sigma$ based on the logic discussed above. The current form still requires a global $\sigma$ parameter, but that shouldn't be too hard to generalize.

stigrj commented 2 years ago

Leaving for vacation tomorrow, so I won't have time to implement the $spheres parser :beach_umbrella:

robertodr commented 2 years ago

Leaving for vacation tomorrow, so I won't have time to implement the $spheres parser beach_umbrella

Enjoy your vacation 🌞 I'll try to get the parser working (and on second thought, it doesn't require a change in parselglossy)

codecov[bot] commented 2 years ago

Codecov Report

Merging #416 (9cc3e70) into master (b7ee88c) will increase coverage by 0.17%. The diff coverage is 89.43%.

@@            Coverage Diff             @@
##           master     #416      +/-   ##
==========================================
+ Coverage   68.20%   68.38%   +0.17%     
==========================================
  Files         183      183              
  Lines       15453    15448       -5     
==========================================
+ Hits        10540    10564      +24     
+ Misses       4913     4884      -29

Impacted Files	Coverage Δ
src/chemistry/Molecule.h	`86.20% <ø> (+2.87%)`	:arrow_up:
src/environment/Permittivity.h	`100.00% <ø> (ø)`
src/qmoperators/two_electron/ReactionOperator.h	`88.88% <ø> (-11.12%)`	:arrow_down:
src/qmoperators/two_electron/ReactionPotential.cpp	`100.00% <ø> (ø)`
src/qmoperators/two_electron/ReactionPotential.h	`80.00% <ø> (-20.00%)`	:arrow_down:
src/environment/Cavity.cpp	`73.33% <66.66%> (-2.18%)`	:arrow_down:
src/chemistry/Molecule.cpp	`66.91% <100.00%> (+14.25%)`	:arrow_up:
src/driver.cpp	`70.94% <100.00%> (+0.45%)`	:arrow_up:
src/environment/Cavity.h	`100.00% <100.00%> (ø)`
src/environment/Permittivity.cpp	`100.00% <100.00%> (ø)`
... and 9 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Gabrielgerez commented 2 years ago

This should now be ready to accept the new logic for the $spheres parser. What needs to be done is to replace the current get_cavity_in_program_syntax() in validators.py. This should produce the final parameters for the cavity spheres in the following format:
"cavity": {
  "spheres": array[                      # Array of cavity spheres
    {                                    # (one entry per sphere)
      "center": array[float],            # Cartesian coordinate of sphere center
      "radius": float                    # Radius of cavity sphere
    }
  ],
  "sigma": float                         # Width of cavity boundary
}
Here the radii should already have been transformed as Ri←αRi+βσ based on the logic discussed above. The current form still requires a global σ parameter, but that shouldn't be too hard to generalize.

I like this, but it seems like it hides a lot about the real input. Wouldn't it be better to show all information about each sphere in the dictionary entries? specially since we are giving the flexibility of custom values per cavity something like this maybe:

"spheres": array[            # Array of cavity spheres
  {                          # (one entry per sphere)
    "center": array[float],  # Cartesian coordinate of sphere center     
    "radius": float,         # Radius of cavity sphere
    "alpha": float,          # scaling value for radius
    "beta": float,           # scaling value for radius wrt. sigma
    "sigma": float           # Width of cavity boundary
  }
]

although this might be a bit too verbose

robertodr commented 2 years ago

I agree with Gabriel: the final computation of the radius should happen C-side, after parsing the input. That way we also get to print out this stuff in the output.

robertodr commented 2 years ago

I think I have it working, but I can't push to this branch 😿 @stigrj can you grant me permissions?

robertodr commented 2 years ago

Each sphere "object" in the JSON now contains:
- Position in Cartesian coordinates $x{i}, y{i}, z_{i}$
- Unscaled radius $R_{i}^{\mathrm{vdW}}$
- Radius scaling factor $\alpha_{i}$
- Width scaling factor $\beta_{i}$
- Width $\sigma_{i}$
The radius used will be computed as $R{i} = \alpha{i}R{i}^{\mathrm{vdW}} + \beta{i}\sigma_{i}$.
In atoms mode, we first create the list of spheres from the atomic coordinates and built-in radii, alpha (1.1), beta (0.5), and sigma (0.2) values. We then read the $spheres/$end blob of text and amend this list. Spheres can be given in two syntaxes:
- substitution: index radius [alpha] [beta] [sigma], and
- addition: x y z radius [alpha] [beta] [sigma]
These lines are parsed by two regexes. The "substitution" regex (p in the code) has higher priority: it is run and post-processed first. Note that the index in "substitution" syntax starts from 1, but it's easy to go to 0-based indexing.
In explicit mode, we start from an empty list of spheres. We then read the $spheres/$end blob of text and amend this list. Spheres can be given in two syntaxes:
- centered: index radius [alpha] [beta] [sigma], and
- free: x y z radius [alpha] [beta] [sigma]
These lines are parsed by the same two regexes, with same priority and ordering considerations, as above.
"Centered" and "substituted" spheres can be aware of their parent atom (I believe this needs some change in the C++ code though!) which means we can track their contribution to the molecular gradient. "Added" and "free" spheres are always treated as not atom-centered, so they will never contribute to the molecular gradient.
After discussions with @ilfreddy, I decided that it's less surprising for the user if the default values of alpha, beta, and sigma are always applied. If special behavior is desired, then it needs to be requested explicitly in the input.

robertodr commented 2 years ago

The printing of the cavity in the output looks like this now:

===========================================================================
                             Solvation Cavity
---------------------------------------------------------------------------
 Formulation             :                                     exponential
 Dielectric constant     :            (in)                        1.000000
                         :           (out)                        2.000000
---------------------------------------------------------------------------
    N      Radius         R_0       Alpha        Beta       Sigma        :               x               y               z
---------------------------------------------------------------------------
    0      4.0000      4.0000        1.00        0.00        0.50        :        0.000000        0.000000        0.000000
===========================================================================

a bit wonky, but I'm working on it.

robertodr commented 2 years ago

I'm pretty much done here, so @Gabrielgerez can start reviewing. @ilfreddy can you review the logic for setting radii and various scalings?

I didn't manage to prettify the output:

I think the printout needs a larger width overall. Is that possible?

I also think that the printout should show which atom the sphere belongs to, if any. Something like:

---------------------------------------------------------------------------
N   Atom  Radius      R_0    Alpha     Beta    Sigma           :               x               y               z
---------------------------------------------------------------------------
0        Li    4.0000   4.0000     1.00     0.00     0.50           :        0.000000        0.000000        0.000000
1     N/A    4.0000   4.0000     1.00     0.00     0.50           :        1.000000        1.000000       -1.000000

where the first sphere in on Lithium, while the second was added explicitly. I guess the cavity printout needs to go back in the Molecule class though, right?

stigrj commented 2 years ago

You can manually adjust the output width with the input keyword

Printer {
  print_width = 80
}

The default is now 75.

For the pretty print, perhaps separate into one list of atom spheres

---------------------------------------------------------------
    N   Atom         R_0    Alpha     Beta    Sigma   Radius
---------------------------------------------------------------
    0     Li      4.0000     1.00     0.00     0.50   4.0000
---------------------------------------------------------------

and one list of free spheres?

---------------------------------------------------------------
    N             x            y            z        Radius
---------------------------------------------------------------
    0      0.000000     0.000000     0.000000        4.0000
---------------------------------------------------------------

robertodr commented 2 years ago

I think it's better to keep one list of spheres.

stigrj commented 2 years ago

Output now looks like this:

===========================================================================
                             Solvation Cavity
---------------------------------------------------------------------------
 Formulation             :                                     exponential
 Dielectric constant     :            (in)                        1.000000
                         :           (out)                        2.000000
---------------------------------------------------------------------------
    N      R_0  Alpha Beta Sigma      Radius         x         y         z
---------------------------------------------------------------------------
    0   2.0500  1.10  0.50  0.20  ->  2.3550  0.000000  0.000000  0.000000
    1   4.0000  1.10  1.00  0.50  ->  4.9000  0.000000  0.000000  0.000000
===========================================================================

Where we have one default atoms sphere plus one explicit given in the $spheres block. No atomic symbol is printed as it would require some refactoring on the c++ side, but I think that's fine for now.

robertodr commented 2 years ago

Where we have one default atoms sphere plus one explicit given in the $spheres block. No atomic symbol is printed as it would require some refactoring on the c++ side, but I think that's fine for now.

Yeah, a C++ refactoring is needed. I am OK with postponing it as it is only strictly required for gradients.

stigrj commented 2 years ago

Are we waiting for feedback from @ilfreddy on the logic?

robertodr commented 2 years ago

Not really.

On Fri, Aug 12, 2022, at 18:45, Stig Rune Jensen wrote:

Are we waiting for feedback from @ilfreddy https://github.com/ilfreddy on the logic?

— Reply to this email directly, view it on GitHub https://github.com/MRChemSoft/mrchem/pull/416#issuecomment-1213316008, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA4JOEM22NN5H22EZJIIPKDVYZ5R7ANCNFSM53BDR5XA. You are receiving this because you commented.Message ID: @.***>

MRChemSoft / mrchem

Update solvent related input #416

Current input parameters

Suggestion 1: Nested sections

Suggestion 2: Flat sections

Codecov Report