aoterodelaroza / critic2

Analysis of quantum chemical interactions in molecules and solids.
Other
97 stars 35 forks source link

Request for machine-readable output #28

Closed mkhorton closed 3 years ago

mkhorton commented 4 years ago

Hi @aoterodelaroza, I've written a parser for the critic2 standard output so that we can store the results of an analysis in our database. Unfortunately a recent change in output format broke the parser -- this is fine and not a problem, but I was wondering if you'd thought about providing a machine-readable output like JSON in addition to the standard out?

A machine readable output as JSON could look something like:

{
    unique_critical_points: [
        {
         'field': 93848.0413,
         'field_gradient': 0.0,
         'field_hessian': [[-2593274446000.0, -3.873587547e-19, -1.704530713e-08],
                           [-3.873587547e-19, -2593274446000.0, 1.386877485e-18],
                           [-1.704530713e-08, 1.386877485e-18, -2593274446000.0]],
         'frac_coords': [0.333333, 0.666667, 0.213295],
         'index': 0,
         'multiplicity': 1.0,
         'point_group': 'D3h',
         'type': "nucleus"
         },
         { ... }
     ],
     all_critical_points: [ ... ],
     critical_point_connections: [ ... ]
}
aoterodelaroza commented 4 years ago

Sure, that shouldn't be very hard to do. I'll try to get it done by the end of the week.

mkhorton commented 4 years ago

Fantastic, thank you :-) Happy to beta test.

aoterodelaroza commented 4 years ago

This is implemented now. You can do CPREPORT bleh.json to write a JSON file that contains the structure, the details for the reference field, and the CP list for that field, with all the information. Let me know how it goes (and sorry for the wait).

mkhorton commented 4 years ago

Excellent, will test ASAP and report back any issues.

bjoe2k4 commented 4 years ago

a few nits:

mkhorton commented 4 years ago

Ellipticity is straight forward enough to calculate if you have the Hessian / Hessian eigenvalues (which are reported), so I'm personally not as concerned about that.

The graph info is important however (if not the path itself, at least the connectivity information, i.e. which ncps are connected to a given bcp).

Forgive the basic question, but what do the repulsor/attractor properties mean?

samblau commented 4 years ago

including YT info (if YT is run) in the JSON would also be greatly appreciated

aoterodelaroza commented 4 years ago

Sorry for the wait. Let's see:

bjoe2k4 commented 4 years ago

This was the first test of the json cpreport on elemental beryllium from elk (repulsors present, signature wrong for nuclear cps):

{
      "id": 1,
      "name": "Be",
      "fractional_coordinates": [0.16666667000000,0.16666667000000,0.25000000000000],
      "cartesian_coordinates": [1.07964780442629,0.62333495046670,1.69328910125000],
      "multiplicity": 2,
      "point_group": "D3h",
      "rank": 3,
      "signature": -1,
      "field": 3.55518737889062E+01,
      "field_valence": 3.55518737889062E+01,
      "gradient": [0.00000000000000E+00,0.00000000000000E+00,0.00000000000000E+00],
      "hessian": [[-9.71529893156452E+11,-2.00367926788645E-04,-8.38056666004389E-09],[-7.99803206535014E-05,-9.71529893156453E+11,-1.85118959031637E-06],[-8.38056666004389E-09,-1.85118959031637E-06,1.93805965463654E+12]],
      "gradient_norm": 0.00000000000000E+00,
      "gradient_norm_valence": 0.00000000000000E+00,
      "laplacian": -5.00013167636694E+09,
      "laplacian_valence": -5.00013167636694E+09,
      "hessian_eigenvalues": [-9.71529893156453E+11,-9.71529893156452E+11,1.93805965463654E+12],
      "number_of_pointprops": 0,
      "repulsor_angle": 0.00000000000000E+00,
      "repulsor_eigenvec": [0.00000000000000,0.00000000000000,0.00000000000000],
      "repulsors": [{
        "cell_id": "-1025312208",
        "lvec": [-1025312252,32764,0],
        "distance": 0.00000000000000E+00,
        "path_length": 0.00000000000000E+00
      },{
        "cell_id": "32764",
        "lvec": [0,0,0],
        "distance": 0.00000000000000E+00,
        "path_length": 0.00000000000000E+00
      }],
      "is_nucleus": true
    }

The normal output seems ok:

# ncp   pg  type   CPname         position (cryst. coords.)

       mult  name            f             |grad|           lap
  1    D3h (3,-3) nucleus   0.16666667   0.16666667   0.25000000    2     Be      3.55518738E+01  0.00000000E+00 -5.00013168E+09
aoterodelaroza commented 4 years ago

Yeah, that's also a bug. I'll fix that too. Stay tuned.

mkhorton commented 4 years ago

Regarding the YT, I think adding a JSON output to the YT/BADER keyword sounds cleanest / would make the most sense, rather than mixing information between different keywords. Thank you for the clarification on repulsors/attractors! I wasn't familiar with that naming.

aoterodelaroza commented 4 years ago

No worries! Okay, I've pushed a commit to move the connectivity info from the non-equivalent CP list to the complete CP list in the JSON file. I had also mixed up attractors (for bcps) and repulsors (for rcps) so those are correct now. The combination of cell_id and lvec is basically the contents of the Complete CP list, bcp and rcp connectivity table in the output. The distance and path_length are the {r1,r2} and {p1,p2} in Analysis of system bonds and Analysis of system rings.

aoterodelaroza commented 4 years ago

I believe the problem with the rank and signature at the nuclei is fixed now.

aoterodelaroza commented 4 years ago

The JSON output for YT/BADER is done. You can now do:

YT JSON bleh.json

to write a JSON file containing the results of the integration. Works also for BADER.

mkhorton commented 4 years ago

Hi @aoterodelaroza, I had two questions for clarification:

  1. The standard output seems to be in Ångström, while the JSON seems to be in Bohr. Is this always the case? It might be useful to include a top-level units keyword in the output to avoid any ambiguity.

  2. When running on a molecule from a Gaussian Cube file vs say a crystal CHGCAR, my interpretation is that the analysis is essentially the same (minus the periodicity), and that the co-ordinates reported for the molecule are the co-ordinates relative to the Gaussian Cube origin/axes. I saw there were centering vectors reported also. Would these centering vectors just need to be subtracted from the nuclei co-ordinates to get back the original molecule atomic co-ordinates?

aoterodelaroza commented 4 years ago

The JSON file is basically a dump of critic's internal variables. Units are always atomic units, but I've added a "units" key in the JSON output as you suggested, just in case.

Regarding molecules: the analysis is the same but there are a few details to consider, (briefly) explained here. If you use MOLECULE to read the geometry, critic simulates the periodicity using a "molecular cell". Internally, however, it is still periodic. The output units are angstrom and referred to the molecular origin but bohr and the cube origin are used internally (and in the JSON file). The way to recover the coordinates of the original molecule is by doing molecule_centering_vector + cartesian_coordinates.

mkhorton commented 4 years ago

If units are always Bohr, I think that's ok, I wasn't certain whether critic2 always used the same units internally or whether it switched to e.g. Angstroms if the appropriate keyword was used. Thanks for clarifying!

aoterodelaroza commented 3 years ago

Closing this too, as it seems resolved. Please, let me know if there are any other JSON issues.