aoterodelaroza / critic2

Analysis of quantum chemical interactions in molecules and solids.
Other
94 stars 35 forks source link

Question regarding 'duplicate' bonds (e.g. in covellite example) #7

Closed mkhorton closed 3 years ago

mkhorton commented 6 years ago

I'm trying to understand the critic2 output. Running the covellite example, I can see the lines in bcp connectivity table:

23     7    b   0.18060541    0.81931305    0.08646112   3   ( 0  0  0 )  9   ( 0  1  0 )
33     8    b   0.17624399    0.81535785    0.08636594   3   ( 0  0  0 )  9   ( 0  1  0 )

From my understanding, this is describing a bond between nucleus critical points 3 and 9, and ncp 7 and 8 are probably the same and could be merged together. This is running in auto mode.

However, I've also seen this in running my own data, e.g. for Fe3O4:

38     9    b   0.61790852    0.62269666    0.87828231   14  ( 0  0  0 )  3   ( 0  0  1 )
41     9    b   0.62269666    0.61790852    0.87828231   14  ( 0  0  0 )  3   ( 0  0  1 )

In this case, the 'duplicate' bonds are for the same non-equivalent bond critical point. How does this happen? Does have this a physical meaning, or is this an issue with my input script?

Many thanks for your help!

aoterodelaroza commented 6 years ago

This is an issue with the interpolation function for fields defined on a grid. Or, specifically, with the critical points of such function.

When you have a field on grid (for instance, a VASP CHGCAR), you need a way of representing the value of that grid and its derivatives at any point inside the cell in order to calculate its critical points. There are several ways of doing this in critic2 - the default is a tricubic polynomial interpolation. While this gives continuous derivatives, it also creates some "noise" in the regions close to the critical points, which may generate a cluster of critical points where there should be only one - I think this is what you are seeing. If you take out the LOAD lines (hence using the promolecular density), critic2 will almost invariably find the all the critical points with the default options.

This problem used to be much worse with the tricubic spline interpolator (keyword TRISPLINE) but, as you point out, the problem is not solved. I have some ideas, like trying interpolation kernels, where you assign the "weight" of a particular point using an exponential or Gaussian function of the distance to it. But the demand for this is so low that I never got around to doing it (maybe this is a good time). Still, there are some measures you can take with the existing keywords. The CPEPS option to AUTO allows you to consolidate critical points that are close to each other, for instance.

mkhorton commented 6 years ago

Interesting, I thought it might be something along those lines. I'm attempting to run critic2 on a large benchmark data set (a few hundred systems to start with), but it will all be for gridded data from VASP.

If you take out the LOAD lines (hence using the promolecular density), critic2 will almost invariably find the all the critical points with the default options.

I didn't realize this could give useful results (to run on the promolecular density, rather than ground state density), and is very interesting .... I'll give this a go! and will try CPEPS too.

The only other issue I've had (again on gridded data) is with missing bonds between H and C atoms, in hybrid organic-inorganic crystals. I read in the critic2 manual that H atoms can be tricky because the charge density can be shifted significantly from the nucleus critical point, but that doesn't seem to be the case here (visualizing the charge density shows a very clear bond, with the charge maxima where you'd expect) ... could this be related to the grid interpolation too? or is this likely unrelated?

aoterodelaroza commented 6 years ago

It could be both :). Grids are a bit of a mess when it comes to finding critical points and despite many attempts in the past I never quite solved the problem. I've seen the interpolation method making a point "disappear" but the case where you get a cluster of points is by far the most common.

In large crystals, it can happen that the default search strategy (subdivision of the WS cell) is not thorough enough. If you are not finding a particular critical point, you may want to try some of the SEED options, like SEED PAIRS, to search between atom pairs. This is the default search strategy in molecules. If your system is large or has organic bits in it, a mixed strategy with several SEED commands will probably help. See auto_simple_promolecular/pyrazole.cri or any of the examples in cps_large.

Regarding the densities themselves, you may also have problems from the pseudopotential smoothing if you use the CHGCAR directly, because in some cases like polar bonds involving H the nuclear maxima may disappear altogether (and create a bunch of spurious CPs in the process). I'd recommend going with the sum of the AECCARs always, if you have them. The displacement of the maximum is usually less than the default NUCEPSH option in AUTO (so critic2 will correctly assume they are the same).

If you want to build quantitative indices for a database of crystals, the promolecular density is certainly a good idea - people have used it in the past in this way (e.g. in nciplot). Naturally, it's just a fancy way of describing the geometry. The promolecular density consistently gives the full topology, at least in all cases I tried. If it doesn't it is a matter of adjusting the seeding strategy in AUTO.

aoterodelaroza commented 6 years ago

@mkhorton : I have been made aware of new developments regarding this problem by people at LLNL. See: https://github.com/LLNL/TopoMS I'll try to implement the recipe described in the J. Comput. Chem. article, as time permits. In the meantime, it may be interesting to give that new code a spin.

mkhorton commented 6 years ago

Hi @aoterodelaroza, many thanks for letting me know! This looks very interesting, I look forward to trying it :)

aoterodelaroza commented 5 years ago

Well, that's not good:

 =====>> Initializing TopoMS
 -- Created RegularGrid [60 60 256] with periodicity [1 1 1]
 --- now negating the value!
 function range [-30312321.296000 0.000161]
Segmentation fault

I used the example config file and the covellite AECCAR0. All the entries in the config file look reasonable to me, but I may be missing something. Did you have any luck with it?

The article is quite hard to follow, and I don't have much time at the moment to go through it slowly to unobfuscate it. The code is much easier to follow, strangely enough. The article suggested using the information about the interatomic surface to help find bonds and rings, and I think I'll try that in a sort of 2D-version of YT: sort the boundary points by density: bcps are maxima and rings are minima within the boundary. I'm not sure if this is what they propose in the article, but it sounds reasonable to me. We'll see how it goes.

mkhorton commented 5 years ago

Unfortunately not. One of the authors (Harsh) was very helpful in trying to help me get it to compile, so I do have a compiled version but it also segfaults. I was told this might be to do with non-orthogonal lattice vectors, but I’m not sure. So as of now I’m just waiting for the next release.

On Thu, Aug 30, 2018 at 08:11, aoterodelaroza notifications@github.com wrote:

Well, that's not good:

=====>> Initializing TopoMS -- Created RegularGrid [60 60 256] with periodicity [1 1 1] --- now negating the value! function range [-30312321.296000 0.000161] Segmentation fault

I used the example config file and the covellite AECCAR0. All the entries in the config file look reasonable to me, but I may be missing something. Did you have any luck with it?

The article is quite hard to follow, and I don't have much time at the moment to go through it slowly to unobfuscate it. The code is much easier to follow, strangely enough. The article suggested using the information about the interatomic surface to help find bonds and rings, and I think I'll try that in a sort of 2D-version of YT: sort the boundary points by density: bcps are maxima and rings are minima within the boundary. I'm not sure if this is what they propose in the article, but it sounds reasonable to me. We'll see how it goes.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/aoterodelaroza/critic2/issues/7#issuecomment-417355204, or mute the thread https://github.com/notifications/unsubscribe-auth/AC1rRMUWcYzdURoVP1TTYPiXJT3Jwkt_ks5uWACfgaJpZM4PGqgn .

aoterodelaroza commented 5 years ago

Nah, the interatomic surface from YT seems to be composed of many non-contiguous bits, so that plan is a no-go. I'll try to find some time to read the article and the code carefully next week, see if something good comes out of it. On the topic of segfaults, in the urea crystal (orthorhombic):

 =====>> Initializing TopoMS
 -- Created RegularGrid [100 100 100] with periodicity [1 1 1]
 --- now negating the value!
 function range [-161.676920 -0.001045]

 =====>> TopoMS initialized! took 38 ms
Segmentation fault

and in the CO2 crystal (cubic):

 =====>> Initializing TopoMS
 -- Created RegularGrid [100 100 100] with periodicity [1 1 1]
 --- now negating the value!
 function range [-202.382325 -0.000868]

 =====>> TopoMS initialized! took 38 ms
Segmentation fault

I couldn't get it to work on any system, unfortunately. It always segfaults in precisely 38 ms.

aoterodelaroza commented 5 years ago

Ah, yes. This happened to me when I was coding the GUI. Add:

return true;

in TopoMS.cpp at the end of the bool TopoMS::init() function. If you declare a function as non-void and then fail to return a value, it'll segfault on a nice day, do something very weird on a bad one. I wonder why the compiler doesn't check for this. With that change it seems to be working, but it's taking an awfully long time, so I smell another rat.

aoterodelaroza commented 3 years ago

I'm closing this thread because it is the same issue as #32.