Potential Feature: Minimum Interatomic Distance in Generated Structures

As mentioned in #2, the new Monte Carlo rattling function allows to include a penalty for interatomic distances below a certain threshold. At present, this is set to use 85% of the bulk bond length as the penalty threshold, where interatomic distances below this are heavily penalised (see https://hiphive.materialsmodeling.org/_modules/hiphive/structure_generation/rattle.html). This choice was motivated by looking at V_Cd and Cd_i as example cases. Note that this is also set to ignore the defect and the distorted neighbours, as these can have bond lengths smaller than this.

It could be a good idea to have a minimum interatomic distance for creating the distortions (as well as for rattling), to avoid certain interstitials where -60/50% gives structures that will explode. Tests are currently underway to see if we can determine a reasonable default minimum interatomic distance to apply with distortions (but can optionally be changed)(note AIRSS typically uses 1.5 Angstrom for this https://airss-docs.github.io/tutorials/examples/)

The penalty for small interatomic distances should help reduce the cases where calculations crash/explode, due to too small initial interatomic distances, and the sporadic cases where the relaxations get stuck in a high-energy metastable minimum.

So I've ran tests with different rattle stdevs with different choices of d_min for the Monte Carlo rattling, notebook attached. The main results are that setting d_min to 80% of the bulk bond length by default seems to maximise performance, and that performance is generally more consistent and slightly improved with this method compared to the original random rattling function 🎉

For this, I basically re-ran the V_Cd^0 rattling tests done by @ireaml, first with the same settings to confirm I had the right setup (getting the exact same results, except for the 'localised rattle' which was slightly different? – either way not critical here as we don't recommend it), then again with the Monte Carlo functions for a range of d_min choices.

Original results: (Matching @ireaml's previous results, but not as nice a plot 😛 )

Monte Carlo rattling with d_min = 80% of the bulk bond length

From some early rough testing, I had chosen 85% of the bulk bond length as a value that seemed reasonable, for the default d_min, so I'll now update this to 80%. While of course this V_Cd_0_MCRattling_Tests.ipynb.zip behaviour won't be the same for every defect, I think it's a decent indication of a reasonable choice for now, with more usage and data in the future potentially leading to updated default choices.

I am currently running some tests changing the width parameter of the Monte Carlo rattling (controlling the error function width to decide to accept/reject MC moves), but from early empirical tests, the default of 0.1 Angstrom is a good choice and deviating from this seems likely to worsen performance. Checking anyway!

As well as that, I'd like to test some of the known, rare cases where the original rattling ended up with relaxations 'stuck' in high-energy metastable states, as I think it's likely the Monte Carlo rattling reduces the prevalence of this (as well as the already-implemented change of not rattling the defect atom or distorted neighbours), so getting data on these to run tests soon

To update, I've now also ran the same tests but varying width from the default 0.1 Å to 0.15 and 0.2 Å. Notebook attached but main results are shown here below. Basically, the result is as expected, that this parameter choice makes much less of a difference than d_min (at least at the optimal d_min of 80% bulk bond length, would make more of a difference if this was higher), and deviating from the 0.1 Å default very slightly worsens performance. Keeping default as 0.1 Å.

Remaining tests are to look at the rare cases where original rattling gave relaxations stuck in high-energy metastable states, to see if the Monte Carlo rattling does indeed reduce the prevalence of this. V_Cd_0_MCRattling_Tests.ipynb.zip

This has been discussed on the internal Slack channel but posting a summarised version here for posterity:

I’ve been running more calculations and tests with ShakeNBreak, and was looking at correlating behaviour for the cases that either (1) explode and never relax or (2) do eventually relax after several re-run relaxations, but to significantly higher energies (>1 eV higher than unperturbed) – cases we want to minimise & avoid. Of course they’re a minority of cases (_especially after some of the recent rattling d_min updates_), but still good to try avoid as much as possible as they can burn unnecessary CPU time and reduce efficiency as often they need multiple relaxations to converge, and this still ends up being a useless high-energy result.

Anyway from these tests, the conclusions were that: Whenever relaxations had positive energies for ionic steps after the first 2/3 steps, they would always finally result in a high energy structure, anywhere from 4 to 100 eV higher than the ground-state (or in some cases would never actually relax), and would typically require multiple continued relaxations to get there (compared to typically 1/2 relaxations for low-energy structures). The corollary (high energy final structures having positive energies at some point in the initial relaxations) holds in many, but not all cases. As such, the newly-implemented changes mean that if we have positive energies after the 5th ionic step in a relaxation, the folder is renamed to Bond_Distortion_X_High_Energy and ignored from there on out, meaning if this does happen, we only do one short relaxation for it and don’t waste any further resources on trying to relax something that’s almost certainly going to end up high energy.

Some other updates: So for the above mentioned minority cases where some calcs explode (before even making multiple ionic steps), it’s typically always the ±40-60% distortions where they’re pushing some atoms too close together. I’ve done a load of tests and data trawling with my, Bodoo’s, Jiayi’s and Xinwei’s data, and had some ideas about how it could be avoided, basically ignoring any distortions that would push the atoms too close together. I thought about using the atomic/ionic radii for this (can get these via pymatgen), but unfortunately there’s not really any clear cutoff points where this will cause a calc to crash (e.g. for positive charge states of a certain defect, an initial interatomic distance of 1.1 Å may be fine, but will crash for negative charge states etc.) – so very defect-dependent. E.g. for Bi-on-I antisite in BiOI, anything less than the Bi atomic radius (1.6 Å) will crash, but for I-on-Bi can go to 70% of the I atomic radius (1.4 Å) and will be fine etc So I think for this the best approach is to have a minimum interatomic distance of 1 Å (now-implemented)(pretty much anything less than this exploded in the data I’ve seen) unless Hydrogen is involved. This will reduce this occurrence in the first place, and the other edge cases where it still crashes are now pretty efficiently ignored with the above updates (tested on some systems I’ve been trying now and works v well), so their effect on efficiency is minimal.

Suggestion from David:

are you using a small potim? if you arent, you will get crashes for small distances. if you use it, it can sometimes get around the explosions

Hmm we use IBRION = 2 (conjugate gradient) as gives best performance/efficiency, which is somewhat less sensitive to POTIM than quasi-Newton, but don’t set POTIM (so uses default POTIM = 0.5). We could reduce I guess, but I wonder if and how much that would reduce efficiency (making all the relaxations take longer), and if it would help performance or not…?

(David): this is the trade off. i never ran any defects without POTIM=0.02. i payed the iron price

I’ll try re-running the data I have for these cases with small POTIMs and see if (1) they don’t explode and (2) if they give reasonable energies

An update on these tests. I tried re-running all distortions that exploded with a reduced POTIM (0.02) as suggested by @SCANLORD, and indeed they eventually relax and don’t explode :fire: Great suggestion. In most of these cases, these distortions then relax to the groundstate structure (in some cases it still yields a converged higher energy result) – though in these cases this groundstate structure was found by at least 5 other distortions (which relaxed fine with the default POTIM settings) if not all of the distortions. So, given these reduced-POTIM calcs take much much longer to run, and thus far we’ve no known case of them finding a structure with an energy lower than anything found by the non-exploding distortions, I think our new current practice of automatically omitting them is still best. (Otherwise it’s a major efficiency hit without a known/clear improvement in performance)

(David) my brain is amazing! :stuck-out-tongue:

I think this makes sense, by considering that the high-energy / extreme forces are the result of very short interatomic distances, which can be thought of as “overshooting” the distortion required to find the groundstate (e.g. for a Te dimer groundstate, starting with a distortion with two Te very very close beside eachother). But, because we use a distortion mesh approach, this means we almost definitely have a more reasonable starting interatomic distance for this same groundstate structure at +/-10 or 20% the value of this exploding distortion (i.e. is already included in our distortions which relax quickly and fine with our normal settings), which is backed up by the observations from that test set.

So will keep an eye on other exploding cases I see in my calcs, and rerun them with the reduced POTIM to see if this logic still holds in each case, but for the moment think the conclusion is that the new current approach of ignoring these distortions once we get extreme energies/forces is best, unless anyone objects!

Closing this issue now as the new (and earlier d_min) updates should greatly reduce both the prevalence and impact of this occurrence, and the current conclusion is that a 1 Å hard minimum interatomic distance is best (except if hydrogen involved), but not to go beyond that with atom-specific minimum distances as this is very defect (and charge state) dependent. Will continue to test High_Energy cases with reduced POTIMs in case any do result in a lower energy structure than other distortions, but no evidence of this so far.

SMTG-Bham / ShakeNBreak

Potential Feature: Minimum Interatomic Distance in Generated Structures #7