Open peikai opened 1 year ago
I can't repro this. Tried about 7 times. I get 638 entries every time.
> which actually contains no energy adjustment related to the GGA/GGA+U/R2SCAN scheme
I'm not very familiar with our r2SCAN/GGA mixing scheme but I think what might be happening here is that we place r2SCAN energies onto the GGA(+U) hull? If so, you wouldn't expect an adjustment on a GGA+U entry? Pinging @rkingsbury @munrojm to correct me in case I'm wrong.
That's correct, some GGA/GGA+U entries will pass through the mixing scheme unmodified, if the hull is still being built with GGA/GGA+U entries. That would occur any time there are insufficient R2SCAN entries to build the hull. In such a high dimensional chemical system as this, I'm almost certain the hull would be built in GGA/GGA+U, so getting the entry back unmodified is exactly what should happen. Please refer to the mixing scheme publication for a more thorough explanation.
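To see which processed entries came through unmodified, one option is to look at the adjustments attached to each entry. This is a minimal sketch, assuming the entries behave like pymatgen `ComputedEntry` objects that expose `entry_id` and `energy_adjustments`; the helper name `split_by_adjustment` and the stand-in entries are hypothetical and only there for illustration.

```python
from types import SimpleNamespace


def split_by_adjustment(entries):
    """Return (adjusted_ids, unadjusted_ids) for a list of entry-like objects."""
    adjusted, unadjusted = [], []
    for entry in entries:
        # An empty or missing adjustment list means the energy passed through unmodified.
        if getattr(entry, "energy_adjustments", None):
            adjusted.append(entry.entry_id)
        else:
            unadjusted.append(entry.entry_id)
    return adjusted, unadjusted


# Stand-in objects (hypothetical IDs), not real Materials Project data.
demo_entries = [
    SimpleNamespace(entry_id="entry-A", energy_adjustments=["some adjustment"]),
    SimpleNamespace(entry_id="entry-B", energy_adjustments=[]),
]
adjusted_ids, unadjusted_ids = split_by_adjustment(demo_entries)
print(adjusted_ids, unadjusted_ids)
```

An unadjusted GGA/GGA+U entry is not by itself a sign of a bug, for the reason given above: it is the expected outcome when the hull is still built from GGA/GGA+U entries.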
As for the non-reproducibility, I'm not sure. The mixing scheme does rely on structure matching, so perhaps some quirk in your computer system occasionally causes the matching to fail to consider two structures "the same"? That's a stretch, but it's the best thought I have.
Also, I note that you're using the old MPRester (`pymatgen.ext.matproj`). If you want to use r2SCAN data I think you would be better off using the rester from the `mp-api` package, but @munrojm would know better.
@rkingsbury Oh, that was a typo in my post. In my test code, MPRester was actually imported with `from mp_api.client import MPRester`.
I tested it on a different computer and network, but the inconsistency issue still happened. I will try to run them with a fresh environment in a container next.
The same issue occurred even when I ran the code in a freshly set-up python3.10-alpine container. I think this is a rare case (specific to this chemical system and entry) that would not occur in other chemical systems, but the potential bug may still cause entries to be lost or added, possibly changing the shape of the convex hull.
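A simple way to find out which entry is lost or added is to compare the entry IDs from two runs. The sketch below assumes each processed entry exposes an `entry_id` attribute; the function name `entry_id_diff` and the placeholder IDs are hypothetical.

```python
from types import SimpleNamespace


def entry_id_diff(run_a, run_b):
    """Return (only_in_a, only_in_b) as sorted lists of entry IDs."""
    ids_a = {entry.entry_id for entry in run_a}
    ids_b = {entry.entry_id for entry in run_b}
    return sorted(ids_a - ids_b), sorted(ids_b - ids_a)


# Placeholder entries standing in for the processed entries of two runs.
short_run = [SimpleNamespace(entry_id="entry-A")]
long_run = [SimpleNamespace(entry_id="entry-A"), SimpleNamespace(entry_id="entry-B")]

only_in_short, only_in_long = entry_id_diff(short_run, long_run)
print(only_in_short, only_in_long)
```

Saving the processed entries of a 637-entry run and a 638-entry run and passing both lists to this helper would show exactly which IDs differ.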
Can you pinpoint the bug?
@janosh Yes, it is the `structure_matcher.group_structures()` call at line 564 of `mixing_scheme.py` that gives inconsistent results when matching the structure of the entry `mp-1181411-GGA+U` (Fe4O13) with that of the entry `mp-1181334-GGA+U` (Fe4O13). I found that the code sometimes places the two entries in one group and sometimes splits them into two separate groups. In the latter case there are 638 processed entries; otherwise there are 637.
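As a toy analogy for how a borderline pair can end up grouped in one run and split in another, the sketch below groups plain numbers with a tolerance. It assumes a greedy, reference-based grouping, which may or may not reflect what `group_structures()` does internally; it only shows that tolerance-based matching is not transitive, so the number of groups can depend on which item is compared first.

```python
def greedy_groups(values, tol):
    """Greedy grouping: the first ungrouped value is the reference for a new group."""
    remaining = list(values)
    groups = []
    while remaining:
        ref = remaining.pop(0)
        group = [ref]
        unmatched = []
        for value in remaining:
            # A value joins the group only if it is within tol of the reference.
            if abs(value - ref) <= tol:
                group.append(value)
            else:
                unmatched.append(value)
        remaining = unmatched
        groups.append(group)
    return groups


# Same three values, different input order, same tolerance.
groups_a = greedy_groups([0.0, 0.1, 0.2], tol=0.15)
groups_b = greedy_groups([0.1, 0.0, 0.2], tol=0.15)
print(len(groups_a), len(groups_b))
```

With `0.0` as the first reference, `0.2` falls outside the tolerance and forms its own group; with `0.1` as the first reference, all three values land in one group.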
@peikai it sounds to me like you are running into an edge case related to the numerical tolerances of `StructureMatcher`, where tiny numerical noise is causing those structures to sometimes match and sometimes not. I suggest 2 things:
1. Can you reproduce the inconsistency by running `StructureMatcher` directly on your list of input structures?
2. Adjust the tolerances of `StructureMatcher`. If you slightly change the `ltol`, `stol`, or `angle_tol`, you can hopefully get reproducible results.
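Both suggestions can be combined into a small check that runs the grouping several times per tolerance setting and tallies the number of groups. This is a sketch under stated assumptions: the helper names `group_count_histogram`, `scan_tolerances`, and `pymatgen_matcher` are hypothetical, the matcher only needs a `group_structures()` method, and pymatgen is imported lazily so the helpers can be exercised without it.

```python
from collections import Counter


def group_count_histogram(structures, matcher, n_runs=10):
    """Call matcher.group_structures repeatedly and tally the group count of each run."""
    counts = Counter()
    for _ in range(n_runs):
        counts[len(matcher.group_structures(list(structures)))] += 1
    return counts


def scan_tolerances(structures, make_matcher, stol_values, n_runs=10):
    """Map each stol value to the histogram of group counts seen over n_runs."""
    return {
        stol: group_count_histogram(structures, make_matcher(stol), n_runs)
        for stol in stol_values
    }


def pymatgen_matcher(stol, ltol=0.2, angle_tol=5):
    """Build a pymatgen StructureMatcher; the ltol/angle_tol values here are illustrative."""
    from pymatgen.analysis.structure_matcher import StructureMatcher

    return StructureMatcher(ltol=ltol, stol=stol, angle_tol=angle_tol)
```

A call such as `scan_tolerances(structures, pymatgen_matcher, [0.25, 0.3, 0.35])`, with `structures` holding the two Fe4O13 structures, would return one histogram per `stol`. A setting whose histogram has a single key gave the same grouping on every run; more than one key means the grouping flipped between runs.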
I found that the entry `mp-1181334-GGA+U` for the `Cu-Fe-Li-O-Te` chemical system can wrongly pass the mixing process of the GGA/GGA+U/R2SCAN scheme. What is weird is that this entry occasionally disappears from and reappears in the processed entries when the mixing process is run multiple times (about every third run in my tests), even though the code below has been tested with the latest `pymatgen` package.
There may be either 637 or 638 entries in the processed entry list. In the latter case the list contains the entry `mp-1181334-GGA+U`, which actually has no energy adjustment related to the GGA/GGA+U/R2SCAN scheme, as shown in the exported JSON file below.
The potential bug needs to be found, as the reliability of local processing algorithms has become critical for generating GGA/GGA+U/R2SCAN phase diagrams[^note].