Closed ardunn closed 6 years ago
For reference, the topic on the pymatgen help group
I'm a bit confused by this too. It seems clear the first ordering sets one Cu:0.4
site to 0, and the other Cu:0.4
site to 1, which is not completely unreasonable but obviously the overall composition is wrong.
I see this is @computron's code/Anubhav has already commented on the Google Groups thread, do you have any ideas where the rounding from 0.4->0.5 occupancy is happening before I take a closer look at this? Seems like this must be happening in EnumlibAdaptor.
The max_disordered_sites
option is supposed to be a way to help you find a good max_cell_size
parameter. Rather than a single max_cell_size
, it keeps incrementing max_cell_size
(starting small and getting higher) until EnumlibAdapter returns a solution. If EnumlibAdapter finds any solution, the iteration will quit. Thus, it tries to find the minimum max_cell_size
that produces some solution from EnumlibAdapter, thus saving large computation time from having max_cell_size cranked up too high.
For this structure, you need a cell size of at least 5 in order to get the occupancies to work out. However, EnumlibAdapter already starts returning structures (with the wrong composition, Cu3Te2) at max_cell_size=2. e.g. running the code with EnumerateStructureTransformation(max_cell_size=2)
will produce the same results.
Clearly, the assumption that the point at which EnumlibAdapter produces any solution is a good "stopping" point for the max_cell_size
parameter is incorrect. Actually, I have no idea why EnumlibAdapter doesn't produce any solution for max_cell_size=1 but does so for max_cell_size=2.
Anyone with more idea of the inner workings of when EnumlibAdapter does and doesn't return a solution could probably help in designing a better solution to this problem ...
Note also that the intention of max_disordered_sites
was to help make EnumerateStructureTransformation a bit more automatic.
That said, I think this needs a rework. Maybe a separate wrapper for the automatic function. Also the automatic function can:
I think the problem is in enumlib itself. I did some basic debugging and the structure supplied to enumlib is not changing between loops. So somehow, enumlib is assuming that a 0.4 occupancy can be satisified by a Cu3Te2.
We could apply some sanity checks on the output / enforce a 'strict' mode to filter these incorrect composition structs out.
Agreed the ultimate issue is enumlib. It's too much of a black box for my liking.
Would also be happy to adopt the LCM approach suggested, I've tried something similar previously, possibly in MagOrdering, would have to check. I know we have DiscretizeOccupanciesTransformation
now too which would take care of part of that approach.
We could apply sanity checks, but I think let's confirm that the problem is indeed in enumlib, and not in some weird issue in pymatgen's generation of enumlib input files. If there is indeed a problem with enumlib, we should post an issue on enumlib itself and let Gus Hart know. On our end, we should probably implement a tolerance check that the compositions returned matches the intended ones to within a certain tolerance.
I found the issue. The problem in this case is that the base (which is multipled by conc to generate composition ranges) is too small a number. That resulted in rounding precision errors which causes an overlap between the species and vacancy concentration ranges when generating the enumlib input file. For example, with base 8 and a supercell size of 2, the input file generated has the range of 2-4/8 for Cu, and 3-5/8 for vacancies. That resulted in a 4/8 4/8 split satisfying the enumeration, which is a Cu3Te2.
The solution is to make sure the base is large enough. I basically just applied a 10x factor to the base. This is an arbitrary factor in any case, so it does not make all that much of a difference.
The issue is closed for now. But someone should implement the aforementioned sanity checks with composition tolerances for the generated structure as well.
PMG dictator #1 has now semi-retired....
Nice catch, and duly noted! Will add it to the list ...
Note that the fixed EnumlibAdaptor caused the max_disordered test to fail. The problem is that the max disordered test was actually wrongly coded. There was a primitive cubic cell with Li 0.2, Na 0.2 and K 0.6 + 1O. If you order this with max_ordered = 5, you obviously must have a cell with 10 atoms. The test was checking that it has only 8 atoms!! The above fix actually solves this problem.
A quick postscript: @utf will be looking into coding an alternate enumerator in pymatgen using the bsym (https://github.com/bjmorgan/bsym) library. According to Alex, this library might improve upon common problems with enumlib.
Describe the bug Using
EnumerateStructureTransformation
with themax_disordered_sites
parameter specified does not return structures with correct stoichometry. For example, with the attached cif file for Cu2.8Te2 the output structures should be Cu7Te5; however, the output structures are Cu3Te2.To Reproduce
The original structure is:
The output structures are:
Expected behavior The final structures should be Cu7Te5, not Cu3Te2. Note that using
EnumerateStructureTransformation(max_cell_size=5)
returns correct stoichiometries but the code above does not.Desktop (please complete the following information):
Additional context The cu7te5.cif file is: