SMTG-Bham / doped

doped is a Python software for the generation, pre-/post-processing and analysis of defect supercell calculations, implementing the defect simulation workflow in an efficient, reproducible, user-friendly yet powerful and fully-customisable manner.
https://doped.readthedocs.io
MIT License
143 stars, 32 forks

How to generate two defects in the same supercell? #91

Closed yw-fang closed 1 month ago

yw-fang commented 1 month ago

Hi, all

If I am not wrong, the DefectsGenerator class generates only one defect at a time, e.g. one vacancy. Starting from the doped structures, we can dope them again to obtain 2-defect structures (although this may produce additional structures that are symmetry-equivalent and have to be excluded manually). Apart from this 'iterative' method, can the "doped" code generate two vacancies at the same time in the same supercell while excluding the symmetry-equivalent ones? Thank you!

kavanase commented 1 month ago

Hi @yw-fang, yes, currently DefectsGenerator generates only single point defects. The current workflow for generating complex defects is as you described: feed the defective supercell into DefectsGenerator again, so that it generates 2-defect supercells. However, all the generated structures in this case are still symmetry-inequivalent, and they shouldn't contain any duplicate / symmetry-equivalent structures (this has been tested in a number of cases and was the approach used in e.g. https://pubs.rsc.org/en/content/articlelanding/2021/sc/d1sc03775g & https://pubs.rsc.org/en/content/articlehtml/2023/ta/d3ta00532a; if you have any examples of this not being the case then please share so I can look into it). You can double-check this by looking at defect/defect distances, point symmetries etc.

More efficient generation and handling of complex defects is planned in the near future. Hope that helps!

yw-fang commented 1 month ago

Hi, Seán @kavanase Thank you very much for confirming it. I'll prepare an actual example to demonstrate the duplicates I referred to.

kavanase commented 1 month ago

Hi @yw-fang, thank you! Looking deeper into the code and at my notebooks from a previous paper, I realise what I said above is actually not entirely true in certain cases.

The defect generators identify the symmetry-inequivalent sites in the input structure to find the distinct defects. When inputting a defect supercell, however, the symmetry has been broken (often to P1), and doped/pymatgen/spglib are not automatically aware of the fact that the defect site can equivalently be moved/permuted with other sites in the same supercell. So doped identifies the symmetry-inequivalent sites for this fixed defect supercell. I think, depending on the choice of complex defects (e.g. if they are both vacancies / substitutions), this can then result in some structures which are in fact symmetry-equivalent duplicates. The best way to deal with this would be to properly build complex defect generation into doped, which I plan to do very soon. But for now, I would recommend generating the defects in this same way and then using the distance between the two point defect sites to quickly screen out duplicates, if that works?
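
A minimal sketch of the distance-based screening suggested above, assuming a hypothetical 10 Å cubic supercell and made-up site names and fractional coordinates. It uses only the Python standard library and is not the doped API; the helper names `min_image_distance` and `screen_by_distance` are illustrative. Note that an equal separation is only a quick filter: two pairs at the same distance can still be symmetry-inequivalent, so candidates that collide should be checked before being discarded.

```python
import itertools
import math


def min_image_distance(frac_a, frac_b, lattice):
    """Shortest distance between two fractional positions under periodic boundaries.

    `lattice` holds the three cell vectors as rows (in Å). Searching the 27
    neighbouring images is adequate for reasonably compact cells, but may not
    be for strongly skewed ones.
    """
    # Wrap each fractional difference to within half a cell of zero.
    delta = [(b - a) - round(b - a) for a, b in zip(frac_a, frac_b)]
    best = math.inf
    for shift in itertools.product((-1, 0, 1), repeat=3):
        frac = [d + s for d, s in zip(delta, shift)]
        cart = [sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3)]
        best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def screen_by_distance(fixed_site, candidate_sites, lattice, decimals=2):
    """Keep the first candidate found at each distinct distance from `fixed_site`."""
    kept = {}
    for name, frac in candidate_sites.items():
        key = round(min_image_distance(fixed_site, frac, lattice), decimals)
        kept.setdefault(key, name)
    return kept


# Hypothetical supercell with the first vacancy fixed at the origin.
lattice = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
second_sites = {
    "site_A": (0.5, 0.0, 0.0),
    "site_B": (0.0, 0.5, 0.0),  # same separation as site_A
    "site_C": (0.5, 0.5, 0.0),
    "site_D": (0.9, 0.0, 0.0),  # closest through the periodic boundary
}
unique = screen_by_distance((0.0, 0.0, 0.0), second_sites, lattice)
print(unique)
```

Here `site_B` is dropped because it sits at the same distance from the fixed vacancy as `site_A`; the rounding tolerance (`decimals`) is an assumption and would need tuning for relaxed or distorted structures.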

yw-fang commented 1 month ago

Thanks very much for your further comment! This is exactly the duplicate issue I wanted to ask about: it appears when generating multi-vacancy structures because of the P1 symmetry. As for the distance between the two vacancies, is there a built-in method for this in the doped code?

kavanase commented 1 month ago

Ah ok, makes sense.

Yes, you can quickly do this using doped with this:

[Three images attached to the original comment; not reproduced here.]

kavanase commented 1 month ago

@yw-fang just to also note, I have recently made a lot of efficiency updates for the defect generation part of doped (particularly for large and complex input structures such as defect supercells), to make the doped and pymatgen structure manipulation functions much much faster.

These will be included in the next release of doped (hopefully soon), but for now I would recommend installing the doped develop branch to make use of these speedups. (e.g. with pip install https://github.com/SMTG-Bham/doped/archive/develop.zip)

yw-fang commented 1 month ago

@kavanase Thanks very much for the details! It works to remove some duplicates.

[Image attached to the original comment; not reproduced here.]

One more question: in this example, you used one specific defect structure. In some cases, there are several defect structures belonging to the same type of defect. Suppose we have 2 different single v_Cd vacancies, marked Cd@1 and Cd@2. When using the above method to generate the double defects, does it cause a double-counting issue? E.g. the double_vacancy_defect_gen method based on the Cd@1 structure will lead to a two-defect structure Cd@1&Cd@2, and that based on Cd@2 will lead to a two-defect structure marked Cd@2&Cd@1. These two structures are the same, but I guess the present implementation cannot tell, because the notations used to mark them in the "doped" code are different. Please feel free to ask me to rephrase the question if any part of the text is unclear or ambiguous. Thanks!

kavanase commented 1 month ago

@yw-fang yes, in this example I just used the one vacancy as there was only one inequivalent vacancy. In the case you describe, using both Cd@1 and Cd@2 as the 'initial' vacancy sites and then generating further Cd@1/Cd@2 vacancies will indeed lead to duplicates (though the method I used above, which screens out duplicates based on their distance, should also remove these duplicates fine). I think for this goal of complex defect generation, you only need to use one of the constituent point defects in the complex as the 'initial' defect, to generate all possible complex defect combinations in your given supercell / with whatever max-complex-distance constraints you use.
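
To illustrate the double-counting point, here is a small sketch that gives Cd@1&Cd@2 and Cd@2&Cd@1 the same key by sorting the two labels and pairing them with the rounded defect-defect distance. The labels come from the question above; the distances, the `pair_key` helper and the list layout are hypothetical and are not part of doped.

```python
def pair_key(label_a, label_b, distance, decimals=2):
    """Order-independent key for a two-defect complex."""
    return (tuple(sorted((label_a, label_b))), round(distance, decimals))


# Hypothetical complexes generated from two different 'initial' vacancies;
# the distances (in Å) are made up for illustration.
generated = [
    ("Cd@1", "Cd@2", 4.58),
    ("Cd@2", "Cd@1", 4.58),  # same pair, built in the opposite order
    ("Cd@1", "Cd@1", 6.48),
]

unique = {}
for first, second, dist in generated:
    # setdefault keeps the first structure seen for each key.
    unique.setdefault(pair_key(first, second, dist), (first, second, dist))

print(list(unique.values()))
```

As with any distance-based screen, entries that share a key are only candidate duplicates, so it is worth confirming the match before dropping a structure.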

yw-fang commented 1 month ago

@kavanase Thank you very much for confirming it, and thanks again for all the above detailed responses!