Open danieleongari opened 5 years ago
@ltalirz @yakutovicha, I want to draw attention to this problem with converting StructureData to CifData, where the inner code is something like:
CifData(ase=structuredataobject.get_ase())
which, using PyCifRW, prints an unconventional CIF format. Since we use CifData objects a lot, can we fix this?
@ltalirz @yakutovicha can you tell me if somebody looked into this issue in the past?
I want to run a raspa calculation in which I define different H atom types in my force field, i.e. I have to set RemoveAtomNumberCodeFromLabel to false. This results in a ValueError:
ValueError: invalid attribute value extra keys not allowed @ data['RemoveAtomNumberCodeFromLabel']
I assume the setting of this parameter was switched off due to the issue described by Daniele.
@danieleongari can you briefly elaborate on why you are linking to this issue from https://github.com/lsmo-epfl/aiida-lsmo/blob/0999ccec3e445cfd0dfd37a65ab013299a5f7d51/aiida_lsmo/workchains/sim_annealing.py#L154 ?
What exactly would be needed in the CIF output to be able to set RemoveAtomNumberCodeFromLabel to false?
I linked to this issue to show an example of how the handling of CIF/structure data by PyCifRW in AiiDA 1 automatically assigns a sorted index in _atom_site_label, which later makes Raspa create one atom type for every label. The single-atom force field values are still fine, thanks to Raspa's "element+underscore" convention, but since all the atom-atom parameter combinations are printed in the output file, this could make each file >1MB bigger because of the cumbersome lines.
In general, I believe one should be very careful when using different atom types for the framework's atoms in these work chains, because it is likely that the label of the atom type gets altered somewhere. In our use case there was no need for such a feature and I never fully investigated it: setting RemoveAtomNumberCodeFromLabel=False was a quick and effective way to avoid any problem.
thanks for the quick explanation!
So, to summarize my understanding:
1) get_structure() results in different _atom_site_label values for each atom (index appended).
2) RemoveAtomNumberCodeFromLabel=True is a workaround that avoids this problem by removing the appended index before passing the file to raspa.
3) Of course, this would also affect a user-defined index like in the case of @mpougin, resulting (incorrectly) in the same atom type.
4) A quick workaround for @mpougin could be to use a fantasy site label with a different letter (?)
In the medium term, we may look into making it possible to tell the CIF output not to append the indices, which would then allow us to avoid any postprocessing of the atom labels passed to raspa.
Does that sound about right?
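To make the label handling in this summary concrete, here is a minimal Python sketch. It only illustrates the idea and is not Raspa's or aiida-lsmo's actual implementation: strip_label_index and atom_types are hypothetical helpers, and the assumption is that the "number code" is a run of trailing digits on the site label.

```python
import re

# Assumption: the "number code" is a run of digits at the end of the label.
_TRAILING_INDEX = re.compile(r"\d+$")


def strip_label_index(label: str) -> str:
    """Remove a trailing numeric index from a site label, e.g. "H12" -> "H"."""
    stripped = _TRAILING_INDEX.sub("", label)
    # A purely numeric label would become empty; keep it unchanged in that case.
    return stripped or label


def atom_types(labels):
    """Return the distinct labels in order of first appearance."""
    return list(dict.fromkeys(labels))


auto_indexed = ["C1", "C2", "H1", "H2", "O1"]  # indices appended on export
user_defined = ["H1", "H2"]                    # two intended H atom types
letter_labels = ["Ha", "Hb"]                   # same intent, letters instead of digits

# Without stripping, every site becomes its own atom type.
print(atom_types(auto_indexed))
# With stripping, the exported indices collapse to the elements ...
print(atom_types(strip_label_index(l) for l in auto_indexed))
# ... but so do user-defined numeric types,
print(atom_types(strip_label_index(l) for l in user_defined))
# while labels distinguished by a letter survive.
print(atom_types(strip_label_index(l) for l in letter_labels))
```

Under these assumptions, stripping merges H1 and H2 into a single type H, which is the unwanted side effect described above, while Ha and Hb stay distinct.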
@mpougin that does sound like a different problem - perhaps open a separate issue and include the full stack trace of the error, including information on versions of aiida-lsmo, aiida-core, python, etc.
When using verdi data structure export -F cif {pk}, with a structure imported as: or with other methods, the output is:
I see two problems with this: 1) the cell is printed after the coordinates, which is unconventional and will make the parsers of most programs fail. 2) some unwanted indexing is appended to the element name in _atom_site_label.
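As a rough illustration of the layout asked for here (cell first, then coordinates, with plain element names as labels), the following Python sketch builds such a CIF by hand. write_cif is a hypothetical helper, not part of AiiDA, ASE or PyCifRW, and it assumes fractional coordinates and P1 symmetry. Strict CIF tooling may expect unique _atom_site_label values, so repeating the element name is an assumption about what the consuming program accepts.

```python
def write_cif(cell, sites, name="structure"):
    """Return CIF text with the cell block before the atom-site loop.

    cell  : (a, b, c, alpha, beta, gamma) in Angstrom and degrees
    sites : iterable of (element_symbol, x, y, z) in fractional coordinates
    """
    a, b, c, alpha, beta, gamma = cell
    lines = [
        f"data_{name}",
        # Cell parameters come first, before any coordinates.
        f"_cell_length_a {a:.6f}",
        f"_cell_length_b {b:.6f}",
        f"_cell_length_c {c:.6f}",
        f"_cell_angle_alpha {alpha:.6f}",
        f"_cell_angle_beta {beta:.6f}",
        f"_cell_angle_gamma {gamma:.6f}",
        "loop_",
        "_atom_site_label",
        "_atom_site_type_symbol",
        "_atom_site_fract_x",
        "_atom_site_fract_y",
        "_atom_site_fract_z",
    ]
    for symbol, x, y, z in sites:
        # The label is the bare element symbol: no index is appended.
        lines.append(f"{symbol} {symbol} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


text = write_cif(
    (10.0, 10.0, 10.0, 90.0, 90.0, 90.0),
    [("C", 0.0, 0.0, 0.0), ("H", 0.5, 0.5, 0.5), ("H", 0.25, 0.25, 0.25)],
)
print(text)
```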