Open merkys opened 3 years ago
IMO `structure_features` was intended more as a content negotiation feature between the client and server than something you typically would use to determine the "physics" of the material. Nevertheless, the absence of declared features (`assemblies`, `disorder`, etc.) does restrict the domain for the material. But, as you note in the example, it doesn't strictly work the other way: declaring a feature is not a commitment that the structure cannot have a simpler representation.
Setting aside the computational difficulty of knowing for certain whether a simpler representation exists, a typical scenario in which I foresee these flags being used is this:
A client fetches structures to do "normal" static DFT calculations. Hence, the user writing that client wants to exclude structures with disorder and assemblies, because those cannot easily be translated into, e.g., a VASP POSCAR file.
The client later adds the capability of transforming the OPTIMADE disorder representation into SQS supercells that can be calculated in VASP. Hence, the processing is extended to accept structures with the `disorder` feature. However, the greater generality of `assemblies` is not supported, so those structures are still excluded. (Even though, as you note, sometimes they could be translated into the simpler `disorder` representation.)
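The two stages of that scenario can be sketched as a client-side check on entries that have already been fetched. This is only an illustration: `is_supported`, the stage sets and the sample entries are hypothetical names and data, not part of OPTIMADE or of any client library.

```python
def is_supported(structure_features, supported):
    """Return True if every feature the entry declares is one the client handles."""
    return set(structure_features) <= set(supported)


# Stage 1: plain static DFT, the client handles no special features.
stage1 = set()
# Stage 2: disorder can be turned into SQS supercells; assemblies still cannot.
stage2 = {"disorder"}

# Hypothetical entries, keyed by a made-up id, with their declared features.
entries = {
    "ordered": [],
    "alloy": ["disorder"],
    "grouped": ["assemblies"],
}

accepted_stage1 = [eid for eid, feats in entries.items() if is_supported(feats, stage1)]
accepted_stage2 = [eid for eid, feats in entries.items() if is_supported(feats, stage2)]

print(accepted_stage1)
print(accepted_stage2)
```

Treating the declared features as a set that must be a subset of what the client supports matches the point above: an undeclared feature narrows what the client has to handle, while a declared one only says which parts of the representation are in use.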
Thank you for the explanation. So structures having sites with mixtures of chemical elements or vacancies seem to be corner cases. I would prefer some way to dispel the ambiguity, but cannot think of an elegant solution. Surely we could attempt to standardize the representation, but I am in no position to suggest putting one of the representations in front of the other.
In CIF files (the ultimate source of truth for the COD), vacancies are expressed by the occupancy parameter, which fits more naturally with the first representation. Mixture sites are usually split into several sites with the same coordinates, and we at the COD do little to identify such sites, as the number of such entries is low.
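For illustration, identifying such sites could look roughly like the sketch below. It assumes the atom sites have already been parsed into `(symbol, fractional coordinates, occupancy)` tuples; `group_by_position`, `classify` and the sample data are hypothetical, and the naive rounding ignores symmetry-equivalent positions and periodic wrap-around.

```python
from collections import defaultdict


def group_by_position(sites, ndigits=4):
    """Group sites that share the same (rounded) fractional coordinates."""
    groups = defaultdict(list)
    for symbol, xyz, occupancy in sites:
        key = tuple(round(c, ndigits) for c in xyz)
        groups[key].append((symbol, occupancy))
    return dict(groups)


def classify(group, tol=1e-6):
    """Return (mixed, vacant) for the sites grouped at one position.

    mixed:  more than one site record shares the position.
    vacant: the occupancies sum to less than 1, i.e. partial vacancy.
    """
    total = sum(occupancy for _, occupancy in group)
    return len(group) > 1, total < 1.0 - tol


# Made-up atom sites, not taken from any COD entry.
sites = [
    ("Si", (0.0, 0.0, 0.0), 0.3),
    ("Ge", (0.0, 0.0, 0.0), 0.5),
    ("O", (0.5, 0.5, 0.5), 1.0),
]

report = {pos: classify(group) for pos, group in group_by_position(sites).items()}
print(report)
```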
We have revisited the topic in a workshop discussion with @rartino and @blokhin, and it seems we arrived at a consensus: we are OK with `assemblies`, `disorder` and `structure_features` describing the representation of the data, not the underlying structure.
From the specification:
However, both `assemblies` and `disorder` do not directly depend on the features of the structure, but on its representation by the provider. Consider these two alternative descriptions of the same structure (taken from the specification):
and
Thus the structure in the first example would have structure features `[ "disorder" ]`, whereas the second one `[ "assemblies" ]`.
Having structure features that denote representation instead of actual structure features seems somewhat counter-intuitive to me. Could anyone confirm this was intentional, or is this a corner case?
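To make the question concrete, here is a minimal sketch of how the flags would follow from the representation alone. It rests on my reading of the specification, which should be checked against the text: `disorder` is flagged when any species lists more than one chemical symbol, and `assemblies` when the `assemblies` property is present. `derive_structure_features` is a hypothetical helper, and both dictionaries are illustrative data in the spirit of the specification's examples, not copied from them.

```python
def derive_structure_features(structure):
    """Derive flags from the representation only (assumed rules, see above)."""
    features = []
    # Assumed rule: a species with more than one chemical symbol means disorder.
    if any(len(s["chemical_symbols"]) > 1 for s in structure.get("species", [])):
        features.append("disorder")
    # Assumed rule: a non-empty assemblies property means assemblies.
    if structure.get("assemblies"):
        features.append("assemblies")
    return sorted(features)


# One site occupied by a mixed species (illustrative data).
single_site = {
    "species_at_sites": ["SiGe-vac"],
    "species": [
        {
            "name": "SiGe-vac",
            "chemical_symbols": ["Si", "Ge", "vacancy"],
            "concentration": [0.3, 0.5, 0.2],
        }
    ],
}

# The same physical site split into three mutually exclusive sites.
split_sites = {
    "species_at_sites": ["Si", "Ge", "vac"],
    "species": [
        {"name": "Si", "chemical_symbols": ["Si"], "concentration": [1.0]},
        {"name": "Ge", "chemical_symbols": ["Ge"], "concentration": [1.0]},
        {"name": "vac", "chemical_symbols": ["vacancy"], "concentration": [1.0]},
    ],
    "assemblies": [
        {"sites_in_groups": [[0], [1], [2]], "group_probabilities": [0.3, 0.5, 0.2]}
    ],
}

print(derive_structure_features(single_site))
print(derive_structure_features(split_sites))
```

Under these assumed rules the two descriptions of one physical structure end up with different flags, which is the ambiguity I am asking about.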