Closed merkys closed 4 years ago
Thinking a bit about this, is it better to formulate it this way?:
response_fields
.Both "existing" implementations complained about the cost of producing coordinate data for large amounts of entries when the user didn't explicitly need it, so it is my impression from our OPTiMaDe videocall that this needs to be addressed for v1.0. Hence, I'm marking it with the 1.0 milestone.
Thinking a bit about this, is it better to formulate it this way?:
* Some properties are marked in the specification as included by default. * Other properties are included only when included in `response_fields`.
Agree. We already have attributes Requirements/Conventions for all the properties, thus we just need to agree on the set of properties which should have 'SHOULD NOT be present unless explicitly included'.
Both "existing" implementations complained about the cost of producing coordinate data for large amounts of entries when the user didn't explicitly need it, so it is my impression from our OPTiMaDe videocall that this needs to be addressed for v1.0. Hence, I'm marking it with the 1.0 milestone.
Thanks - I was about to bring this issue up myself.
We already have attributes Requirements/Conventions for all the properties, thus we just need to agree on the set of properties which should have 'SHOULD NOT be present unless explicitly included'.
My point here was that we may want to instead agree on the set of properties that are present "by default" in the response. And then have general text that says that all others "SHOULD NOT" be present unless explicitly included.
My point here was that we may want to instead agree on the set of properties that are present "by default" in the response. And then have general text that says that all others "SHOULD NOT" be present unless explicitly included.
Sounds reasonable.
From the point of view of COD, all current fields could by present by default except:
cartesian_site_positions
nsites
species_at_sites
species
assemblies
This all seems like a very valuable addition to the specification indeed.
From the point of view of COD, all current fields could by present by default except:
* `cartesian_site_positions` * `nsites`
In which case will nsites
be an issue to communicate? Is it because you're first calculating cartesian_site_positions
before calculating its length as the value of nsites
.
In my opinion, if cartesian_site_positions
is not provided by default, at least nsites
should be, so one has an inkling of how costly it may be to request cartesian_site_positions
? But again, if its unavoidable to calculate also cartesian_site_positions
before getting the nsites
value, I can see why you put it here.
* `species_at_sites` * `species` * `assemblies`
For all others it can make sense :+1:
In which case will
nsites
be an issue to communicate? Is it because you're first calculatingcartesian_site_positions
before calculating its length as the value ofnsites
.
Exactly. In the general case to us there is no better way to get the value of nsites
.
@merkys if I understand this correctly, the issue for you is that your database uses non-equivalent atom form, right?
So, doesn't this form require the multiplicity of each non-equivalent site to be given? Or, at least, the Wyckoff name from which the multiplicity follows? So, while I understand that generating the full cartesian_site_positions
is expensive, would not generating nsites
be considerably cheaper? I.e.:
nsites = 0
for every non-equivalent site:
nsites += multiplicity*occupancy
Edit: also taking account for sites without full occupancy in the pseudocode.
It's true, though both multiplicities and Wyckoff positions tend to be absent in CIF files. Moreover, not always these pieces of data can be trusted. We have observed that different programs use the same multiplicity fields to store ontologically different data.
Ok, thanks @merkys, thas is a fair point for not including nsites
in the default response. I agree that we want the transformation of "usual cif files" into the default set of properties to be cheap.
Who writes up the PR? It needs to update the text about what is included in the response, and then in the relevant property subsections of section 6 indicate for the small set of "default" properties that they are part of that set.
Who writes up the PR? It needs to update the text about what is included in the response, and then in the relevant property subsections of section 6 indicate for the small set of "default" properties that they are part of that set.
I can take it.
Due to some properties being either derived on the fly or very voluminous (for example coordinates), it might be useful to make certain properties excluded from responses by default, but added upon request. This could be implemented by reusing
response_fields
: if a field is excluded by default, but mentioned inresponse_fields
, it MUST be present in the response.