Open lilyminium opened 4 years ago
This all sounds extremely sensible to me.
For 1.0 we could break the API but if there's a way to keep the old data structures around (maybe hide them behind managed attributes that create them on the fly from the new real data structures) then that would help users who just want their old stuff to still work with the last py2 release but set us up for a clean removal for 2.0.
With properties for the old attributes we can also easily issue deprecation warnings.
Hey all, I would love to work on this if possible?
@hmacdope please go ahead! Here is a notebook that shows a bit of the current behaviour of the class. #2507 has some notes about the units of the output.
I reopen since we still have to discuss point 4. My feeling is that we decided to keep everything in MDAnalysis' standard units and do no conversion at all. Are there other opinions?
I'm definitely for having everything in MDA units, however it should be documented in the docstring if it's not already.
I'll sum up what has been discussed so far here to facilitate the discussion.
Current behaviour as of 2.2.0 is that mass densities are returned in units of g/cm^3, charge densities as (e * mol) / cm^3. This is documented in the docstring. These units are obtained by taking the raw masses in amu and charges in e, dividing by the slice volume in A^3 for initial densities, and applying a factor of Avogadro's number and the volume conversion from A^3 to cm^3 as follows:
def _conclude(self):
avogadro = constants["N_Avogadro"] # unit: mol^{-1}
volume_conversion = 1e-24 # unit: cm^3/A^3 <-- I just noticed the units are flipped in the code currently, made a mistake there, sorry! This here is how it should be. The numbers are correct though.
# divide result values by avogadro and convert from A3 to cm3
k = avogadro * volume_conversion
...
for dim in ['x', 'y', 'z']:
# norming factor, units of mol^-1 cm^3
norm = k * self.results[dim]['slice_volume'] # slice volume in A^3
for key in self.keys:
self.results[dim][key] /= norm
@orbeckst said in https://github.com/MDAnalysis/mdanalysis/issues/2507#issuecomment-583634608 that in his opinion, the current units are sensible for mass densities, but charge densities should be in MDA standard units of e / A^3. This can be achieved here by simply not applying the factor of k
for charge densities as such:
for dim in ['x', 'y', 'z']:
self.results[dim][key] /= self.results[dim]['slice_volume'] # raw densities, amu/A^3 and e/A^3
for key in ["mass_density", "mass_density_stddev"]: # convert to g/cm^3
self.results[dim][key] /= k
If the mass densities should be in MDA standard units of amu/A^3 as well (as @PicoCentauri, @IAlibay and I believe @lilyminium suggest), then the division by k
can just be omitted.
For comparison, MDAnalysis.analysis.density
returns densities with default units of A^(-3) but offers methods for converting to other units of length and density.
Once the decision of which units should be returned is made, the change (deprecation?) should be reasonably straightforward
+1 for offering a conversion method. This could also live in units.py to be applicable in general.
I like to push this also from a downstream perspective. At MAICoS we are in the process of converting all units to MDA's standards. A general conversion method would be really nice since we are used to units like kg/m^3...
Is your feature request related to a problem? Please describe.
This mailing list issue is a good use case of Linear Density but is foiled by the class not supporting UpdatingAtomGroups.
It is not intuitive to me that
LinearDensity.results['x']['pos']
refers to the mass-weighted density in g/cm^3 of the selection.pos
makes me think of coordinates.char
makes me think of characters.If the class saves the bins from
np.histogram
, it would be far easier to take advantage of existing histogram plotting tools. Also, usingnp.histogramdd
seems easier.The output is just an array of floats and the documentation doesn't say which units the density is in. These are not the normal units of MDAnalysis (amu for mass, Angstrom for length, e for charge) so they need special documentation. Also, the units written in the header of the file output are inaccurate for the charge. See #2507 for more
Describe the solution you'd like
Move the code below to
_single_frame()
so it updates.Call them
.mass_density
and.charge_density
in their own NumPy arrays. Have the results as Numpy arrays instead of dictionaries.Save the bins
Not change the units.
Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.
Additional context Happy to do this at some point, if people are ok with breaking scripts that have coded the original
.results['x']['pos']
indexing into them