MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.32k stars 652 forks source link

Inconsistency between LAMMPS cell origin and coordinates #3741

Open PicoCentauri opened 2 years ago

PicoCentauri commented 2 years ago

Is your feature request related to a problem?

Simulation cells in MDAnalysis (MDA) are defined via the dimensions attribute defining the cell via 3 lengths and 3 angles. This implies that the cell origin is always at (0,0,0).

LAMMPS defines its simulation cells via a lower and an upper coordinate, allowing the cell origin not to be at (0,0,0). The LAMMPS approach allows for negative coordinates still in the primary unit cell.

MDA parse LAMMPS cells and converts the box vector according to

https://github.com/MDAnalysis/mdanalysis/blob/c01d82bb86b480c8b2ae87f36497e7fc0a726c84/package/MDAnalysis/topology/LAMMPSParser.py#L555-L561

shifting the cell origin to (0,0,0). However, in MDA the cell origin is moved to (0,0,0) but the coordinates are not. This is an inconsistency for example that negative coordinates are not in the primary unit cell anymore.

EDIT: I would like to give a practical example: If I have an unwrapped data file (molecules are whole) and I would like to histogram the center of masses along an axis. For the histogram I have to know the range of the box which goes from 0 to L. Whole molecules outside the box, with a negative, center of mass will not be considered even though they are in the primary unit cell in LAMMPS.

Describe the solution you'd like

Sift coordinates like the cell origin by the lower cell coordinate. See this commit for an idea.

Describe alternatives you've considered

Shift coordinates by 'hand' every time loading a LAMMPS data file with negative coordinates.

polly-code commented 2 years ago

Hi @PicoCentauri, could you please elaborate on why you need this feature for the unwrapped coordinates? If I understood correctly, you performed simulations with PBC and unwrapped the molecules' coordinates during the simulations. In this case, you cannot refer to the box size anymore. If I misunderstood, could you please provide one more practical example?

PicoCentauri commented 2 years ago

Hey @polly-code, yes I can give it a try.

Imagine the following system of a single particle type in a cubic box ranging from -0.5 Å to 0.5 Å. When reading the trajectory the cell origin is shifted to (0,0,0) whereas the positions are leaved untouched. This leads to atoms outside of the cell which are within the primary cell within the simulation.

Now I want to perform a spatial analysis. Lets say a density calculation across one dimension of the cell. For histogramming my positions I should provide the limits of my histogram. Looking at the dimensions of the cell with for example u.dimensions would return [1,1,1,90,90,90]. Following this I would provide limits ranging from (0,1). The resulting density would be correspond to the density in the simulation from 0 to 0.5 and a zero density from 0.5 to 1. This is consequence of of shifting the cell origin to 0 but not the coordinates.

Does this example help?

polly-code commented 2 years ago

Now it's clear to me, thanks! I was confused by the word unwrapped, sorry. Cool suggestion btw :)