openmm / openmm-ml

High level API for using machine learning models in OpenMM simulations
Other
81 stars 25 forks source link

Question on PBC and how it relates to neighbour lists #88

Closed JSLJ23 closed 1 month ago

JSLJ23 commented 1 month ago

Not really an issue but more of a question on PBC and how it affects forming neighbour lists.

Taking a look at the MACEForce code I was trying to understand how the code in the _getNeighborPairs() works

neighbors, wrappedDeltas, _, _ = getNeighborPairs(
    positions, self.model.r_max, -1, cell
)
mask = neighbors >= 0
neighbors = neighbors[mask].view(2, -1)
wrappedDeltas = wrappedDeltas[mask[0], :]

edgeIndex = torch.hstack((neighbors, neighbors.flip(0))).to(torch.int64)
if cell is not None:
    deltas = positions[edgeIndex[0]] - positions[edgeIndex[1]]
    wrappedDeltas = torch.vstack((wrappedDeltas, -wrappedDeltas))
    shiftsIdx = torch.mm(deltas - wrappedDeltas, torch.linalg.inv(cell))
    shifts = torch.mm(shiftsIdx, cell)
else:
    shifts = torch.zeros((edgeIndex.shape[1], 3), dtype=self.dtype, device=positions.device)

specifically the getNeighborPairs() function from NNPOPs and what the final shifts actually represent. I understand that in the FAQ on PBC, it was mentioned that

Periodic boundary conditions are applied to displacements, not positions.

but I don't quite understand how applying it only to the displacement vectors would properly allow for the neighbour list to be built.

My questions are namely:

  1. When OpenMM is running the actual MD, are the positions of the solute (say protein for example) just moving through free space without any "wrapping"? Hence is the solute intact for the entirety of the simulation?
  2. In the context of NNPs (MACE and others), why is there the need for the wrappedDeltas if the PBC isn't applied to the positions during the simulation? (pardon this question if I am misunderstanding something regarding PBC here).
  3. What exactly are the wrappedDeltas and shifts physically representing in the context of MACE in an OpenMM simulation?

Thank you for the time taken to address my queries.

peastman commented 1 month ago

Suppose your periodic box is a cube of size 10. Consider two particles at positions (1, 0, 0) and (9, 0, 0). If you just subtract, you'll get a delta between them of (8, 0, 0), indicating they're 8 units apart. But that doesn't take periodic boundary conditions into account. It applies a shift of (-10, 0, 0), one periodic box vector, to get a much closer wrapped delta of (-2, 0, 0). They're really only 2 units apart.

When OpenMM is running the actual MD, are the positions of the solute (say protein for example) just moving through free space without any "wrapping"?

Conceptually, yes. In terms of the implementation, it depends on the platform. CUDA and OpenCL translate things into a single periodic box to keep the coordinates from getting large. That's necessary to preserve accuracy when working in single precision. But they keep track of how many boxes they've translated each coordinate by and put them back when you query the coordinates. That's an implementation detail.

JSLJ23 commented 1 month ago

But they keep track of how many boxes they've translated each coordinate by and put them back when you query the coordinates.

  1. Does this mean that when you query the coordinates, the "unwrapped" and translated solute coordinates are returned?
  2. When the TorchForce model "sees" the positions, does it see the "wrapped" single periodic box coordinates or the "unwrapped" translated coordinates?
peastman commented 1 month ago

Does this mean that when you query the coordinates, the "unwrapped" and translated solute coordinates are returned?

Yes, you will only see the unwrapped ones. The wrapping is an internal implementation detail.

When the TorchForce model "sees" the positions, does it see the "wrapped" single periodic box coordinates or the "unwrapped" translated coordinates?

That depends on the platform. But again, it's an implementation detail. The model applies periodic boundary conditions to the displacements. That comes out the same regardless of any translations applied to the input coordinates.

JSLJ23 commented 1 month ago

Suppose your periodic box is a cube of size 10. Consider two particles at positions (1, 0, 0) and (9, 0, 0). If you just subtract, you'll get a delta between them of (8, 0, 0), indicating they're 8 units apart. But that doesn't take periodic boundary conditions into account. It applies a shift of (-10, 0, 0), one periodic box vector, to get a much closer wrapped delta of (-2, 0, 0). They're really only 2 units apart.

From this example I don't quite understand how applying the PBC will give the same result regardless of whether the coordinates are wrapped.

So if in MD software A, the coordinates of the particles are described as (1, 0, 0) and (9, 0, 0), then yea it does make sense applying the (-10, 0, 0) PBC would give a displacement of (-2, 0, 0).

But in another MD software what if the particles were represented as (11, 0, 0) and (9, 0, 0) and the displacement was already (-2, 0, 0), won't applying the PBC give a different outcome?

peastman commented 1 month ago

It doesn't matter whether the positions are (11, 0, 0) and (9, 0, 0), or (131, -100, 0) and (-71, 0, 30). Once you subtract and apply periodic boundary conditions to the difference, you'll end up with an offset of (-2, 0, 0).

JSLJ23 commented 1 month ago

Could you further elaborate on the underlying criteria which decides on the extent of subtraction? For the (11, 0, 0) and (9, 0, 0) example, just taking (9, 0, 0) - (11, 0, 0) would give the intended displacement of (-2, 0, 0). But for (-71, 0, 30) - (131, -100, 0), it would give (-202, 100, -30), which would require adding 20× the periodic vector in the x direction, subtracting 10× in the y direction and adding 3× in the z direction. Is the adding and subtraction done till the magnitude of the displacement is the smallest?

peastman commented 1 month ago

For a rectangular box, the shift is simply -round(delta/width)*width. For example, -round(-202/10)*10 = 20*10 = 200. Or -round(-30/10)*10 = 3*10 = 30. For a triclinic box the calculation is slightly more complicated, but only slightly.

JSLJ23 commented 1 month ago

Super helpful, thank you! No trouble to explain the triclinic box calculation, a link to the code implementation would be great too.

peastman commented 1 month ago

This is the calculation for triclinic boxes.

JSLJ23 commented 1 month ago

Much appreciated!

JSLJ23 commented 1 month ago

@peastman I just had another question on PBC and why it does not apply to bonded interactions.

CUDA and OpenCL translate things into a single periodic box to keep the coordinates from getting large.

Periodic boundary conditions are applied only to nonbonded forces, not bonded ones.

If the implementation translates the atomic positions to a single periodic box, won't the bonds between the solute atoms potentially cross the PBC and get wrapped? Hence won't they require the PBC application?

peastman commented 1 month ago

It translates each molecule as a unit. It never breaks them up.

JSLJ23 commented 1 month ago

But they keep track of how many boxes they've translated each coordinate by and put them back when you query the coordinates.

So by this it means that the entire solute has to "cross" the periodic box before the underlying logic records that it has translated by +1 boxes?

peastman commented 1 month ago

Each molecule is translated based on the position of its center.