glotzerlab / freud

Powerful, efficient particle trajectory analysis in scientific Python.
https://freud.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Box module: Calculate image vectors from trajectory #854

Open amayank-umich opened 2 years ago

amayank-umich commented 2 years ago

Is your feature request related to a problem? Please describe. HOOMD calculates and provides image data, but I am observing that many simulation packages do not provide image data even though they employ periodic boundaries. For a lot of freud analyses, this creates a bottleneck for non-HOOMD simulations. It would be nice to have a subroutine that calculates the image vectors from the position data alone.

Describe the solution you'd like The algorithm runs through particle positions frame by frame. For each particle, it checks whether the particle suddenly jumps a large distance (more than half the box length); if so, the image vector is updated accordingly.

Python code I am currently using:

import numpy as np

def find_box_images(positions, unitcell_lengths):
    """
    Go frame by frame from the start and identify particle box images
    by the sudden shift in coordinates.
    """
    num_frames = positions.shape[0]
    # integer image vectors, one per particle per frame
    images = np.zeros(positions.shape, dtype=int)
    for f in range(1, num_frames):
        images[f] = images[f - 1]
        for axis, L in enumerate(unitcell_lengths[f]):
            jump = positions[f, :, axis] - positions[f - 1, :, axis]
            # a jump larger than half the box length means the particle
            # crossed a periodic boundary along this axis
            images[f, jump < -L / 2, axis] += 1
            images[f, jump > L / 2, axis] -= 1
    return images
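(Editorial note: the per-axis threshold tests above are equivalent, up to behavior at exactly half the box length, to rounding the displacement in fractional coordinates, which vectorizes over all three dimensions at once. A self-contained sketch of that variant, assuming orthorhombic boxes; `find_box_images_rounded` is a hypothetical name, not part of freud:)

```python
import numpy as np

def find_box_images_rounded(positions, unitcell_lengths):
    """Equivalent image inference via rounding of fractional displacements.

    positions: (n_frames, n_particles, 3) wrapped coordinates
    unitcell_lengths: (n_frames, 3) box edge lengths per frame
    """
    images = np.zeros(positions.shape, dtype=int)
    for f in range(1, positions.shape[0]):
        disp = positions[f] - positions[f - 1]
        # round(disp / L) is +/-1 when a particle jumps across a boundary
        images[f] = images[f - 1] - np.round(disp / unitcell_lengths[f]).astype(int)
    return images

# one particle crossing the +x face of a 10x10x10 box between two frames
pos = np.array([[[4.9, 0.0, 0.0]],
                [[-4.8, 0.0, 0.0]]])
box = np.full((2, 3), 10.0)
print(find_box_images_rounded(pos, box)[1, 0])  # -> [1 0 0]
```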
bdice commented 2 years ago

This is a frequently requested feature. I think it could be in scope for freud, but I would strongly recommend implementing it in C++ if it is included in the package, since all of the existing box methods call through to C++.

tommy-waltmann commented 2 years ago

Mayank,

Have a look at the get_images method of the Box class in freud. It takes an (N, 3) array of positions instead of a 3D array like in your example, but I think some calls to numpy.reshape before and after calling get_images should get you what you need.

If that doesn't suit your needs, we can talk more about why.

vyasr commented 2 years ago

Box.get_images takes a set of unwrapped vectors, which means the image data must already be encoded in the positions. For instance, in a 10x10 box you could pass the vector (90, 90, 0) and get back the image (9, 9, 0). What we need here is something that infers images from jumps in particle positions that always stay within the bounds of the box. For example, if a position goes from (9.9, 9.9, 0) to (0.1, 0.1, 0) between two frames, you can guess that it crossed a boundary.
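(Editorial note: a rough numpy illustration of the two situations described above, assuming a 10x10x10 orthorhombic box; the variable names are illustrative, not freud API:)

```python
import numpy as np

L = np.array([10.0, 10.0, 10.0])  # box edge lengths

# With unwrapped coordinates, the image is recoverable directly as the
# nearest-integer fractional coordinate (for a box centered at the origin):
unwrapped = np.array([90.0, 90.0, 0.0])
image = np.round(unwrapped / L).astype(int)          # -> [9 9 0]

# With wrapped-only data, the image change must be inferred from the jump:
prev = np.array([9.9, 9.9, 0.0])
curr = np.array([0.1, 0.1, 0.0])
crossing = -np.round((curr - prev) / L).astype(int)  # -> [1 1 0]
```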

This is definitely a commonly requested feature. I've always questioned whether it is in scope for freud because there is no way to guarantee correctness. It relies on the trajectory being dumped frequently enough that no particle moves more than half a box length between frames, which isn't always the case. Other freud algorithms can also produce incorrect results in the sense that a poorly sampled trajectory will give invalid RDFs etc., but those will be statistically incorrect rather than the calculation being simply wrong. Of course, we could take the view that garbage in is garbage out and push the responsibility to the user, but this API will have sharper edges than others in that respect. Just some thoughts to consider when deciding how to proceed.

bdice commented 2 years ago

@vyasr and I have discussed this before. I’m on the same page about caution near “sharp edges.” I think it would be fine if the limits to the method are documented.

I would suggest an API like Box.reconstruct_images(points [shape (N_frames, N_particles, 3)]). I think the word “reconstruct” makes it clear that the data must be sufficiently complete, in some sense. Look at the MSD module for an example of handling this shape of array, which is uncommon in freud.
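(Editorial note: a hypothetical sketch of what the proposed method's core could look like in numpy. The name `reconstruct_images`, its signature, and the constant-box assumption are all assumptions from this discussion, not freud API:)

```python
import numpy as np

def reconstruct_images(points, box_lengths):
    """Hypothetical reconstruct_images: infer images for a whole trajectory.

    points: (N_frames, N_particles, 3) wrapped positions
    box_lengths: length-3 box edges, assumed constant over the trajectory
    """
    points = np.asarray(points)
    if points.ndim != 3 or points.shape[-1] != 3:
        raise ValueError("expected points with shape (N_frames, N_particles, 3)")
    disp = np.diff(points, axis=0)  # frame-to-frame displacements
    # per-frame boundary crossings, per particle and dimension
    crossings = -np.round(disp / box_lengths).astype(int)
    images = np.zeros(points.shape, dtype=int)
    # image at frame f is the cumulative crossing count up to f
    images[1:] = np.cumsum(crossings, axis=0)
    return images
```

This mirrors the shape handling bdice describes: one call covers the whole (N_frames, N_particles, 3) trajectory, with the per-frame recursion folded into a cumulative sum.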

joaander commented 2 years ago

Development effort would be better spent updating tools that fail to provide the necessary image data rather than implementing a reconstruction feature in freud that will almost always produce incorrect results.

vyasr commented 2 years ago

@amayank-umich I'm curious, what simulation software are you using? In principle I agree with @joaander that we'd be better off getting all tools to provide proper image data, but I don't know if that's realistic in practice. To my knowledge, the landscape is mixed, but of the tools that I've used before the only ones that provide the data are LAMMPS, which allows dumping image data, and GROMACS, which supports dumping unwrapped positions. I assume that most tools would at least support dumping unwrapped positions?

amayank-umich commented 2 years ago

Sorry for the delayed response. I haven't been checking this account regularly.

@vyasr I am using GROMACS, but since I am dumping large quantities of data, I am not separately dumping unwrapped positions (to conserve space), and I do need the wrapped positions. However, it's not just the particle simulation software. A lot of people write their own particle-style models, and although they should, they don't necessarily dump images. Another case is when people track particles or custom blobs in a time series of experimental images. As particle analysis software, I feel freud should include a way to measure particle jumps across box boundaries.

@bdice @vyasr I concur with everything you said. I think our comfort level also depends on the semantics. If we call it getting box images, then we start thinking about the correctness and space/time resolution of the original simulation, but if we just think of it as measuring particle jumps across periodic boundaries, then getting the space/time resolution right is on the user.

@bdice Sorry Bradley, I don't have a C++ implementation; C++ is not my forte.
