General way to create trajectory

anabiman commented 3 years ago

We need a general and robust way of creating an MDAnalysis trajectory object from MMSchema for TrajToMdaComponent.

Related issues:

Can trajectories in MDAnalysis handle time-dependent topologies? It seems topology is hard-wired to be static by design.
It seems Universe._trajectory is a pointer to a specific traj reader subclass in coordinates.

anabiman commented 3 years ago

@lilyminium @IAlibay

IAlibay commented 3 years ago

Can trajectories in MDAnalysis handle time-dependent topologies? It seems topology is hard-wired to be static by design.

Unfortunately not. Bonds/angles/dihedrals/grouping (i.e. residues) don't have to be set (like for XYZ inputs), but the number of atoms is static.

It seems Universe._trajectory is a pointer to a specific traj reader subclass in coordinates.

Yes that is the case, traversing through a trajectory essentially amounts to iteratively loading coordinates via the reader that the trajectory attribute is connected with (@lilyminium may be better able to expand on this).

lilyminium commented 3 years ago

Yeah that's right. Pointing to the next frame pretty much only populates the .positions attribute (and possibly velocities and forces).

anabiman commented 3 years ago

Is there any reason a general trajectory class doesn't exist in MDA? A class that can parse any specific traj file format and describes a trajectory in molecular mechanics?

lilyminium commented 3 years ago

Each format has different info and parsing needs, hence the different parser for each format. The "class that can parse any specific traj file format" role is mostly fulfilled by MDAnalysis.core.Universe, which looks for the correct parser for the file format. @orbeckst and @richardjgowers may have more insight into the historical evolution of the structure of MDAnalysis.

orbeckst commented 3 years ago

The philosophy (coming from MMTK originally) was that for a system to make sense you need static information ("topology") and dynamic information (coordinates, velocities, forces, lambdas, .... = "trajectory"). The Universe combines them. Ultimately, MDAnalysis takes the view that a trajectory is something that you add to a topology.

We don't need a generic trajectory class because trajectory "Readers" are defined by the Trajectory API (which is mostly implemented by deriving specialized readers from the coordinates.base classes.

As @lilyminium said, the decision on which Reader to choose is taken by Universe (actually, through a hacky function coordinates.core.get_reader_for() and metaclass magic that registers any Reader... but @richardjgowers needs to explain this).

I suppose one could wrap the part that selects the reader into yet another class but for MDAnalysis that does not seem to have any advantages, but I might be wrong.

Can you explain your philosophy and use case, @Andrew-AbiMansour ?

orbeckst commented 3 years ago

By the way, it is possible to create a bare-bones Universe from a trajectory alone:

import MDAnalysis as mda; from MDAnalysisTests import datafiles as data
mda.Universe(data.XTC)
mda.Universe(data.TRR)
mda.Universe(data.XYZ)
mda.Universe(data.DCD)

It will just create sequentially numbered atoms but you don't need to supply a topology if you don't have one.

Conversely, if you only have a topology, you can add the trajectory later:

u = mda.Universe(data.PSF)
u.load_new(data.DCD)

anabiman commented 3 years ago

Thanks for the info. The problem I'm trying to solve in the context of mmic_mda is converting efficiently MMElemental.Trajectory to MDanalysis.Universe that stores this trajectory in a way that is agnostic to any specific file format.

The MMelemental.Trajectory class attempts to provide a general definition of trajectory in classical mechanics, so that this class can handle both static and dynamic topologies:

class Trajectory(ProtoModel):
    top: Optional[Union[Molecule, List[Molecule]]] = Field(
        None,
        description="A single or list of  :class:``Molecule`` objects representing the molecular topology.",
    )
    frames: List[Frame] = Field(
        None, description="A list of :class:``Frame`` objects of length nframes."
    )

frames handles temporal quantities like positions, velocities, ... while top can be optionally used to load the topology.

Ideally, mmic_mda would convert MMElemental.Trajectory to MDA.Universe by instantiating an empty universe and then loading a trajectory object as you pointed out @orbeckst , but how can we do this in a way that does not require specific conversion from Trajectory to DCD, TRR, etc. readers in MDA? In other words, is there a way to create an MDA trajectory from a "generic" data object such as a python dictionary?

orbeckst commented 3 years ago

Yes, use the MemoryReader, which takes a numpy array. https://docs.mdanalysis.org/stable/documentation_pages/coordinates/memory.html

orbeckst commented 3 years ago

Maybe that wasn't super-clear but if you provide a numpy array in u.load_new(buffer) then this should automatically create an in-memory representation of the trajectory.

Is MMElemental.Trajectory an in-memory data structure or linked to a file?

IAlibay commented 3 years ago

class can handle both static and dynamic topologies

This is a little bit concerning though. If the topology changes you would have to re-create the Universe.

orbeckst commented 3 years ago

I think conversion to MDA should fail with a ValueError (or NotImplementedError) if the topology is not static. It's just not supported at the moment 😞 .

anabiman commented 3 years ago

Time-dependent/dynamic topologies not being supported in MDA is not a problem, we can just raise a NotImplementedError as @orbeckst suggested.

Is MMElemental.Trajectory an in-memory data structure or linked to a file?

Could be either. By design, MMElemental is completely agnostic to any MD file format, but for convenience, it can (in runtime) provide read/write capabilities indirectly using mmic_translator, which under the hood would select a specific translator (such as mmic_mda) to read/write a specific file format. The MemoryReader you suggested @orbeckst is exactly what I needed. Thanks!

I will keep this issue open until the traj converter is completed in mmic_mda. Thanks for your input everyone.

MolSSI / mmic_mda

General way to create trajectory #2