Closed kain88-de closed 6 years ago
The headache I can see with this is for trajectory writers that require more than the Reader contains, (eg PDBWriter
will want resnames, DCDReader
doesn't have these). To make this work (with identical API) for all cases, Universe
should be passed around as the 'thing' to be written. I'm fairly certain that Readers don't have a link to the Universe currently, so this will need to be added (in a way that doesn't cause circular reference problems with deleting).
Well we would also write a more generic write/copy function in the MDAnalysis namespace that wraps MDAnalysis.Writer
def write(ag, fname):
with mda.Writer(fname, n_atoms=ag.n_atoms) as w:
for ts in ag.universe.trajectory:
w.write(ag)
# call it with
mda.write(mda.Universe(PSF, DCD), 'test.pdb))
That is easier to implement and would allow to write sub selections.
To be more specific and avoid confusions we could also name the generic function write_trajectory
.
mda.write()
seems like a cheap and rather useful function.
I still like the concise semantics of u.trajectory.write("new.xtc")
. I haven't really thought it through but my feeling is we should avoid adding the Universe
to the Reader. We could start by just raising an error if we have insufficient information and then tell the user to use MDAnalysis.write(universe, "new.xtc")
instead. That would still be pretty clean.
We could start by just raising an error if we have insufficient information
The writers will already complain them self if a TimeStep object isn't enough with an exception. So that would just work.
I think write_trajectory
might be a better name, Writer.write
currently just does a single frame, it would be bad if trajectory.write
got put in a for ts in u.traj
loop
The way that people will commonly access this method as is u.trajectory.write()
. It really feels redundant to me to repeat "trajectory" in the method name again as u.trajectory.write_trajectory()
. I think here we can take advantage of the OO design in that it also provides semantic clarity.
(And in any case, we are already overloading AtomGroup.write()
to either write coordinates or selections.)
Oliver Beckstein email: orbeckst@gmail.com
Am Oct 20, 2016 um 3:15 schrieb Richard Gowers notifications@github.com:
I think write_trajectory might be a better name, Writer.write currently just does a single frame, it would be bad if trajectory.write got put in a for ts in u.traj loop
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.
I like u.trajectory.write("new.xtc")
and agree with @orbeckst in that it's semantically clear and succinct. For a trajectory writing function in the MDA namespace, MDAnalysis.write_trajectory(universe, "new.xtc")
makes sense to me.
atoms
kwarg@richardjgowers asked what to do when a writer needs more information than available to the Reader https://github.com/MDAnalysis/mdanalysis/issues/1037#issuecomment-254771913. I suggest to add an optional kwarg atoms
to Reader.write(filename, atoms=<AtomGroup-like>)
that can be used to extract all the necessary additional information.
The behavior should be:
atoms
atoms
is provided, only write those atoms as a subselection. This will allow for a very concise way to strip trajectories of water: u.trajectory.write("protein_only.xtc", atoms=u.select_atoms("protein"))
.atoms
provided: raise a ValueError
and tell the user to supply the atomgroup or Universe
in atoms
(e.g., u.trajectory.write("new.xtc", atoms=u)
)atoms
is provided atoms.positions == Reader.ts.positions
)Using the atoms
argument to write subselections is in line with @kain88-de 's idea https://github.com/MDAnalysis/mdanalysis/issues/1037#issuecomment-254782831 .
And it would be nice if we could also have an optional progress meter #928 (quiet=False
, default quiet=True
... or verbose=False
once we solve #903).
EDIT: Changed behavior to use atoms
for subselections if provided.
I updated the original comment with the specs. Hopefully that makes it easier to implement.
Differences to what @kain88-de originally proposed
atoms
kwarg (we need it anyway in many cases)start
, stop
, step
kwargsReader.write()
and not Universe.write()
because the latter is ambiguous, especially in the context of serializing whole universes #643.I am not sure if the following is a good idea but I'll throw it out nevertheless:
u.trajectory[start:stop:step].write("new.dcd")
It would mean overloading the Generator object. Probably not a very pythonic thing to do.
Conceptually, the simplicity and readability of that slice construct seem quite pythonic to me. I really like it! Then again, I'm not very shy when it comes to hacking the python object model...
Like the
ag.write
it would be convenient to have atrajectory.write
method. This method would always write the complete trajectory with all atoms. To write only a subset the more generalWriter
still has to be used. This makes conversion of trajectory formats or concatenating trajectories very easy.It is also a good fit to write a MemoryReader back to disk.
Proposed API
All trajectory readers get a new method
Reader.write()
that allows one to write the trajectory to disk (possibly in a different format):The semantics should evoke the idea that a trajectory is considered the object to be written.
Optional slicing kwargs
kwargs to determine which frames to write
start
stop
step
To be used as
Optional
atoms
kwargWhat to do when a writer needs more information than available to the Reader https://github.com/MDAnalysis/mdanalysis/issues/1037#issuecomment-254771913 ? Add an optional kwarg
atoms
toReader.write(filename, atoms=<AtomGroup-like>)
that can be used to extract all the necessary additional information.The behavior should be:
atoms
is provided, only write those atoms as a subselection. This will allow for a very concise way to strip trajectories of water:u.trajectory.write("protein_only.xtc", atoms=u.select_atoms("protein"))
.atoms
provided: raise aValueError
and tell the user to supply the atomgroup orUniverse
inatoms
(e.g.,u.trajectory.write("new.xtc", atoms=u)
)atoms
is provided: write the trajectory for the given atoms.Using the
atoms
argument to write subselections is in line with @kain88-de 's idea https://github.com/MDAnalysis/mdanalysis/issues/1037#issuecomment-254782831 .Progress meter
And it would be nice if we could also have an optional progress meter #928 (
quiet=False
, defaultquiet=True
... orverbose=False
once we solve #903).Basic implementation
Implementation could look similar to the following (without all the detailed error checking and checking for
atoms
and no ProgressMeter):History