MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.3k stars 648 forks source link

ChainReader broken for TRZ format #4042

Open richardjgowers opened 1 year ago

richardjgowers commented 1 year ago

Something I've noticed while tinkering with ChainReader and transformations. I think this is because TRZ makes seeks relative to the current state of ts.frame, which is misleading inside a chained trajectory

Code to reproduce the behavior

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import TRZ_psf, TRZ

def test_trz_iteration():
    u_ref = mda.Universe(TRZ_psf, TRZ)
    n_frames_ref = len(u_ref.trajectory)

    u = mda.Universe(TRZ_psf, [TRZ, TRZ])
    i = 0
    for ts in u.trajectory:
        i += 1

    assert i == 2 * n_frames_ref

....

raises

self = <TRZReader /home/richard/code/mdanalysis/testsuite/MDAnalysisTests/data/trzfile.trz with 6 frames of 8184 atoms>
nframes = -6

    def _seek(self, nframes):
        """Move *nframes* in the trajectory

        Note that this doens't read the trajectory (ts remains unchanged)

        .. versionadded:: 0.9.0
        """
        framesize = self._dtype.itemsize
        seeksize = framesize * nframes
        maxi_l = seeksize + 1

        if seeksize > maxi_l:
            # Workaround for seek not liking long ints
            # On python 3 this branch will never be used as we defined maxi_l
            # greater than seeksize.
            framesize = long(framesize)
            seeksize = framesize * nframes

            nreps = int(seeksize / maxi_l)  # number of max seeks we'll have to do
            rem = int(seeksize % maxi_l)  # amount leftover to do once max seeks done

            for _ in range(nreps):
                self.trzfile.seek(maxint, 1)
            self.trzfile.seek(rem, 1)
        else:
            seeksize = int(seeksize)

>           self.trzfile.seek(seeksize, 1)
E           OSError: [Errno 22] Invalid argument

../../package/MDAnalysis/coordinates/TRZ.py:351: OSError

Current version of MDAnalysis

IAlibay commented 1 year ago

honestly at this rate we should just drop the TRZ reader for v3?

richardjgowers commented 1 year ago

Yeah I doubt anyone is actually using it in anger

richardjgowers commented 1 year ago

I do think this is more symptomatic of a class of potential bug where if a Reader is relying on information stored inside Timestep they might make mistakes when inside a ChainReader, since this is breaking the assumption that nobody else is editing the Timestep. I might do a crawl through existing readers and see if this assumption is being made elsewhere.

orbeckst commented 1 year ago

The only upside is that it crashes...

(Also seeing that there's still Python 2 code inside TRZ indicates that it might not be the most looked at code...)