Closed pgbarletta closed 7 years ago
It came handy to let visualization programs know they were reading a trajectory:
I initially used the END
record, because this is what VMD does for trajectories in PDB format. I don't know about the MODEL record, is it intended for use with trajectories ?
Do you have an example of a visualization program supporting the MODEL record, but not multiple END record? As long as this change does not break VMD (which is widely used in the biomolecular community), I am fine with it!
If we implement this for writing PDB, we should also add support for it when reading PDB files.
The idea would be to replace
CRYST1 ...
ATOM ...
END
CRYST1 ...
ATOM ...
END
CRYST1 ...
ATOM ...
END
by something like this:
MODEL 1
CRYST1 ...
ATOM ...
ENDMDL
MODEL 2
CRYST1 ...
ATOM ...
ENDMDL
MODEL 3
CRYST1 ...
ATOM ...
ENDMDL
Am I right ?
If so, we don't need to pass an additional modeln parameter: we can store it in the PDBFormat class itself (it is called step in some other formats), or use the Frame::step
value for this.
The result is a PDB format without an END line, but that doesn't cause any trouble, and if necessary, it's easy to fix.
Yeah, we could add a Format::finalize
function and reflect it to Trajectory
if needed. Or just call it from the Trajectory destructor.
I initially used the END record, because this is what VMD does for trajectories in PDB format. I don't know about the MODEL record, is it intended for use with trajectories ?
It was meant for RMN experiments that sampled different conformations. That's why they call it MODEL
instead of, for example STEP
. They see it as alternative structures instead of dynamics.
Do you have an example of a visualization program supporting the MODEL record, but not multiple END record?
Yes, Pymol stops reading as soon as it reads an END
line.
As long as this change does not break VMD (which is widely used in the biomolecular community), I am fine with it!
Yeap. I'm reading these files with both VMD and Pymol.
Good then, I'll start looking into it.
Is this fixed or do we need to do something else here?
Yeap. Done.
Sorry, found something new.
With the last mods, Chemfiles only recognizes trajectories if there's an END
record after each frame. That's why the test works, but a properly formatted PDB file (with ENDMDL
records aftear each frame and only one END
record at the end of the file) with numerous frames is read as a 1 frame file.
As a consequence, Chemfiles can write a whole trajectory in a PDB file and read that same file and only get the first frame.
I can't find the bug in the source code.
As an example: https://gist.github.com/pgbarletta/fb63b8263ca898914af3ba409bcaf90d
I've been using a modified write function for a while. It came handy to let visualization programs know they were reading a trajectory:
void PDBFormat::write_model(const Frame& frame, const size_t model_n);
The only difference being that each time the first and last lines would be:
fmt::print(*file_, "MODEL {:4d}\n", model_n);
fmt::print(*file_, "ENDMDL\n\n");
The result is a PDB format without an
END
line, but that doesn't cause any trouble, and if necessary, it's easy to fix. I know it's a scrappy implementation, but I think it's a useful feature. To make it nicer, I was thinking of overloading thewrite
function for the PDB format:void PDBFormat::writel(const Trajectory& trajectory);
What do you think @Luthaf?