Amber-MD / cpptraj

Biomolecular simulation trajectory/data analysis.
Other
137 stars 64 forks source link

autoimage does not image familiar when no box info in parm #171

Closed swails closed 8 years ago

swails commented 8 years ago

At least that's what I suspect. This is a reminder for myself to look at it more in-depth. Here's what I observed: I have a PDB file serving as my topology:

$ cpptraj -p 2igd_solvated.pdb -y 2igd_solvated.md2.nc 

CPPTRAJ: Trajectory Analysis. V16.00b
    ___  ___  ___  ___
     | \/ | \/ | \/ | 
    _|_/\_|_/\_|_/\_|_

| Date/time: 11/09/15 16:14:27
| Available memory: 5565.2 MB

    Reading '2igd_solvated.pdb' as PDB File
Warning: PDB line length is short (67 chars, expected 80).
    Determining bond info from distances.
Warning: 2igd_solvated.pdb: Determining default bond distances from element types.
    Reading '2igd_solvated.md2.nc' as Amber NetCDF
Warning: Trajectory box type is 'Non-orthogonal' but topology box type is 'None'.
Warning: Setting topology box information from trajectory.
    Loading previous history from log 'cpptraj.log'
> list

INPUT TRAJECTORIES:
 0: '2igd_solvated.md2.nc' is a NetCDF AMBER trajectory, Parm 2igd_solvated.pdb (Non-orthogonal box) (reading 506 of 506)
  Coordinate processing will occur on 506 frames.

PARAMETER FILES:
 0: 2igd_solvated.pdb, 21180 atoms, 6840 res, box: Non-orthogonal, 6780 mol, 6737 solvent

DATASETS:
  1 data set:
    2igd_solvated.pdb "2igd_solvated.pdb" (topology), size is 21180 2igd_solvated.pdb, 21180 atoms, 6840 res, box: Non-orthogonal, 6780 mol, 6737 solvent
>          

Not sure why it says box type is None for topology in warning, but Non-orthogonal when queried via list. Also, if I use autoimage and trajout the result, I get the triclinic imaging rather than truncated octahedron which indicates "familiar" is turned off.

I can investigate further if need be, but I suspect that "familiar" should get set from a trajectory unit cell rather than a topology one (or, rather, the unit cell shape from the reference structure rather than the topology). Of course, I can turn familiar on by hand so this is a low-priority thing as the workaround is trivial.

drroe commented 8 years ago

This is strange; recent updates to cpptraj have made it so that box info in the Topology class is almost never used (it's only needed for Amber ASCII trajectories without angles and for writing BOX_DIMENSIONS to Amber topology files). The real problem here is this:

0: '2igd_solvated.md2.nc' is a NetCDF AMBER trajectory, Parm 2igd_solvated.pdb (Non-orthogonal box) (reading 506 of 506)

For some reason the box is not being detected as a truncated octahedron. I did a quick test to make sure truncated octahedron box detection is working OK when using a PDB as a topology and it seems fine (converted tz2.truncoct.parm7 and first frame of tz2.truncoct.nc from the test/ subdirectory to PDB and then used that with the NetCDF trajectory):

> list

INPUT TRAJECTORIES:
 0: 'tz2.truncoct.nc' is a NetCDF AMBER trajectory, Parm temp.pdb (Trunc. Oct. box) (reading 10 of 10)
  Coordinate processing will occur on 10 frames.

PARAMETER FILES:
 0: temp.pdb, 5827 atoms, 1882 res, box: Trunc. Oct., 1870 mol, 1869 solvent

What is the output of ncdump -v cell_angles 2igd_solvated.md2.nc? It should be something like:

 cell_angles =
  109.471219, 109.471219, 109.471219,
  109.471219, 109.471219, 109.471219,

etc

swails commented 8 years ago

This is what I get:

 cell_angles =
  109.47, 109.47, 109.47,
  109.47, 109.47, 109.47,
  109.47, 109.47, 109.47,
  109.47, 109.47, 109.47,

The reason it's truncated is because the angles were taken from a PDB file (which has 7.2F precision). I knew why cpptraj did not recognize the box as trunc. oct., and there's nothing wrong with that, but I was under the impression that familiar was turned on automatically for any non-orthogonal cell (rather than just for truncated octahedron). Am I wrong there?

drroe commented 8 years ago

Since truncated octahedron is a specific kind of a non-orthogonal cell, 'familiar' in autoimage is only automatically turned on for that, and for every other non-orthogonal cell you just do the general triclinic imaging. The intent of the automatic 'familiar' is just to preserve the look of the input coordinates (so if it comes in looking trunc. oct. it gets written out trunc. oct.).

This seems to just be a case of the truncated octahedron detection in Box.h perhaps not being "flexible" enough. Right now it's very simple:

static inline bool IsTruncOct(double angle) {
  if (angle > 109.47 && angle < 109.48) return true;
  return false;
}

Maybe this should be changed so the boundaries are TRUNCOCTBETA +/- some constant (like 0.002) or change the > to >= (even though using floating point equivalence is frowned upon).

drroe commented 8 years ago

or change the > to >= (even though using floating point equivalence is frowned upon).

Or maybe just change > 109.47 to > 109.469999 or something...

swails commented 8 years ago

The sensible thing is to compare to some delta I think. .015 is probably a good pick here seeing as pdbs are limited to 2 decimals in the CRYST1 field.

drroe commented 8 years ago

015 is probably a good pick here seeing as pdbs are limited to 2 decimals in the CRYST1 field.

I think I will make it even tighter than that. For a perfect truncated octahedron 2 decimals mean that the value will be truncated down to 109.47 (as is the case in your system). So if I just do +/- 0.001220 the range will be 109.469999 to 109.472439 which is still quite tight while detecting truncated values. Furthermore, when the box detected is truncated oct. then I will replace the angles with "perfect" values, so that when writing to formats with higher precision (like Amber restart) you won't be hampered by poor values (this is what PTRAJ does I believe). Thoughts?

swails commented 8 years ago

:+1:

drroe commented 8 years ago

OK, so everything seems to work well, but there are numerical differences I'd like to get your opinion on. Specifically, in cpptraj the angle for a truncated octahedron comes out to be 109.471221, whereas in structures from leap the value is 109.471219. I think this is because leap uses a single precision value for PI (3.1415927) while cpptraj uses full double precision (3.141592653589793). This has small effects on things like imaging, which are noticeable in higher precision trajectory formats like NetCDF. For example, here are some diffs from an Amber topology and NetCDF file due to this change:

  reorder.HOH_wat.parm7.save reorder.HOH_wat.parm7 are different.
65391c65391
<   1.09471219E+02  7.10350000E+01  7.10350000E+01  7.10350000E+01
---
>   1.09471221E+02  7.10350000E+01  7.10350000E+01  7.10350000E+01
  nc0 nc1 are different.
5350,5351c5350,5351
<   29.84577, 50.79215, 68.50934,
<   28.1801, -2.705294, 45.74434,
---
>   29.84577, 50.79214, 68.50933,
>   28.1801, -2.705297, 45.74434,

My inclination is to just go ahead and make the change, especially since it will be very good when reading in from lower precision formats, but I'd appreciate any feedback.

swails commented 8 years ago

I'm actually starting to rethink this. Here's my situation:

I built a solvent box in tleap, and wrote a PDB file because, well, that's the only file that every program out there knows how to run with. The CRYST1 record was written with the 109.47 values for unit cell angles, since that's what fits in the space allotted to it.

However, I took that box and ran with it in OpenMM. Which means that I'm not really running a true truncated octahedron -- I'm running something that's very close. If cpptraj implicitly changes this, then cpptraj will not be computing images the same way that OpenMM does.

Same thing applies for cpptraj and sander -- unless cpptraj uses exactly the same values for the angles that sander and pmemd do, its imaging will be ever so slightly different (and that difference I would expect to be magnified when wrapping or unwrapping a very long trajectory).

In this case, cpptraj is best off not guessing.

But on the flipside, if someone runs a calculation with high-precision input (like Amber inpcrd/restart) and generates a low-precision trajectory (like PDB), then the box that you read in from the trajectory will be wrong compared to what's really being used by the simulation package. In which case the "correct" thing to do would be to use the truncated octahedron dimensions. But that said, the coordinates are probably equally low-precision, so perhaps that's not terribly important...

So things I think we need to quantify before we decide what to do:

Take a very long wrapped trajectory (microsecond-scale) with a truncated octahedron. Unwrap it with the exact box dimensions defined by the trajectory (NetCDF preferably). Then wrap it setting the angles to 109.47. What is the net distance of each atom wrt its "new" position in this re-wrapped trajectory? This will give us an estimate of what kind of uncertainty range we're dealing with here. If it's even on the order of an Angstrom, I think that's bad.

swails commented 8 years ago

I don't think either of the scenarios I described is "rare". My issue is that the first one does not really have a workaround -- if your box is close to a trunc. oct., cpptraj changes it and nothing can be done about it.

In the latter situation, the workaround is to use a less-stupid trajectory format with higher precision.

drroe commented 8 years ago

If cpptraj implicitly changes this, then cpptraj will not be computing images the same way that OpenMM does.

Yeah, I thought about this. My tentative solution was to only change to the internal truncated octahedron angle if the delta from the "real" truncated octahedron angle is greater than some epsilon.

swails commented 8 years ago

I think my vote on this is to leave it alone. Maybe make recognizing truncated octahedron more flexible, but don't substitute in cpptraj's calculated t.o. angles if it is recognized.

What do you think?

drroe commented 8 years ago

I think I've been making this more complicated than it needs to be. I agree with you that cpptraj should not be in the business of automatically "fixing" things like unit cells - what you feed into cpptraj is what you get. Also, there is already a mechanism for "fixing" the angles if you want, the box command. So what I'm going to do is fix up box so that low precision truncated octahedron angles can be detected but print a warning informing users that they have the option of using higher-precision angles via the box command. Seems like the best of all worlds.

swails commented 8 years ago

Sounds good to me

drroe commented 8 years ago

Should be addressed by #173