Open taoliu032 opened 1 year ago
Hi @taoliu032, welcome to MDAnalysis and thanks for opening such a detailed issue. I suspect that the reason you're seeing this is because MDAnalysis loads data for each frame from the file as you access it. That means that any changes you make to the coordinates for a particular frame (e.g. the first frame) are then re-set after you iterate through the trajectory. The BAT.run()
(and any analysis class .run()
) does just such an iteration, so the coordinates are re-set.
There are a couple ways you can get the rotation to persist. The first is to load the trajectory completely into memory when you first create the universe:
u_example = mda.Universe(top_file, in_memory=True)
The second is to use on-the-fly transformations. These mean you don't need to load all your data into memory at once, but rather applies a transform on the coordinates of the atoms when you load each frame. These are slightly more complex -- here's an example of how I would think it would look, but the code is untested.
from MDAnalysis.transformations.rotate import rotateby
angle = 90
p = [1,2,3]
d = [0,1,2]
transform = rotateby(angle, direction=d, point=p, ag=dna_atoms)
u.trajectory.add_transformations(transform)
Hi @lilyminium, thanks a lot for the help! I tried the on-the-fly-transformation approach and it perfectly worked! Although I didn't test the first way because it is not practical for me to load the entire trajectory into memory. But I think it should work just as well :)
I didn't know that for a universe created by just a pdb (serving as both topology and coordinate files), I can still use .trajectory
attribute. I guess such a universe is considered as a trajectory with just one frame, am I correct?
A quick follow-up question about the BAT
class before we close this issue. I noticed something else that might worth attention:
Once I got the internal coordinates, I tried to convert them back to Cartesian coordinates following the example like below:
R = BAT(dna_atoms)
# Calculate BAT coordinates
R.run()
# Convert them back to Cartesian coordinates
XYZ = R.Cartesian(R.results.bat[0,:])
I found if I run the R.Cartesian()
consecutively by more than 1 time like this:
XYZ_1 = R.Cartesian(R.results.bat[0,:])
print(XYZ_1)
XYZ_2 = R.Cartesian(R.results.bat[0,:])
print(XYZ_2)
The returned XYZ_1
and XYZ_2
are different. Is there a reason for this?
I notice that in the API for Cartesian
, it says "One application of this function is to determine the new Cartesian coordinates after modifying a specific torsion angle." I was wondering if "one application" implied that it should be used only once.
Thanks again!
@taoliu032 you're right about single frame "trajectories" behaving like multiframe trajectories of length 1.
Your follow up looks like it might be that R.results.bat[0, :]
is getting modified inside R.Cartesian
. Can you check that R.results.bat[0, :]
is identical before and after calling R.Cartesian
?
@richardjgowers Thanks for the clarification!
It doesn't look like R.results.bat[0, :]
gets changed upon calling R.Cartesian
on it:
bat = R.results.bat[0,:]
print(R.results.bat[0,:])
# Reconstruct Cartesian coordinates from BAT coordinates
XYZ = R.Cartesian(R.results.bat[0,:])
# print('XYZ:', XYZ)
print(R.results.bat[0,:])
print(np.allclose(bat, R.results.bat[0,:], atol=1e-6))
[41.76300049 23.33600044 43.16699982 ... 18.54352474 2.91510914 20.59402519] [41.76300049 23.33600044 43.16699982 ... 21.25134529 2.91510914 23.50913433] True
I did some more tests on the issue that R.Cartesian(bat_frame)
does not return consistent result when run twice consecutively. I find that torsions
variable calculated in the source code is different in each run, probably because of the conversion step from any improper torsions to proper torsions, shown below:
torsions = bat_frame[2 * n_torsions + 9:]
# When appropriate, convert improper to proper torsions
shift = torsions[self._primary_torsion_indices]
shift[self._unique_primary_torsion_indices] = 0.
torsions += shift
# Wrap torsions to between -np.pi and np.pi
torsions = ((torsions + np.pi) % (2 * np.pi)) - np.pi
Is it possible that self._primary_torsion_indices
and self._unique_primary_torsion_indices
can be altered, especially if the BAT object R
contains results for a trajectory of N frames? @richardjgowers @anyone_else_who_has_thoughts :)
Ah, yes, this looks very likely a bug -- looks like line 506 is modifying the array in place. Even in your earlier comment @taoliu032, when the array gets printed, it looks different.
Ah you're right @lilyminium -- I must've been temporarily blind for not seeing the difference! haha
[41.76300049 23.33600044 43.16699982 ... 18.54352474 2.91510914 20.59402519] [41.76300049 23.33600044 43.16699982 ... 21.25134529 2.91510914 23.50913433] True
Note: I was checking the source code of release-2.0.0
@daveminh here's a potential issue with the BAT module and the question if BAT.Cartesian() should really modify the input array in place. This leads to confusing behavior and it would be safer to work on a copy. However, is there a reason why you chose to do in-place operations? (@lilyminium identified line 506 as the culprit).
Hello! This issue may seem a bit long with all the code and outputs, but it's actually just one (simple-looking) issue. Please continue to scroll down, thanks!
Expected behavior
I was testing the BAT class for converting Cartesian coordinates to internal coordinates. The latter are rotation and translation invariant. To test the invariance, I did a simple test: select an AtomGroup, run BAT to get its internal coordinates; then rotate the AtomGroup, run BAT again to get the new internal coordinates. These two sets of internal coordinates should be the same.
Actual behavior
After rotating the AtomGroup and running BAT, I found the atomic coordinates were changed back to those before rotation! I re-ran the code several times and the issue remained. So I suspect there might be a bug to the
BAT
class. Please check the code and outputs below.Code to reproduce the behavior
As expected, the atomic coordinates stay unchanged (as they should) after the first BAT operation. Then I rotate the molecule by an arbitrary angle, eg, 90 degrees:
As expected, the atomic coordinates are different after the rotation, from the original coordinates. Then I want to convert the new coordinates into internal coordinates:
For some reason, after doing the second BAT operation, the atomic coordinates are changed back to those before applying rotation??? I don't understand what was going on... The API page for BAT class does not mention that this function alters the input AtomGroup. So please help if you have an idea of what the culprit(s) might be! Thanks a lot!!
Current version of MDAnalysis