RMSD on identical structures produces non-zero value

GoogleCodeExporter commented 9 years ago

Hi Everyone,

As discussed here 
(https://groups.google.com/forum/#!topic/mdnalysis-discussion/oS0aRuokz20) when 
calculating the RMSD value from two identical structures, I get a non-zero 
value, typically on the order of 10^-6 angstroms. A value of exactly zero is 
expected.

This can be reproduced using the rmsd(), rotation_matrix() and rms_fit_trj() 
functions from MDAnalysis 0.7.6.

Please see the attached script (kindly written by Sebastian Busch) that 
suggests one of the molecules is being scaled very slightly (leaving it no 
longer identical to the reference), possibly as a result of the pyqcprot 
function.

Comparison of RMSD values from MDAnalysis with those from Gromacs g_rms over an 
example, real-world .xtc trajectory shows virtually identical values apart from 
the first frame (which is an like-on-like fit). Subsequent frames only differ 
in the 6th decimal place (+/- 1), which is likely the result of rounding errors.

E.g. first 20 frames:

frame no    g_rms rmsd  mda rot_mat rmsd    g_rms minus mda
1       0       5.19024e-06     -5.19024E-006
2       1.139526    1.139526        0
3       1.565925    1.565925        0
4       1.578595    1.578595        0
5       1.700151    1.700151        0
6       1.865499    1.865499        0
7       1.820653    1.820653        0
8       1.885945    1.885945        0
9       1.96395     1.96395         0
10      2.043311    2.043311        0
11      2.084968    2.084968        0
12      2.070839    2.070839        0
13      1.945532    1.945531        0.000001
14      2.133647    2.133647        0
15      2.209994    2.209994        0
16      2.259724    2.259724        0
17      2.313043    2.313044        -0.000001
18      2.332786    2.332786        0
19      2.349936    2.349936        0

Thanks,
Gareth Morgan

Original issue reported on code.google.com by MorganGa...@gmail.com on 14 Nov 2012 at 12:53

Attachments:

rmsd_zero_testing.py

GoogleCodeExporter commented 9 years ago

I have been testing g_rms a little more and noticed that it too sometimes 
outputs similar non-zero values (10^-7nm = 10^-6 angstroms) when calculating 
the RMSD of identical structures.

I was calculating the RMSD of a TM helix over a trajectory relative to its 
starting position. Gromacs doesn't switch output from float to standard form 
meaning I can't tell what is happening beyond 7 significant figures (not 
normally an issue but one in this case).

Best,
Gareth

Original comment by MorganGa...@gmail.com on 14 Nov 2012 at 3:34

Added labels: ****
Removed labels: ****

GoogleCodeExporter commented 9 years ago

Hi,

I made Josh owner of this ticket.

Just out of curiosity: I understand that the RMSD of a structure to itself 
should be exactly 0 and this is what we would like to obtain ideally. However, 
1e-5 Å is a fairly small number and in all applications I can think off I'd be 
willing to count it as 0 (assuming that this is not a sign that the algorithm 
itself is fundamentally flawed, which does not seem to be the case according to 
the tests). I.e. if I calculated a RMSD of 1e-5 Å between two structures I'd 
call them identical for all practical purposes. Why are you concerned with such 
small differences?

Oliver

Original comment by orbeckst on 24 Nov 2012 at 10:04

Changed state: Accepted
Added labels: Priority-Low
Removed labels: Priority-Medium

GoogleCodeExporter commented 9 years ago

Hi Oliver,

My thinking at the time was 'if it can't produce the correct output in the one 
known case (two identical structures), can I trust it with an unknown case?'

Subsequent investigation involving g_rms and reading the Theobald papers put my 
mind a little more at ease - The QCP algorithm has a relative (to other 
methods) precision of ~1e-6, which is in line with the discrepancies I've been 
seeing.

Having investigated this I'm now happy enough to carry on using the MDAnalysis 
RMS functions. Thank you both for your help on this.

Best,
Gareth

Original comment by MorganGa...@gmail.com on 26 Nov 2012 at 2:59

Added labels: ****
Removed labels: ****

GoogleCodeExporter commented 9 years ago

Hi Gareth,

> 'if it can't produce the correct output in the one known case (two identical 
structures), can I trust it with an unknown case?'

Makes sense.

> Subsequent investigation involving g_rms and reading the Theobald papers put 
my mind a little more at ease - The QCP algorithm has a relative (to other 
methods) precision of ~1e-6, which is in line with the discrepancies I've been 
seeing.

Ok, then I'll close the ticket as 'WontFix'.

Thanks for reporting back.

Oliver

Original comment by orbeckst on 26 Nov 2012 at 4:46

Changed state: WontFix
Added labels: ****
Removed labels: ****

dimchris / mdanalysis

RMSD on identical structures produces non-zero value #119