charnley / rmsd

Calculate Root-mean-square deviation (RMSD) of two molecules, using rotation, in xyz or pdb format
BSD 2-Clause "Simplified" License
495 stars 114 forks source link

RMSD result using --reorder much higher than expected #100

Closed cassguff closed 12 months ago

cassguff commented 12 months ago

I have two .xyz files of the trans,trans,trans conformer of n-hexane, but the atoms are in a different order. I would expect the RMSD to be far less than 1 (probably fairly close to 0), but calculate_rmsd --reorder --reorder-method hungarian file1.xyz file2.xyz gives me around 1.7. I've also tried calculate_rmsd --reorder --reorder-method distance file1.xyz file2.xyz and it gave me an even worse answer of about 3.3. Any ideas what may be going on?

For reference, this is file1:

20

C 3.207613 0.208051 -0.000003
H 3.294035 0.848309 -0.881354
H 3.294038 0.848312 0.881345
H 4.059423 -0.474618 -0.000004
C 0.665516 0.373069 0.000002
H 0.711781 1.031147 0.875246
H 0.711781 1.031147 -0.875241
C -0.665516 -0.373069 0.000001
H -0.711782 -1.031148 0.875244
H -0.711781 -1.031147 -0.875242
C -1.882521 0.548721 0.000001
H -1.835632 1.205545 -0.874835
H -1.835634 1.205541 0.874841
C -3.207613 -0.208051 -0.000003
H -3.294038 -0.848314 0.881345
H -4.059423 0.474618 -0.000001
H -3.294036 -0.848308 -0.881354
C 1.882521 -0.548721 0.000002
H 1.835631 -1.205545 -0.874835
H 1.835635 -1.205541 0.874842

And this is file2:

20

C 1.410084 2.888540 0.000000
C 1.410084 1.362562 0.000000
C 0.006090 0.762927 0.000000
C -0.006090 -0.762927 0.000000
C -1.410084 -1.362562 0.000000
C -1.410084 -2.888540 0.000000
H -0.896957 -3.281117 0.881349
H -2.425316 -3.289695 0.000000
H -0.896957 -3.281117 -0.881349
H -1.957163 -0.996068 -0.874837
H -1.957163 -0.996068 0.874837
H 0.542412 -1.129477 -0.875242
H 0.542412 -1.129477 0.875242
H -0.542412 1.129477 -0.875242
H -0.542412 1.129477 0.875242
H 1.957163 0.996068 -0.874837
H 1.957163 0.996068 0.874837
H 0.896957 3.281117 0.881349
H 2.425316 3.289695 0.000000
H 0.896957 3.281117 -0.881349
nbehrnd commented 12 months ago

@cassguff Neither one of the two approaches engaged reached the anticipated global minimum, i.e. an even lower RMSD.

The implementation allows to retain the coordinates of the aligned second model in the coordinate system of the first one, e.g.

python calculate_rmsd.py --reorder model1.xyz model2.xyz -p > model2_approach01.xyz

The then simultaneous display of model1.xyz and new e.g. model2_approach01.xyz yields

approach01

for your first approach, and

approach02

for your second one. The other reorder methods do not reveal a substantial improvement yet (brute either takes really much longer, or faces a problem).

Input pattern in Jmol's console was line of:

load files "model01.xyz" "model2_approach01.xyz";
model 0;
select 1.1;
color palegreen;
write "approach01.png";

2023-10-05T10.53.11_report.zip

nbehrnd commented 12 months ago

@cassguff Without an anchor on a specific bond (or set of reference atoms), other programs equally reach an optimum which does not really appears to be the global one. The request to align the two model data by Jmol's rotate and translate for example only yields an RMSD of 3.37:

ezgif com-optimize

The replication with the current version of Jmol (16.1.41, 2023-09-15) yields the same RMSD and virtually same alignment.

charnley commented 12 months ago

@cassguff , unfortunately for good reordering of atoms the molecule requires to be pre-aligned, which is a chicken-and-egg-problem. You need molecule structured aligned to assign atoms, but you need atoms assigned for good rotation.

For your example structure, if you do it without Hydrogens you get a good alignment. Following command returns zero (10^-5).

calculate_rmsd file1.xyz file2.xyz --reorder --no-hydrogens

Alternatively you will need to use something like the --reorder-method qml method from http://www.qmlcode.org/installation.html, as those atom descriptors are independent of atom position.

benjfitz commented 12 months ago

I realize this is closed, one thought came to mind after pondering the coarse-alignment issue noted above. Would it help to use the rotational constants to perform a coarse alignment (I realize this won't help much for molecules with nearly degenerate constants) with the ability to rotate by 180 degrees?

charnley commented 12 months ago

@benjfitz , There is the reorder-method "inertia-hungarian" which first aligns the structures based on inertia moment, and then does the reordering. But in this case, it doesn't change anything. But it is food-for-thought on what to do with this chicken-and-egg problem.