mmagnus / rna-tools

🔧rna-tools: a toolbox to analyze sequences, structures and simulations of RNA (and more) used by RNA CASP, RNA PUZZLES, and me ;-) docs @ http://rna-tools.rtfd.io web @ http://rna-tools.online
GNU General Public License v3.0

--rpr function causes significant change in the coordinates #143

Closed akashbahai closed 1 year ago

akashbahai commented 1 year ago

Hi, I noticed that if a PDB file has a few missing atoms and one then uses the --rpr function to get the file ready for RNA-Puzzles, the resulting PDB file differs significantly from the original.

Example: I have a PDB file, old.pdb, and I used rna_pdb_tools to create a new file, old_rpr.pdb. If one calculates the RMSD between these two files (rna_calc_rmsd.py --model-selection A:2-50 --target-selection A:2-50 -t old.pdb old_rpr.pdb), it is 3.38. Is this by design?

I came across this problem when I was calculating the RMSD of the native structure against both raw model files and rpr-processed model files. In most cases the RMSD is the same, but when atoms are missing in the model file, the rpr-processed file differs from the raw one, so the RMSD differs too (and the difference can be significant).

old_rpr.txt old.txt

mmagnus commented 1 year ago

Hey, I would say the RMSD can differ by design. rna_calc_rmsd.py reads atoms in the order given, and rpr standardizes the atom order. So if the atom order in the original file is different, the RMSD will be non-zero, because different atoms will be paired in the calculation.
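To illustrate the point about atom order, here is a minimal sketch in plain Python. It is not the rna_calc_rmsd.py implementation: the function names and the toy coordinates are made up for illustration, and no superposition is performed. It only shows that identical coordinates listed in a different order give a non-zero RMSD when atoms are paired by position.

```python
import math

def rmsd_by_position(atoms_a, atoms_b):
    """RMSD pairing the i-th atom of A with the i-th atom of B (no superposition)."""
    if len(atoms_a) != len(atoms_b):
        raise ValueError("atom counts differ")
    total = 0.0
    for (_, xyz_a), (_, xyz_b) in zip(atoms_a, atoms_b):
        total += sum((p - q) ** 2 for p, q in zip(xyz_a, xyz_b))
    return math.sqrt(total / len(atoms_a))

def rmsd_by_name(atoms_a, atoms_b):
    """RMSD pairing atoms by name, so the listing order does not matter."""
    coords_b = dict(atoms_b)
    pairs = [(xyz, coords_b[name]) for name, xyz in atoms_a if name in coords_b]
    total = sum((p - q) ** 2 for xa, xb in pairs for p, q in zip(xa, xb))
    return math.sqrt(total / len(pairs))

# Hypothetical toy data: the same three atoms, listed in two different orders.
original = [("P", (0.0, 0.0, 0.0)), ("C4'", (3.0, 0.0, 0.0)), ("N1", (0.0, 4.0, 0.0))]
reordered = [("C4'", (3.0, 0.0, 0.0)), ("P", (0.0, 0.0, 0.0)), ("N1", (0.0, 4.0, 0.0))]

print(f"{rmsd_by_position(original, reordered):.3f}")  # 2.449, atoms are mispaired
print(f"{rmsd_by_name(original, reordered):.3f}")      # 0.000, same atoms compared
```

The coordinates never changed here; only the pairing did, which is the kind of effect described above.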

The best way is to always rpr both files to make sure that they have exactly the same atom order. (Maybe I should do this automatically inside the rna_calc_rmsd.py tool and add an option to skip the automatic rpr of models; I've been thinking about it.)
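The idea of bringing both files into the same atom order before comparing them can be sketched roughly as follows. This is an assumption-laden illustration, not what --rpr does internally: the helpers parse_atoms, matched_coords and pdb_line are hypothetical, the parser reads only the fixed-column ATOM fields of the PDB format, and atoms missing from either file are simply dropped.

```python
def parse_atoms(pdb_text):
    """Map (chain, residue number, atom name) -> (x, y, z) for ATOM records."""
    atoms = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]), line[12:16].strip())
            atoms[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return atoms

def matched_coords(pdb_a, pdb_b):
    """Return two coordinate lists in one shared, sorted atom order.

    Only atoms present in both inputs are kept, so a missing atom in one
    file cannot shift the pairing of the atoms that follow it.
    """
    a, b = parse_atoms(pdb_a), parse_atoms(pdb_b)
    shared = sorted(a.keys() & b.keys())
    return [a[k] for k in shared], [b[k] for k in shared]

def pdb_line(serial, name, chain, resnum, xyz):
    """Hypothetical helper: build a fixed-column ATOM record for the demo."""
    return "ATOM  %5d %-4s %3s %s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, "G", chain, resnum, *xyz)

# Toy input: file B lists atoms in another order and lacks C4'.
file_a = "\n".join([
    pdb_line(1, "P", "A", 1, (0.0, 0.0, 0.0)),
    pdb_line(2, "C4'", "A", 1, (3.0, 0.0, 0.0)),
    pdb_line(3, "N1", "A", 1, (0.0, 4.0, 0.0)),
])
file_b = "\n".join([
    pdb_line(1, "N1", "A", 1, (0.0, 4.0, 0.0)),
    pdb_line(2, "P", "A", 1, (0.0, 0.0, 0.0)),
])

coords_a, coords_b = matched_coords(file_a, file_b)
print(coords_a == coords_b)  # the shared atoms now line up one-to-one
```

Once both lists are in the same order, any position-based RMSD routine compares like with like.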

akashbahai commented 1 year ago

Thanks for the clarification. This explains the discrepancies.