nglviewer / ngl

WebGL protein viewer
http://nglviewer.org/ngl/
MIT License
657 stars 168 forks source link

ngl/src/store /residue-type.ts isRna() issue #1016

Open Fravadona opened 5 months ago

Fravadona commented 5 months ago

Hi,

I've got a little issue with ngl.js.

Coarse-grainied DNA chains (with only P and/or C3' atoms) work fine, but coarse-grained RNA chains (with only P and/or C4' atoms) are not displayed.

With some efforts I found out that the isRna() function of ngl/src/store /residue-type.ts has a little problem:

In the expression

this.hasAtomWithName([ 'P', "O3'", 'O3*' ], [ "C4'", 'C4*' ], [ "O2'", 'O2*', "F2'", 'F2*' ]) ||
(RnaBases.includes(this.resname) && this.hasAtomWithName([ "O2'", 'O2*', "F2'", 'F2*' ]))

the && this.hasAtomWithName([ "O2'", 'O2*', "F2'", 'F2*' ]) shouldn't be there, IMHO. Why check for the presence of specific atoms when you already failed at it and felled-back to inquire the residue name?

Removing that expression makes NGL display coarse-grained RNA chains:

  isRna () {
    if (this.chemCompType) {
      return ChemCompRna.includes(this.chemCompType)
    } else if (this.hetero === 1) {
      return false
    } else {
      return (
        this.hasAtomWithName(
          [ 'P', "O3'", 'O3*' ], [ "C4'", 'C4*' ], [ "O2'", 'O2*', "F2'", 'F2*' ]
        ) ||
        RnaBases.includes(this.resname)
      )
    }
  }

Cheers Rafael.

Fravadona commented 5 months ago

There's also a bonding problem for coarse-grained DNA/RNA chains.

For two backbone atoms to make a bond, the max distance between them is set to √48 A in the code; that isn't enough for the RNA chains of ribosomes.

I set it to √96 A in ngl.js (which works with my test-cases), but I don't think that's the right thing to do. Checking that the ChainIDs are the same for both residues and that the ResidueNo are equals by ±1 would be safer.

ppillot commented 5 months ago

In the expression

this.hasAtomWithName([ 'P', "O3'", 'O3*' ], [ "C4'", 'C4*' ], [ "O2'", 'O2*', "F2'", 'F2*' ]) ||
(RnaBases.includes(this.resname) && this.hasAtomWithName([ "O2'", 'O2*', "F2'", 'F2*' ]))

the && this.hasAtomWithName([ "O2'", 'O2*', "F2'", 'F2*' ]) shouldn't be there, IMHO. Why check for the presence of specific atoms when you already failed at it and felled-back to inquire the residue name?

Note that the code is not doing the same thing in the two passes. The first test checks for hasAtomWithName([ 'P', "O3'", 'O3*' ], [ "C4'", 'C4*' ], [ "O2'", 'O2*', "F2'", 'F2*' ]) which is like the residue contains 1 atom which can be 'P' OR "O3'" OR 'O3*' AND 1 atom which can be "C4'" OR 'C4*' AND 1 atom which can be "O2'" OR 'O2*' OR "F2'" OR 'F2*' The second predicate is a bit less restrictive as it only checks for the atoms "O2'" OR 'O2' OR "F2'" OR 'F2' So, the code is not doing twice the same test.

That being said, I don't know what was the exact rationale behind those tests (it was coded more than 6 years ago by Alex Rose who is now dedicating his efforts towards Mol). I remember that in some legacy PDB files the resnames A, T, C, G were used in DNA instead of proper DA, DT, DC, DG, which makes the test on the rna residue name insufficient. Also, there are typically multiple non-standard/modified bases in RNA. In the first part of your message you mention "only P and/or C4' atoms", so it seems that we could add another case here specifically for coarse grain models. Note that the lsit of atoms considered for RNA coarse grained models is defined as `[ "C4'", 'C4', 'P' ]`, so this seems like the direction to go.

Fravadona commented 5 months ago

Thank you for the reply.

So that condition is meant to prevent DNA chains in some older PDBs to be recognized as RNA chains?

A possible solution could be to handle those exceptions in the isDna() function:

... ||
DnaBases.includes(this.resname) ||
RnaBases.includes(this.resname) && ! this.hasAtomWithName([ "O2'", 'O2*', "F2'", 'F2*' ])

But in that case, coarse-grainded RNA chains would be evaluated as true by both isRna() and isDna().

ppillot commented 5 months ago

@Fravadona I am not sure why it was written this way. Do you have an example of a public coarse grained RNA structure we could use for reference?

Fravadona commented 5 months ago

@ppillot I'm not sure what you call a "public coarse grained RNA structure" but here's an example that works with my "patched" version of ngl.js:

http://www.dynstr.pasteur.fr/servers/minactionpath/minactionpath2_submission/examples/8G34_to_8G31/

If you load the transition.cif.gz or play the trajectory.cif.gz in http://nglviewer.org/ngl/ then the RNA chains will be missing