Rfam / rfam-production

Rfam production pipeline
Apache License 2.0

Update R-scape in codon #169

emmaco closed 9 months ago

emmaco commented 9 months ago

We now have R-scape 2.0, which includes the --Rfam option. We may need to update the version installed on codon to match.

Here is the link to download version 2.0

https://www.dropbox.com/s/iihk6c1hqanco2d/Rfam_scan_GRCh38.tar.gz?dl=0

These are some notes related to this R-scape version

============================================================

Hello Nancy and Blake,

I just added R-scape v2.0.0.0d to dropbox and it should be available for you to download.

It includes a new option, --Rfam, such that when you run it together with --cacofold, the CaCoFold structure is trimmed so that it is compatible with a secondary structure that Rfam can use to build a CM. Some details are on page 30 of the documentation.


--Rfam

This option is meant to be used by the Rfam curators when using CaCoFold to propose improved consensus structures for an Rfam family. It removes covariations that cannot be taken into account by the Rfam models, so the result may be missing important covariation information that is not compatible with an RNA secondary structure. Using options --cacofold --Rfam, the CaCoFold structure is trimmed such that:

• Base pairs have to have at least 3 nucleotides of separation (covarying pairs removed if they don’t).

• Overlaps between helices are trimmed down if possible without removing any covarying pair.

• Pseudoknots (pk) are kept, but alternative motifs identified as triplets (tr), cross (xcov), or side (scov) covariations are removed.

• Base pairs that appear to be non-WC (defined as the observed frequency in the alignment of the pair being A:U, U:A, C:G, G:C, G:U or U:G falling below 0.3) are removed, even if they covary.
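
To make the first and last rules above concrete, here is a rough Python sketch of how the two checks could be applied to a single candidate base pair. This is not R-scape's code: the function names, the 0-based column indices, counting every sequence (gapped ones included) in the denominator, and reading "3 nucleotides of separation" as three alignment columns between the paired positions are all assumptions made for illustration. Only the 0.3 threshold, the minimum of 3, and the list of canonical pairs come from the notes above.

```python
# Illustrative sketch only; not taken from R-scape.
CANONICAL = {"AU", "UA", "CG", "GC", "GU", "UG"}


def canonical_fraction(alignment, i, j):
    """Fraction of sequences whose residues at columns i and j form one of
    the six pairs listed in the notes (A:U, U:A, C:G, G:C, G:U, U:G).

    Assumption: every sequence counts in the denominator, including those
    with a gap at i or j.
    """
    if not alignment:
        return 0.0
    hits = 0
    for seq in alignment:
        # Normalise case and treat DNA-style T as U.
        pair = (seq[i] + seq[j]).upper().replace("T", "U")
        if pair in CANONICAL:
            hits += 1
    return hits / len(alignment)


def keep_pair(alignment, i, j, min_separation=3, min_canonical=0.3):
    """Return True if the pair (i, j) passes both checks.

    Assumption: separation is the number of columns strictly between i and j.
    """
    separation = abs(j - i) - 1
    if separation < min_separation:
        return False
    return canonical_fraction(alignment, i, j) >= min_canonical


if __name__ == "__main__":
    toy_alignment = ["GAAAC", "GAAAC", "AAAAU", "CAAAA"]
    print(keep_pair(toy_alignment, 0, 4))  # columns 0 and 4
    print(keep_pair(toy_alignment, 0, 3))  # columns 0 and 3
```

In the toy alignment, columns 0 and 4 are separated by three columns and pair canonically in three of the four sequences, so the pair is kept. Columns 0 and 3 are only two columns apart, so the pair is dropped regardless of covariation.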