Closed deprekate closed 2 years ago
Hey there and thanks for pointing this out!
I poked around and I think that it's from a diff in the underlying energy functions and penalties applied to multibranching/bifurcations:
$ seqfold GGGAGGTCGTTACATCTGGGTAACACCGGTACTGATCCGGTGACCTCCC --celcius 32 --dot-bracket --sub-structures
GGGAGGTCGTTACATCTGGGTAACACCGGTACTGATCCGGTGACCTCCC
((((((((.((((......))))..((((.......)))).))))))))
i j ddg description
0 48 -1.9 STACK:GG/CC
1 47 -1.9 STACK:GG/CC
2 46 -1.4 STACK:GA/CT
3 45 -1.4 STACK:AG/TC
4 44 -1.9 STACK:GG/CC
5 43 -1.6 STACK:GT/CA
6 42 -1.4 STACK:TC/AG
7 41 -0.5 BIFURCATION:4n/3h
9 22 -1.1 STACK:TT/AA
10 21 -0.7 STACK:TA/AT
At the BIFURCATION
line it's saying that in its minimum free energy structure, it's seeing 4 free nucleotides as being the best choice. Working backwards, that came from this paper: https://pubmed.ncbi.nlm.nih.gov/15139820/
DNA_MULTIBRANCH: MultiBranch = (2.6, 0.2, 0.2, 2.0)
"""a, b, c, d in a linear multi-branch energy change function.
Inferred from:
Supplemental Material: Annu.Rev.Biophs.Biomol.Struct.33:415-40
doi: 10.1146/annurev.biophys.32.110601.141800
The Thermodynamics of DNA Structural Motifs
SantaLucia and Hicks, 2004
"""
In other words the energy functions and algo I adapted from those papers applies a greater penalty to bulge inbetween the branches than RNAFold:
I'm not saying that seqfold
is right or that seqfold
is wrong, it's just got a slightly diff approach, esp with regard to the energy penalties. I purposefully did not look at the RNAseq energies (or source code) to avoid running afoul of its license. This is all adapted from the public stuff. There's a chance seqfold
is right and RNASeq is wrong... but in this particular case they're more similar that usual, I'd say
I hope this helps for context. I'm thinking it might help folks if I write up a pinned issue summarizing the list of differences that I'm aware, of which this is one
Also I appreciate this:
Hi, I really like the package. A good alternative to Vienna (which isn't on pypi).
I hope the package remains useful
Hi, I really like the package. A good alternative to Vienna (which isn't on pypi).
But when verifing your example sequence output against Vienna's RNAfold, it looks like seqfold misses two pair bonds? the lowercased g-c and a-u