Lattice-Automation / seqfold

nucleic acid folding
MIT License

inf values returned from dg() on RNA sequences #18

Closed ijhoskins closed 2 months ago

ijhoskins commented 5 months ago

Hello,

I'm not sure if this is necessarily a bug, but I was hoping to gain some clarity on why inf values would be returned for some RNA sequences. Note this is not the case if the sequence is converted to DNA.

>>> from seqfold import dg

# RNA sequences
>>> dg("CAGCGCGGCGGGCGGGAGUCCGGCGCGCCCUCCAUCCCCGGCGGCGUCGGCAAGGAGUAG", temp=37)
inf
>>> dg("ACAAGCAGGAGCCUAUAAAUCCCUGAGGGGGCUGCUGGGACAUCACAGAAGGUGAAAGUC", temp=37)
inf

# DNA sequences
>>> dg("CAGCGCGGCGGGCGGGAGTCCGGCGCGCCCTCCATCCCCGGCGGCGTCGGCAAGGAGTAG", temp=37)
-12.6
>>> dg("ACAAGCAGGAGCCTATAAATCCCTGAGGGGGCTGCTGGGACATCACAGAAGGTGAAAGTC", temp=37)
-5.3

ijhoskins commented 5 months ago

Forgot to mention: I am running seqfold version 0.7.17.

jjti commented 2 months ago

Hey @ijhoskins, thanks for reporting this! This was caused by a bug that shows up when the outermost structure is a multibranch. I pushed a fix and made a new release (0.7.18). An example of the new results:

$ seqfold ACAAGCAGGAGCCUAUAAAUCCCUGAGGGGGCUGCUGGGACAUCACAGAAGGUGAAAGUC --dot-bracket --sub-structures
ACAAGCAGGAGCCUAUAAAUCCCUGAGGGGGCUGCUGGGACAUCACAGAAGGUGAAAGUC
.....(((.((((.......(((...))))))).))...((.((((.....))))..)))
   i    j    ddg  description    
   5   59    1.8  BIFURCATION:3n/2h
   6   35   -2.1  STACK:AG/UC    
   7   34   -2.4  STACK:GGA/CGU  
   9   32   -2.1  STACK:AG/UC    
  10   31   -3.4  STACK:GC/CG    
  11   30   -3.3  STACK:CC/GG    
  12   29    4.6  BULGE:8        
  20   28   -3.3  STACK:CC/GG    
  21   27   -3.3  STACK:CC/GG    
  22   26    5.4  HAIRPIN:CU/GA  
  39   58   -2.2  STACK:AC/UG    
  40   57   -1.2  INTERIOR_LOOP:2/3
  42   54   -2.4  STACK:UC/AG    
  43   53   -2.1  STACK:CA/GU    
  44   52   -2.2  STACK:AC/UG    
  45   51    4.3  HAIRPIN:CA/GG  
-13.9
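
As a quick sanity check on the output above, the final line (-13.9) matches the sum of the ddg column. The snippet below just copies the sixteen ddg values from the table by hand; it does not call seqfold.

```python
# ddg values copied from the sub-structure table above, in row order.
ddgs = [
    1.8,   # BIFURCATION:3n/2h
    -2.1, -2.4, -2.1, -3.4, -3.3,  # stacks in the first stem
    4.6,   # BULGE:8
    -3.3, -3.3,
    5.4,   # HAIRPIN:CU/GA
    -2.2,
    -1.2,  # INTERIOR_LOOP:2/3
    -2.4, -2.1, -2.2,
    4.3,   # HAIRPIN:CA/GG
]

# Round to one decimal to absorb floating-point noise in the sum.
total = round(sum(ddgs), 1)
print(total)  # -13.9
```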

I'm closing as fixed, but please let me know if you see any more issues. Thanks for reporting this!
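
For anyone still on an older release, here is a minimal sketch of how one might flag affected sequences. The helper names `has_multibranch_fix` and `find_non_finite` are hypothetical and not part of seqfold; the only assumptions are that the fold function accepts `(seq, temp=...)` as `dg` does in the report above, and that the version string is plain dotted integers like `0.7.17`.

```python
import math
from typing import Callable, Iterable, List

# Release named in this thread as containing the fix.
FIXED_IN = (0, 7, 18)


def has_multibranch_fix(version: str) -> bool:
    """Return True if a dotted version such as '0.7.17' is at or past 0.7.18.

    Assumes the version contains only integers separated by dots.
    """
    return tuple(int(part) for part in version.split(".")) >= FIXED_IN


def find_non_finite(
    seqs: Iterable[str],
    fold_fn: Callable[..., float],
    temp: float = 37.0,
) -> List[str]:
    """Return the sequences for which fold_fn gives inf, -inf, or nan."""
    return [seq for seq in seqs if not math.isfinite(fold_fn(seq, temp=temp))]
```

With seqfold installed, `dg` can be passed as `fold_fn` (for example `find_non_finite(my_seqs, dg)`), and the installed version string can be read with `importlib.metadata.version("seqfold")`.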