Lattice-Automation / seqfold

nucleic acid folding
MIT License

from typing import List (one line I had to add to python demo example in README.md) Also ct format doable? #7

Closed porteusconf closed 3 years ago

porteusconf commented 3 years ago

my own TODO

Great work! I may try to tweak it to get output in .vienna format, which has the mfe appended on the same line as the dot-brackets. I want to count bulges, and one way to do so involves converting .vienna dot-bracket to .ct (connect) format. Seems doable ;-)
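A minimal sketch of the dot-bracket to .ct conversion and bulge count mentioned above. Nothing here is part of seqfold: `pair_table`, `to_ct`, and `count_bulges` are hypothetical helper names, the input is assumed to be a plain `(`/`)`/`.` string with no pseudoknots, and a bulge is taken to mean a loop closed by two base pairs with unpaired bases on exactly one side.

```python
from typing import List


def pair_table(dot_bracket: str) -> List[int]:
    """Return partner[i]: the 0-based index paired with i, or -1 if unpaired."""
    partner = [-1] * len(dot_bracket)
    open_positions: List[int] = []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            j = open_positions.pop()
            partner[i], partner[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return partner


def to_ct(seq: str, dot_bracket: str, title: str = "demo") -> str:
    """Render connect-table text: index, base, prev, next, partner, index.

    Indices are 1-based and 0 means "none" (no neighbour or no partner).
    """
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure lengths differ")
    partner = pair_table(dot_bracket)
    n = len(seq)
    lines = [f"{n}\t{title}"]
    for i, base in enumerate(seq):
        nxt = i + 2 if i + 1 < n else 0
        lines.append(f"{i + 1}\t{base}\t{i}\t{nxt}\t{partner[i] + 1}\t{i + 1}")
    return "\n".join(lines)


def count_bulges(dot_bracket: str) -> int:
    """Count loops between two pairs with unpaired bases on one side only."""
    partner = pair_table(dot_bracket)
    bulges = 0
    for i, j in enumerate(partner):
        if j < i:  # unpaired (-1) or the closing half of a pair
            continue
        p = i + 1
        while p < j and partner[p] == -1:  # skip unpaired bases on the 5' side
            p += 1
        if p == j:  # no inner pair: this is a hairpin loop
            continue
        q = partner[p]
        if any(partner[k] != -1 for k in range(q + 1, j)):
            continue  # a second inner helix: multibranch loop, not a bulge
        left, right = p - i - 1, j - q - 1
        if (left == 0) != (right == 0):
            bulges += 1
    return bulges


if __name__ == "__main__":
    structure = "((.((...))))"  # the unpaired base at index 2 forms one bulge
    print(to_ct("GGAGGAAACCCC", structure))
    print("bulges:", count_bulges(structure))
```

Counting straight from the pair table means the .ct text is only needed if another tool expects it; the bulge count itself does not depend on the file format.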

Single-char typo in README.md

I think the `seqfold -v -l` output header should have ddg, not dg. That is: `i j ddg description`, not `i j dg description`. BTW, what does ddg stand for?

May also want to mention the version(s) of Python 3.x required. I think typing did not start showing up until 3.5 or 3.6, right? And it works differently in 3.9 per https://stackoverflow.com/questions/57505071/nameerror-name-list-is-not-defined/
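On the version question: the `typing` module arrived in Python 3.5, variable annotations such as `structs: List[Struct] = ...` need 3.6 or later, and from 3.9 the built-in `list[Struct]` can be used without any import. A small sketch of the portable spelling follows; the `Struct` class here is a hypothetical stand-in for `seqfold.Struct` (only its `e` attribute is taken from the README demo) so that the snippet runs without seqfold installed.

```python
from typing import List, NamedTuple


class Struct(NamedTuple):
    """Hypothetical stand-in for seqfold.Struct, for illustration only."""

    e: float
    desc: str


def total_energy(structs: List[Struct]) -> float:
    """Sum the per-structure energies, rounded to two decimals."""
    return round(sum(s.e for s in structs), 2)


# Without the `from typing import List` import above, this annotation raises
# NameError: name 'List' is not defined (on any Python 3 version).
structs: List[Struct] = [
    Struct(-2.1, "STACK:GG/CC"),
    Struct(3.1, "HAIRPIN:CA/GG"),
]

print(total_energy(structs))
```

On 3.9 and later the annotation could instead be written `structs: list[Struct] = ...`, which drops the import, at the cost of failing on older interpreters.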

Also, one line I needed in python demo

Note I had to add one extra line to the python demo in README.md: `from typing import List`. Adding that line got rid of the NameError, as shown below. I just pasted the python demo from README.md into a file named seqfold-demo.py.

$ head -4 seqfold-demo.py  # after adding `from typing import List` line
from seqfold import dg, dg_cache, fold, Struct
from typing import List
# just returns minimum free energy

$ python seqfold-demo.py

-15.200000000000001
   0   48   -2.1  STACK:GG/CC    
   1   47   -2.1  STACK:GG/CC    
   2   46   -1.3  STACK:GA/CT    
   3   45   -1.3  STACK:AG/TC    
   4   44   -2.1  STACK:GG/CC    
   5   43   -1.5  STACK:GT/CA    
   6   42   -1.3  STACK:TC/AG    
   7   41   -0.2  BIFURCATION:4n/3h
   9   22   -1.0  STACK:TT/AA    
  10   21   -0.9  STACK:TA/AT    
  11   20   -1.5  STACK:AC/TG    
  12   19    3.1  HAIRPIN:CA/GG  
  25   39   -2.1  STACK:CC/GG    
  26   38   -2.2  STACK:CG/GC    
  27   37   -2.1  STACK:GG/CC    
  28   36    3.4  HAIRPIN:GT/CT

Without that one extra line I get NameError...

$ head -3  seqfold-demo.py # without `from typing import List` line
from seqfold import dg, dg_cache, fold, Struct

# just returns minimum free energy

$ python seqfold-demo.py
Traceback (most recent call last):
  File "seqfold-demo.py", line 7, in <module>
    structs: List[Struct] = fold("GGGAGGTCGTTACATCTGGGTAACACCGGTACTGATCCGGTGACCTCCC")
NameError: name 'List' is not defined
$ python --version
Python 3.8.8

jjti commented 3 years ago

Thank you for reaching out and taking an interest in seqfold! I'm very interested in the Vienna output; please let me know how it goes. Will you CR? Seems like an interesting feature either way.
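For reference, a rough sketch of what a Vienna-style record could look like, assuming the RNAfold-like layout of a sequence line followed by the dot-bracket with the energy in parentheses on the same line. `format_vienna` and `parse_vienna` are hypothetical names, not seqfold functions, and the exact spacing inside the parentheses is an assumption.

```python
import re
from typing import Tuple

# Dot-bracket, whitespace, then the energy in parentheses (optionally padded).
_STRUCT_LINE = re.compile(r"^([().]+)\s+\(\s*(-?\d+(?:\.\d+)?)\)\s*$")


def format_vienna(seq: str, dot_bracket: str, mfe: float) -> str:
    """Build a two-line record: sequence, then structure and energy."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure lengths differ")
    return f"{seq}\n{dot_bracket} ({mfe:6.2f})"


def parse_vienna(record: str) -> Tuple[str, str, float]:
    """Split a two-line record back into (sequence, dot-bracket, energy)."""
    lines = record.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("expected a sequence line and a structure line")
    match = _STRUCT_LINE.match(lines[1])
    if match is None:
        raise ValueError(f"unrecognised structure line: {lines[1]!r}")
    return lines[0].strip(), match.group(1), float(match.group(2))


if __name__ == "__main__":
    record = format_vienna("GGGAAACCC", "(((...)))", -1.2)
    print(record)
    print(parse_vienna(record))
```

Keeping the energy on the structure line means a downstream script can recover both with one regular expression per record.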

Thank you for the README catches, updating ddg and the use of `typing.List` now. I just ran the below (copy-paste from README) and confirmed that it worked:

>>> # `fold` returns a list of `seqfold.Struct` from the minimum free energy structure
>>> structs = fold("GGGAGGTCGTTACATCTGGGTAACACCGGTACTGATCCGGTGACCTCCC")
>>> print(sum(s.e for s in structs))  # -12.94, same as dg()
-13.4
>>> for struct in structs:
...     print(struct) # prints the i, j, ddg, and description of each structure
... 
   0   48   -1.8  STACK:GG/CC    
   1   47   -1.8  STACK:GG/CC    
   2   46   -1.3  STACK:GA/CT    
   3   45   -1.3  STACK:AG/TC    
   4   44   -1.8  STACK:GG/CC    
   5   43   -1.5  STACK:GT/CA    
   6   42   -1.3  STACK:TC/AG    
   7   41   -0.2  BIFURCATION:4n/3h
   9   22   -1.0  STACK:TT/AA    
  10   21   -0.6  STACK:TA/AT    
  11   20   -1.5  STACK:AC/TG    
  12   19    3.1  HAIRPIN:CA/GG  
  25   39   -1.8  STACK:CC/GG    
  26   38   -2.2  STACK:CG/GC    
  27   37   -1.8  STACK:GG/CC    
  28   36    3.4  HAIRPIN:GT/CT  
>>> # `dg_cache` returns a 2D array where each (i,j) combination returns the MFE from i to j inclusive
>>> cache = dg_cache("GGGAGGTCGTTACATCTGGGTAACACCGGTACTGATCCGGTGACCTCCC")

I bumped the Python requirement to 3.5

DDG

DDG, often referred to as ΔΔG, is the change in the change in Gibbs free energy (double changes intended). DDG is a measure of the change in energy between the folded and unfolded states (ΔG_folding) and the change in ΔG_folding when a point mutation is present.

https://cyrusbio.com/wp-content/uploads/Rosetta-Cartesian-DDG-2019.pdf
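Written out, the definition quoted above amounts to the following. The sign convention (mutant minus wild type) is the common one and is an assumption here, since the quoted text does not state it.

```latex
\Delta G_{\text{folding}} = G_{\text{folded}} - G_{\text{unfolded}},
\qquad
\Delta\Delta G = \Delta G_{\text{folding}}^{\text{mutant}} - \Delta G_{\text{folding}}^{\text{wild type}}
```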

jjti commented 3 years ago

README should be fixed now, thanks again