OstfriesenBI / PredmiRNA

A set of scripts and tools to train a classifier for pre-miRNA Recognition

Feature calculation: RNAfold output #14

Closed by Finesim97 5 years ago

Finesim97 commented 5 years ago

Program call, R/Python script

The RNAfold program from the ViennaRNA package predicts the minimum free energy (MFE) secondary structure of a given RNA sequence (see RNAfold Help). It can be installed using conda; we already have an env file for it. When called with an input file (-i), it prints the sequence comment, the sequence, the structure in dot-bracket notation and the minimum free energy:

RNAfold -i rna.fst

>Test123 457Test healp
GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCCCA
(((((((..(((.((((.(....(((((.(((((....)))).)..).))))....).)))).)))))))))). (-29.90)
>fefefe wefwe ffff
GCUGAUUCGAAUUCAGCAGCCCAAAAAAAAAAAAAAAAAAAAAAA
(((((.......)))))............................ ( -6.90)

RNAfold also writes a PostScript drawing of each structure to the working directory. The filename is the comment up to the first whitespace, plus .ps. (Disable with --noPS.)
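For calling RNAfold from a script, a minimal sketch could look like the following. It uses only the two options mentioned in this issue (-i and --noPS). The function names `rnafold_command` and `run_rnafold` are made up for illustration, and the sketch assumes the RNAfold executable is on the PATH (for example inside the conda env).

```python
import subprocess


def rnafold_command(fasta_path, no_ps=True, executable="RNAfold"):
    """Build the argument list for an RNAfold call (hypothetical helper)."""
    cmd = [executable, "-i", str(fasta_path)]
    if no_ps:
        # Suppress the PostScript drawings described above.
        cmd.append("--noPS")
    return cmd


def run_rnafold(fasta_path, no_ps=True, executable="RNAfold"):
    """Run RNAfold and return its standard output as one string.

    Assumes RNAfold is installed and reachable; raises
    subprocess.CalledProcessError if the program exits with an error.
    """
    result = subprocess.run(
        rnafold_command(fasta_path, no_ps=no_ps, executable=executable),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.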

This output has to be parsed and written to a CSV file:

"comment","secstructure","mfe"
"Test123 457Test healp","(((((((..(((.((((.(....(((((.(((((....)))).)..).))))....).)))).)))))))))).",-29.90
"fefefe wefwe ffff","(((((.......)))))............................",-6.90
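
The parsing step could be sketched roughly as below. This is not necessarily what the referenced commit does. It assumes the three-line record layout shown above (comment, sequence, structure with energy), and the names `parse_rnafold` and `write_csv` are hypothetical. The energy is converted to a float, so trailing zeros are dropped on output (-29.90 is written as -29.9).

```python
import csv
import io
import re

# Structure line: dot-bracket string, whitespace, then the energy in
# parentheses. RNAfold pads short energies, e.g. "( -6.90)".
STRUCTURE_LINE = re.compile(
    r"^(?P<structure>[().]+)\s+\(\s*(?P<mfe>[-+]?\d+(?:\.\d+)?)\s*\)\s*$"
)


def parse_rnafold(text):
    """Yield (comment, structure, mfe) tuples from RNAfold output text."""
    comment = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep the full comment without the leading ">".
            comment = line[1:].strip()
            continue
        match = STRUCTURE_LINE.match(line)
        if match and comment is not None:
            yield comment, match.group("structure"), float(match.group("mfe"))
            comment = None
        # Any other line is the sequence itself and is skipped.


def write_csv(records, handle):
    """Write records as CSV with quoted strings and unquoted numbers."""
    writer = csv.writer(handle, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(["comment", "secstructure", "mfe"])
    for record in records:
        writer.writerow(record)


if __name__ == "__main__":
    example = (
        ">fefefe wefwe ffff\n"
        "GCUGAUUCGAAUUCAGCAGCCCAAAAAAAAAAAAAAAAAAAAAAA\n"
        "(((((.......)))))............................ ( -6.90)\n"
    )
    out = io.StringIO()
    write_csv(parse_rnafold(example), out)
    print(out.getvalue(), end="")
```

Using the csv module rather than string concatenation means comments that contain quotes or commas are escaped correctly.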
Finesim97 commented 5 years ago

948d569 Done.