OstfriesenBI / PredmiRNA

A set of scripts and tools to train a classifier for pre-miRNA Recognition

Feature calculation: Maximal length of the nucleic acid string without stop codons (3ORF) #12

Closed Finesim97 closed 5 years ago

Finesim97 commented 5 years ago

Input: a CSV file with the sequences:

"comment","sequence","realmiRNA"
"mmu-mir-380 MI0000797 Mus musculus miR-380 stem-loop","AAGAUG",1
"mmu-mir-381 MI0000798 Mus musculus miR-381 stem-loop","AAUUC",1

Output: a CSV file with the sequence identifier and, for each stop codon (UAG, UAA, UGA), the length of the sequence up to that stop codon in any of the three possible reading frames.

"comment","lengthToUAG","lengthToUAA","lengthToUGA"
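
The issue does not pin down exactly how the three reading frames are combined, so the following is only a minimal sketch under stated assumptions, not the implementation used in this repository. It assumes that each frame starts at offset 0, 1 or 2, that the length is counted in nucleotides from the frame start up to the first in-frame occurrence of the codon, that a frame without the codon counts all remaining nucleotides, and that the reported value is the maximum over the three frames (following the "maximal length" wording in the title). The function names `length_to_codon`, `max_length_to_codon` and `compute_features` are hypothetical.

```python
import csv

STOP_CODONS = ("UAG", "UAA", "UGA")


def length_to_codon(sequence, codon, frame):
    """Nucleotides read in `frame` (0, 1 or 2) before the first in-frame `codon`.

    Assumption: if the codon never occurs in this frame, every nucleotide
    from the frame start to the end of the sequence is counted.
    """
    for start in range(frame, len(sequence) - 2, 3):
        if sequence[start:start + 3] == codon:
            return start - frame
    return max(len(sequence) - frame, 0)


def max_length_to_codon(sequence, codon):
    """Assumption: report the longest codon-free stretch over the three frames."""
    return max(length_to_codon(sequence, codon, frame) for frame in range(3))


def compute_features(in_handle, out_handle):
    """Read the input CSV and write one row of lengths per sequence."""
    reader = csv.DictReader(in_handle)
    # QUOTE_NONNUMERIC quotes strings and leaves the integer lengths bare,
    # matching the quoting style of the example header above.
    writer = csv.writer(out_handle, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(["comment"] + ["lengthTo" + codon for codon in STOP_CODONS])
    for row in reader:
        # Assumption: DNA-style input (T) is treated as RNA (U).
        sequence = row["sequence"].upper().replace("T", "U")
        lengths = [max_length_to_codon(sequence, codon) for codon in STOP_CODONS]
        writer.writerow([row["comment"]] + lengths)
```

It could be called with two open file handles, for example `compute_features(open("in.csv", newline=""), open("out.csv", "w", newline=""))`, where both file names are placeholders. If the intended feature is instead the minimum over frames or a single value across all stop codons, only `max_length_to_codon` would need to change.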

Source Paper: HuntMi: an efficient and taxon-specific approach in pre-miRNA identification