rkweku / miRador

Plant miRNA identification tool that utilizes a variety of filters to validate predicted miRNAs
GNU General Public License v3.0

Precursor sequence output enhancements #14

Open MikeAxtell opened 3 months ago

MikeAxtell commented 3 months ago

Hi again Reza, I have a few suggestions for enhancing the output of miRador:

  1. The file 'precursors.fa' matches the files 'preAnnotatedCandidates.csv' and 'preAnnotatedCandidates.fa'. However, the README states that 'finalAnnotatedCandidates.csv' and 'finalAnnotatedCandidates.fa' are the definitive output (after annotation). Entry names can differ between the 'finalAnnotatedCandidates' and 'preAnnotatedCandidates' files, and the number of entries differs as well. At present, no file of precursor FASTA sequences matches the 'final' output, so it would be very useful if a "final" file of precursor sequences were written too.
  2. The file 'precursors.fa' always shows the top (+) genomic strand, even when the MIRNA locus is on the (-) genomic strand. Because MIRNA precursors are single-stranded, I suggest writing the reverse complement for loci that are on the (-) genomic strand.
  3. Related to the above, it would be helpful in many circumstances if miRador would list the genomic coordinates of the precursors. At present, only the genomic coordinates of the mature miRNA and the miRNA* are shown (in the 'finalAnnotatedCandidates.csv' file).
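
To make points 2 and 3 concrete, here is a minimal sketch of how a strand-aware precursor record with coordinates in the header could be produced. It is only an illustration, not miRador's actual code: the function names, the header layout, the in-memory `genome` dictionary, and the assumption that coordinates are 1-based and inclusive on the (+) strand are all hypothetical.

```python
# Hypothetical helpers for illustration only; none of these names exist in miRador.

# Complement table for unambiguous DNA bases (upper and lower case).
# Any other character (e.g. IUPAC ambiguity codes) is left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def precursor_record(name, chrom, start, end, strand, genome):
    """Build one FASTA record for a precursor.

    Assumes start/end are 1-based, inclusive coordinates on the (+) strand,
    and that genome maps chromosome names to (+) strand sequences.
    """
    seq = genome[chrom][start - 1:end]
    if strand == "-":
        # Report the single-stranded precursor as it is transcribed.
        seq = reverse_complement(seq)
    # Put the genomic coordinates and strand in the FASTA header.
    header = f">{name} {chrom}:{start}-{end}({strand})"
    return f"{header}\n{seq}\n"


if __name__ == "__main__":
    toy_genome = {"chr1": "AAACCCGGGTTT"}
    print(precursor_record("cand1", "chr1", 2, 7, "+", toy_genome), end="")
    print(precursor_record("cand2", "chr1", 2, 7, "-", toy_genome), end="")
```

For the toy genome above, the (+) strand record carries the sequence `AACCCG` and the (-) strand record carries `CGGGTT`, each with a `chr1:2-7` header. Writing such records for the entries in 'finalAnnotatedCandidates.csv' would also address point 1.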

I hope these suggestions are helpful!

Mike

rkweku commented 3 months ago

Hi Mike,

Thanks so much for all of these suggestions! I'll go through them and try to push an update as soon as possible.

Reza