rkweku / miRador

Plant miRNA identification tool that utilizes a variety of filters to validate predicted miRNAs
GNU General Public License v3.0

miRNAs not found in hairpin sequences, and hairpin sequences not matching what is in final images. #11

Closed Aez35 closed 4 months ago

Aez35 commented 4 months ago

Hi Reza,

I recently ran miRador on Striga hermonthica and rice data. The run reported over 200 candidate miRNAs. When I align them, however, only half of the miRNA sequences can be found in the hairpins. Is there a reason this might happen? Additionally, some of the precursor sequences differ from the associated image file by a few nt here and there. For example, the image I've included shows an "A" where the precursor.fa sequence file has a "U", along with a few other single-nucleotide deviations throughout.

I've included one example, but this has happened to 106 of my miRNAs.

Final annotated miRNA: TCGGACCAGGCTTCATTCCCC

Precursor: GAAAAGCTGAGGGGAATGAAGCCTGGTCCGAGACCGATCATCCGTATGCACGTACGTACTAAAGACAAAAGGTACCGTCTCGAACCAGACAGCATTCCCCACAACATTTC

finalAnnotatedCandidates.csv candidate-323911_1-5p_precursor.pdf

Thanks, Alli

rkweku commented 4 months ago

Hi Alli,

Thanks for writing! I believe both issues have the same explanation: the sequences were identified on the reverse strand, so they need to be reverse-complemented before you will find what you are expecting. If you could confirm it, the cases where you see mismatches were all likely identified on the 'c' strand, so you would have to reverse complement the miRNA sequence to find it in the precursor. In the example you provided, reverse complementing the miRNA sequence TCGGACCAGGCTTCATTCCCC gives GGGGAATGAAGCCTGGTCCGA, which does exist in the precursor you provided. This likely also explains the lack of concordance you observed between the precursor sequence and the RNA-fold image. I also reverse complemented the precursor sequence you provided, and I believe there is then perfect alignment between the reverse-complemented sequence (GAAAUGUUGUGGGGAAUGCUGUCUGGUUCGAGACGGUACCUUUUGUCUUUAGUACGUACGUGCAUACGGAUGAUCGGUCUCGGACCAGGCUUCAUUCCCCUCAGCUUUUC) and the RNA-fold image.

If you look at some of the cases where you see issues and find that this does not explain all of them, please let me know and I'll see if I can help you get to the bottom of it.
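For checking many candidates at once, a small script along these lines may help. It is only a sketch and not part of miRador: the function names `reverse_complement` and `find_orientation` are made up for illustration, and it assumes plain A/C/G/T/U sequences with no ambiguity codes. It reports whether a miRNA occurs in its precursor as written, only after reverse complementing, or not at all.

```python
# Hypothetical helper, not miRador code: classify how a miRNA sits in a precursor.
# Assumes sequences contain only A, C, G, T or U (no IUPAC ambiguity codes).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq: str) -> str:
    """Upper-case the sequence and convert RNA (U) to DNA (T) for comparison."""
    return seq.strip().upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement in the DNA alphabet."""
    return normalize(seq).translate(_COMPLEMENT)[::-1]


def find_orientation(mirna: str, precursor: str) -> str:
    """Return 'forward', 'reverse', or 'absent' for a miRNA/precursor pair."""
    precursor = normalize(precursor)
    if normalize(mirna) in precursor:
        return "forward"
    if reverse_complement(mirna) in precursor:
        return "reverse"
    return "absent"


# The example pair from this thread.
MIRNA = "TCGGACCAGGCTTCATTCCCC"
PRECURSOR = (
    "GAAAAGCTGAGGGGAATGAAGCCTGGTCCGAGACCGATCATCCGTATGCACGTACGTACTAAAGACAAAAGG"
    "TACCGTCTCGAACCAGACAGCATTCCCCACAACATTTC"
)

print(reverse_complement(MIRNA))           # GGGGAATGAAGCCTGGTCCGA
print(find_orientation(MIRNA, PRECURSOR))  # reverse
```

Any pair that comes back as "absent" would be a case that strand orientation does not explain, and those are the ones worth sending over.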

Best, Reza

Aez35 commented 4 months ago

Hi Reza,

Just after posting, I tried taking the reverse complement of the miRNA, which didn't work for my purposes because the reads did not map to the hairpin. After reading your comment, I took the reverse complement of the hairpin instead, and that worked much better! Thank you for your quick response.

Alli