mhalushka / miRge3.0

Comprehensive analysis of small RNA sequencing data
MIT License

question regarding crThreshold #23

Closed fmughal closed 2 years ago

fmughal commented 2 years ago

Hello,

I was curious to know how you (the authors) settled on a value of 0.1 as a default for crThreshold. Is this based on previous literature or was it based on simulations you performed? Just to confirm, if the canonical to isomiR ratio falls below crThreshold, does the canonical miRNA also get dropped from the mapped df or is it just the isomiRs that get dropped for that specific miRNA?

Thank you.

mhalushka commented 2 years ago

Hi. We came to the 0.1 value after reviewing the canonical/isomiR ratio of known, robust microRNAs and comparing them to some of the questionable microRNAs in which most of the reads were isomiRs. Almost all robust microRNAs are >70% canonical reads (matching the genomic sequence, even if different lengths). Our original strategy in miRge 1.0 was to require at least two canonical reads, but we found that this let some "miRNAs" with >95% isomiRs slip through when sequencing depth was very high. I think we evaluated 0.02, 0.05, and 0.1 thresholds, and settled on 0.1 for this version of miRge. If a miRNA drops below 0.1, all reads for that miRNA are dropped (canonical and isomiR). If you are interested in trying different thresholds, please let us know what you find.
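For anyone who wants to experiment with thresholds, here is a minimal sketch of the filtering rule described above. It is not the miRge3.0 implementation. The function names, the input layout, and the example counts are all hypothetical, and it assumes the ratio is canonical reads divided by total reads (canonical plus isomiR) for each miRNA.

```python
# Illustrative sketch only; not taken from the miRge3.0 code base.
# Assumption: ratio = canonical / (canonical + isomiR) per miRNA.

def passes_min_canonical(canonical, min_reads=2):
    """Older-style rule as described in the thread: require a minimum number of canonical reads."""
    return canonical >= min_reads


def filter_by_canonical_ratio(counts, cr_threshold=0.1):
    """Keep only miRNAs whose canonical-read fraction is at least cr_threshold.

    counts maps a miRNA name to a (canonical_reads, isomir_reads) tuple.
    A miRNA that falls below the threshold is dropped entirely,
    canonical and isomiR reads alike.
    """
    kept = {}
    for mirna, (canonical, isomir) in counts.items():
        total = canonical + isomir
        if total == 0:
            continue  # no reads at all, nothing to keep
        if canonical / total >= cr_threshold:
            kept[mirna] = (canonical, isomir)
    return kept


# Made-up counts for two hypothetical miRNAs.
example = {
    "miR-A": (800, 200),  # 80% canonical
    "miR-B": (3, 97),     # 3% canonical, mostly isomiRs
}

kept = filter_by_canonical_ratio(example, cr_threshold=0.1)
old_rule_kept = [name for name, (c, _) in example.items() if passes_min_canonical(c)]
```

In this toy example, "miR-B" passes the two-canonical-read rule because it has three canonical reads, but the ratio filter removes it along with all of its reads. Re-running with `cr_threshold` set to 0.02 or 0.05 shows how the other thresholds mentioned above would behave on the same counts.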

fmughal commented 2 years ago

This is tremendously helpful. Thank you Marc!