MikeAxtell / ShortStack

ShortStack: Comprehensive annotation and quantification of small RNA genes
MIT License

Increase multi-mapping positions #76

Closed DiegoZavallo closed 3 years ago

DiegoZavallo commented 6 years ago

Hi Mike, I have sRNA-Seq data that I want to map without a specific --locifile, so that ShortStack creates a Shortstack.gff3 of sRNA loci. Since many sRNAs map to TEs or repeat regions, I expect that several of them will have more than 50 possible positions in the genome, so if I run ShortStack with the --mmap u option they will be discarded, right? I was wondering whether I can combine the --bowtie_m option, set to a higher number of allowed multi-mapping positions such as 100 or 500, with the --mmap u option. That way I could find repeat regions with more than 50 placements while still using the "unique-seeded guide" option.

Thanks

Best

Diego

MikeAxtell commented 6 years ago

Hi Diego,

Thanks for your message. Unfortunately, there is no option to override the 50-placement limit when running in --mmap u mode, unless you edit the source code yourself. An alternative to editing the source code would be to take all the reads that exceeded the 50 limit after a ShortStack alignment and re-align them using bowtie alone. This could be done by parsing the BAM files and then re-merging.
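As a rough illustration of the "parse the BAM files" step, here is a minimal Python sketch. It is not part of ShortStack. It assumes the alignments have been converted to SAM text (for example with `samtools view -h`), and that reads left unplaced because they exceeded the multi-mapping limit carry an `XY:Z:M` tag; that tag value is an assumption to check against the documentation for your ShortStack version. The function names are hypothetical.

```python
def extract_flagged_reads(sam_lines, tag="XY:Z:M"):
    """Yield (read_name, sequence) for SAM records that carry `tag`.

    The default tag is an ASSUMPTION about how ShortStack marks reads that
    exceeded the multi-mapping limit; verify it for your version.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        # Optional tags start at column 12 (index 11).
        if tag in fields[11:]:
            # Column 1 is the read name, column 10 the read sequence.
            # Unmapped records store the sequence as it was sequenced.
            yield fields[0], fields[9]


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


if __name__ == "__main__":
    import sys
    # Example: samtools view -h alignments.bam | python this_script.py > multi.fa
    sys.stdout.write(to_fasta(extract_flagged_reads(sys.stdin)))
```

The resulting FASTA file could then be aligned with bowtie on its own, using whatever reporting settings suit the analysis, and the new alignments merged back with the original BAM file (for example with `samtools merge`).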

Yes, the extreme multi-mappers are definitely a conundrum. We designed the 50 limit because in our testing we found that placements of reads with higher numbers of possible placements were always just random guesses. But there are certainly situations where one would like to guess.

Hope this helps,

Mike

-- Michael J. Axtell, Ph.D. Professor of Biology Penn State University http://sites.psu.edu/axtell

DiegoZavallo commented 6 years ago

Hi Mike, I understand. So maybe I could remap all the data with just --bowtie_m all, because what I want is a .gff3 file of sRNA loci. If I map without the multi-mapping restriction, the sRNAs that map near centromeric regions could form sRNA loci (clusters), and then I can use those loci to map my experiments with the --mmap u option. Am I thinking correctly?

Diego

MikeAxtell commented 6 years ago

Yes I think that sounds right. Good luck!
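If the loci from the unrestricted run are to be reused in a later run through --locifile, the GFF3 records may first need to be turned into a coordinate list. The Python sketch below is only an illustration. It assumes the target is a tab-delimited file with `Chr:Start-Stop` in the first column and a locus name in the second, which should be confirmed against the ShortStack documentation for your version. The function name is hypothetical.

```python
def gff3_to_loci(gff3_lines):
    """Yield 'Chr:Start-Stop<TAB>Name' lines from GFF3 feature lines.

    The output layout is an ASSUMPTION about what --locifile expects;
    check the documentation for your ShortStack version.
    """
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GFF3 feature line
        chrom, start, end, attrs = fields[0], fields[3], fields[4], fields[8]
        coords = f"{chrom}:{start}-{end}"
        # Use the ID attribute as the locus name when one is present.
        name = ""
        for attr in attrs.split(";"):
            key, _, value = attr.partition("=")
            if key.strip() == "ID":
                name = value.strip()
                break
        # Fall back to the coordinates if the record has no ID.
        yield f"{coords}\t{name or coords}"


if __name__ == "__main__":
    import sys
    # Example: python this_script.py < Shortstack.gff3 > loci.txt
    for locus in gff3_to_loci(sys.stdin):
        print(locus)
```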
