MikeAxtell / ShortStack

ShortStack: Comprehensive annotation and quantification of small RNA genes
MIT License

--pad 200 issue #142

Closed: KJeynesCupper closed this issue 9 months ago

KJeynesCupper commented 10 months ago

I noticed that the documentation in the new update states that --pad defaults to 75, the same as before, but the actual default appears to be set to 200?

Also, the new update has removed the --mismatch and --mmap n parameters? Previously, I liked that the number of allowed mismatches and the handling of multi-mapped reads were customizable. What was the reason behind these changes, and is there any possibility of these being reintroduced?

Thank you for creating a great tool!

Katie

MikeAxtell commented 10 months ago

Thanks Katie!

I just committed edits to the README and the ShortStack script that fix the --pad default documentation issue. --pad is now correctly described as having a default of 200, not 75. Thank you for pointing that out. Those commits will be included in the next release.

For the --mismatch and --mmap question: Perhaps a little paternalistic of me, but those were deliberate design choices. In my testing, allowing 0 or 1 mismatches with bowtie "strata" always performed the best, so I decided it is best to hard-code those settings in ShortStack's alignment wrapper. Same for getting rid of --mmap n: ignoring multi-mapped reads altogether just seemed like a dangerous option for most users. For instance, most of the well-conserved and important plant microRNAs would not be mapped in that setting, because they are encoded by multiple paralogous genes in most genomes.
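The point about paralogous genes can be illustrated with a toy sketch. It is not ShortStack code: the read names and loci are hypothetical, and `drop_multimapped` is an illustrative helper that mimics a "unique mappers only" policy.

```python
# Toy illustration only: read names and loci below are hypothetical.
# Each read maps to the list of genomic loci where it aligns equally well.
alignments = {
    # A read from a conserved microRNA encoded by three paralogous genes.
    "conserved_mirna_read": ["chr1:1000", "chr3:5200", "chr5:880"],
    # A read from a single-copy locus.
    "single_copy_read": ["chr2:4400"],
}


def drop_multimapped(alignments):
    """Keep only reads with exactly one alignment (unique mappers)."""
    return {read: loci for read, loci in alignments.items() if len(loci) == 1}


kept = drop_multimapped(alignments)
lost = sorted(set(alignments) - set(kept))

print("kept:", sorted(kept))
print("lost:", lost)
```

Under this policy the read from the multi-copy microRNA family is discarded entirely, even though it aligns perfectly, which is the failure mode described above.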

ShortStack will accept any correctly formatted, sorted and indexed BAM file (using option --bamfile). So, you can always do your own sRNA-seq alignments with your methods of choice, and then use ShortStack for sRNA cluster discovery and description.
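A minimal sketch of that workflow, assuming you already have a BAM file from your own aligner. Only --bamfile and --pad come from this thread; the file names, the `build_commands` helper, and the --genomefile and --outdir options are assumptions to check against `ShortStack --help` for your version. The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def build_commands(raw_bam, genome_fasta, outdir, pad=200):
    """Build (but do not run) commands to sort, index, and analyze a BAM file.

    All paths are hypothetical placeholders. The --genomefile and --outdir
    flags are assumptions; confirm them with `ShortStack --help`.
    """
    sorted_bam = raw_bam.rsplit(".", 1)[0] + ".sorted.bam"
    return [
        # Coordinate-sort the alignments produced by your own aligner.
        ["samtools", "sort", "-o", sorted_bam, raw_bam],
        # Index the sorted BAM file.
        ["samtools", "index", sorted_bam],
        # Hand the prepared BAM file to ShortStack for cluster discovery.
        ["ShortStack", "--genomefile", genome_fasta,
         "--bamfile", sorted_bam, "--outdir", outdir, "--pad", str(pad)],
    ]


commands = build_commands("my_alignments.bam", "genome.fa", "shortstack_out")
for cmd in commands:
    print(shlex.join(cmd))
```

To execute the steps, each list could be passed to `subprocess.run(cmd, check=True)` in order.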

Thanks for the feedback.

Mike

KJeynesCupper commented 10 months ago

No worries, I was messing around with the --pad settings and noticed the discrepancy.

Thank you for going into detail. I can understand why you chose to implement this as standard! I have indeed been mapping with Bowtie separately and then using the ShortStack analysis pipeline, which has been working well for my purposes.

Cheers, Katie