At the moment, the default setting for the -m parameter is often too small to remove many sequencing artifacts (e.g., primer dimers). Deriving the default from the actual read length of the dataset would seem more robust (say, -m 100 for a dataset with 150 bp reads, possibly implemented as a percentage of the expected read length?). Many users aren't aware of how important this parameter's value is, and the resulting sequence output can suffer as a consequence.
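As a rough illustration of the percentage-based default suggested above, here is a minimal Python sketch. The 2/3 fraction, the helper name, and the placeholder tool name `trim-tool` are assumptions for illustration only, not part of any existing implementation; only the -m parameter itself comes from the suggestion above.

```python
def minimum_length_for(read_length: int, fraction: float = 2 / 3) -> int:
    """Derive a -m cutoff as a fraction of the expected read length.

    The 2/3 default is an illustrative choice: for 150 bp reads it yields
    -m 100, matching the example in the post above.
    """
    return max(1, int(read_length * fraction))


read_length = 150
m_value = minimum_length_for(read_length)  # 100 for 150 bp reads

# Assemble the command line, passing the derived value via the existing -m
# parameter ("trim-tool" and the input file name are placeholders).
cmd = ["trim-tool", "-m", str(m_value), "reads.fastq"]
print(" ".join(cmd))
```

Whether the fraction lives in the tool as a new default or stays a user-side wrapper like this, the point is the same: the cutoff scales with the read length instead of being a fixed small number.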