Closed d4straub closed 3 months ago
nf-core lint overall result: Passed :white_check_mark: :warning:
Posted for pipeline commit f15c9dd
+| ✅ 191 tests passed |+
#| ❔ 6 tests were ignored |#
!| ❗ 1 tests had warnings |!
Thanks!
Thanks for your review! Unfortunately, I seem to have merged the PR just before your comment. I actually do not know which methods require a minimum length and which do not; only DADA2 seems certain. I considered two solutions: (1) adding a filter after ITSx to control its output for all downstream processes, or (2) adding a length filter before DADA2. My impression was that short sequences are generally not advisable to classify (and I didn't know that sintax has a built-in filter), so I decided on option (1). Would you have preferred option (2)?
Yes, I realized I was too slow when posting my comment :D I'm fine with this solution too.
Sequences below 50 bp cannot be used by DADA2, and other k-mer-based classification methods most likely also fail with sequences that are too short (see https://github.com/benjjneb/dada2/issues/601). This can usually be prevented easily with --min_len_asv, but not when ITSx is used (because ITSx runs after --min_len_asv is applied). Therefore, another length filtering step with a minimum length of 50 bp is added here. The minimum value of 50 is not exposed as a parameter but can be adjusted via a config, if required.

Closes https://github.com/nf-core/ampliseq/issues/704.
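To illustrate the idea of the extra filtering step, here is a minimal Python sketch of a minimum-length filter applied to FASTA records. It is not the pipeline's actual implementation; the function names `parse_fasta` and `filter_min_length` and the constant `MIN_LEN` are hypothetical and only mirror the 50 bp threshold described above.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumed default, mirroring the 50 bp minimum described in this PR.
MIN_LEN = 50


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def filter_min_length(
    records: Iterable[Tuple[str, str]], min_len: int = MIN_LEN
) -> List[Tuple[str, str]]:
    """Keep only records whose sequence has at least min_len bases."""
    return [(h, s) for h, s in records if len(s) >= min_len]


# Hypothetical example input: one sequence that is too short, two that pass.
example = [
    ">asv_short", "A" * 30,
    ">asv_exact", "A" * 25, "C" * 25,
    ">asv_long", "G" * 120,
]
kept = filter_min_length(parse_fasta(example))
kept_ids = [header for header, _ in kept]
```

Sequences of exactly 50 bp are kept here, since only sequences below 50 bp are described as unusable; a different threshold can be passed via `min_len`.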
PR checklist
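- Make sure your code lints (nf-core lint).
- Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
- Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
- Usage documentation in docs/usage.md is updated.
- Output documentation in docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).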