Taxonomic tree and taxonomic annotation from phylogenetic placement are now picked up and used in downstream (QIIME2) analysis.
The additional code in `FORMAT_PPLACETAX` uses only the taxonomic annotation from Gappa that
- has the lowest LWR,
- if multiple annotations share that LWR:
  - if they are identical, it chooses the first one,
  - if they are not identical, it chooses the one with fewer taxonomic entries,
  - if they have the same number of taxonomic entries, it removes entries until they are identical,
  - if all entries are removed, it adds `NA`.
I searched for, tested, and dismissed several ready-made functions for this, such as those from R's stringi and stringr packages, so I wrote custom code instead.
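
For reviewers, the selection rules above can be sketched roughly as follows. This is an illustrative Python sketch, not the code in `FORMAT_PPLACETAX`. It assumes that each query arrives as a list of `(LWR, taxopath)` pairs, that taxopaths are `;`-separated, and that "removing entries" means dropping ranks from the end of the path. The function name `pick_taxonomy` is hypothetical.

```python
from typing import List, Tuple


def pick_taxonomy(candidates: List[Tuple[float, str]], sep: str = ";") -> str:
    """Pick one taxopath for a single query (hypothetical helper).

    Assumptions: `candidates` holds (LWR, taxopath) pairs for one query,
    and entries are removed from the end of the path when trimming.
    """
    if not candidates:
        return "NA"

    # Keep only the annotations with the lowest LWR.
    lowest = min(lwr for lwr, _ in candidates)
    paths = [
        [entry for entry in path.split(sep) if entry]
        for lwr, path in candidates
        if lwr == lowest
    ]

    if all(p == paths[0] for p in paths):
        # One annotation, or several identical ones: take the first.
        chosen = paths[0]
    else:
        # Prefer the annotations with the fewest taxonomic entries.
        fewest = min(len(p) for p in paths)
        shortest = [p for p in paths if len(p) == fewest]
        # Same number of entries but still different:
        # drop trailing entries until all of them agree.
        while shortest[0] and any(p != shortest[0] for p in shortest):
            shortest = [p[:-1] for p in shortest]
        chosen = shortest[0]

    # Nothing left after trimming: report NA.
    return sep.join(chosen) if chosen else "NA"


if __name__ == "__main__":
    # Two annotations tie on the lowest LWR and differ only in the last entry.
    print(pick_taxonomy([(0.1, "A;B;C"), (0.1, "A;B;D"), (0.8, "A;B;C;E")]))
```

With the example input, the two tied annotations are trimmed to their shared leading entries, so the call prints `A;B`.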
PR checklist
[x] This comment contains a description of changes (with reason).
[x] If you've fixed a bug or added code that should be tested, add tests!
[ ] If you've added a new tool - have you followed the pipeline conventions in the contribution docs
[ ] If necessary, also make a PR on the nf-core/ampliseq branch on the nf-core/test-datasets repository.
[x] Make sure your code lints (`nf-core lint`).
[x] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
[ ] Usage Documentation in `docs/usage.md` is updated.
[x] Output Documentation in `docs/output.md` is updated.
[x] `CHANGELOG.md` is updated.
[x] `README.md` is updated (including new tool citations and authors/contributors).
Closes https://github.com/nf-core/ampliseq/issues/561