nf-core / ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2
https://nf-co.re/ampliseq
MIT License

Added QIIME2 custom reference database support. #667

Closed MatthewJM96 closed 11 months ago

MatthewJM96 commented 11 months ago

Added support for using custom reference databases in QIIME2 taxonomic classification via the --qiime_ref_tax_custom flag. This brings QIIME2 taxonomic classification in line with Kraken2 and DADA2, which already allow the same.

Testing should probably be added. I could do with some advice on how to make this possible with a reduced database that meets the requirements on what can be passed to the flag (it must be a directory or tarball, as in the Kraken2 implementation).
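To make the input requirement above concrete, here is a minimal sketch of how a value passed to --qiime_ref_tax_custom could be classified as a directory or a tarball. It is written in Python purely for illustration; the function name `classify_reference_db` is hypothetical and is not part of nf-core/ampliseq, and the pipeline's own check may well look different.

```python
import tarfile
import tempfile
from pathlib import Path


def classify_reference_db(path: str) -> str:
    """Return 'directory' or 'tarball' for a custom reference database path.

    Hypothetical helper for illustration only; not the pipeline's implementation.
    """
    p = Path(path)
    if p.is_dir():
        return "directory"
    # tarfile.is_tarfile() inspects the file content, so compressed
    # archives such as .tar.gz are recognised as well.
    if p.is_file() and tarfile.is_tarfile(str(p)):
        return "tarball"
    raise ValueError(f"{path} is neither a directory nor a tar archive")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny stand-in database directory and pack it into a tarball.
        db_dir = Path(tmp) / "custom_db"
        db_dir.mkdir()
        (db_dir / "taxonomy.tsv").write_text("id\ttaxon\n")
        archive = Path(tmp) / "custom_db.tar.gz"
        with tarfile.open(str(archive), "w:gz") as tar:
            tar.add(str(db_dir), arcname="custom_db")

        print(classify_reference_db(str(db_dir)))
        print(classify_reference_db(str(archive)))
```

A reduced test database would only need to satisfy one of these two forms, which is why a small directory or archive is enough for a test fixture.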


github-actions[bot] commented 11 months ago

nf-core lint overall result: Passed :white_check_mark: :warning:

Posted for pipeline commit 6b71e4d

+| ✅ 154 tests passed       |+
#| ❔   3 tests were ignored |#
!| ❗   2 tests had warnings |!
### :heavy_exclamation_mark: Test warnings:

* [readme](https://nf-co.re/tools-docs/lint_tests/readme.html) - README did not have a Nextflow minimum version badge.
* [schema_lint](https://nf-co.re/tools-docs/lint_tests/schema_lint.html) - Parameter `input` is not defined in the correct subschema (input_output_options)

### :grey_question: Tests ignored:

* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File is ignored: `conf/igenomes.config`
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - File ignored due to lint config: `.gitattributes`
* [actions_ci](https://nf-co.re/tools-docs/lint_tests/actions_ci.html) - actions_ci

### :white_check_mark: Tests passed:

* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.gitattributes`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.gitignore`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.nf-core.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.editorconfig`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.prettierignore`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.prettierrc.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `CHANGELOG.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `CITATIONS.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `CODE_OF_CONDUCT.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `CODE_OF_CONDUCT.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `LICENSE` or `LICENSE.md` or `LICENCE` or `LICENCE.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `nextflow_schema.json`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `nextflow.config`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `README.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/.dockstore.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/CONTRIBUTING.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/ISSUE_TEMPLATE/bug_report.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/ISSUE_TEMPLATE/config.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/ISSUE_TEMPLATE/feature_request.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/PULL_REQUEST_TEMPLATE.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/workflows/branch.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/workflows/ci.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/workflows/linting_comment.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/workflows/linting.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `assets/email_template.html`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `assets/email_template.txt`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `assets/sendmail_template.txt`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `assets/nf-core-ampliseq_logo_light.png`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `conf/modules.config`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `conf/test.config`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `conf/test_full.config`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `docs/images/nf-core-ampliseq_logo_light.png`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `docs/images/nf-core-ampliseq_logo_dark.png`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `docs/output.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `docs/README.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `docs/README.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `docs/usage.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `lib/nfcore_external_java_deps.jar`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `lib/NfcoreTemplate.groovy`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `lib/Utils.groovy`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `lib/WorkflowMain.groovy`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `main.nf`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `assets/multiqc_config.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `conf/base.config`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/workflows/awstest.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `.github/workflows/awsfulltest.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `lib/WorkflowAmpliseq.groovy`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `modules.json`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File found: `pyproject.toml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `Singularity`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `parameters.settings.json`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `pipeline_template.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `.nf-core.yaml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `bin/markdown_to_html.r`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `conf/aws.config`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `.github/workflows/push_dockerhub.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `.github/ISSUE_TEMPLATE/bug_report.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `.github/ISSUE_TEMPLATE/feature_request.md`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `docs/images/nf-core-ampliseq_logo.png`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `.markdownlint.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `.yamllint.yml`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `lib/Checks.groovy`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `lib/Completion.groovy`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `lib/Workflow.groovy`
* [files_exist](https://nf-co.re/tools-docs/lint_tests/files_exist.html) - File not found check: `.travis.yml`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `manifest.name`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `manifest.nextflowVersion`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `manifest.description`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `manifest.version`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `manifest.homePage`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `timeline.enabled`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `trace.enabled`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `report.enabled`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `dag.enabled`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `process.cpus`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `process.memory`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `process.time`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `params.outdir`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `params.input`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `params.validationShowHiddenParams`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `params.validationSchemaIgnoreParams`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `manifest.mainScript`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `timeline.file`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `trace.file`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `report.file`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable found: `dag.file`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable (correctly) not found: `params.nf_required_version`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable (correctly) not found: `params.container`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable (correctly) not found: `params.singleEnd`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable (correctly) not found: `params.igenomesIgnore`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable (correctly) not found: `params.name`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable (correctly) not found: `params.enable_conda`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config ``timeline.enabled`` had correct value: ``true``
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config ``report.enabled`` had correct value: ``true``
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config ``trace.enabled`` had correct value: ``true``
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config ``dag.enabled`` had correct value: ``true``
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config ``manifest.name`` began with ``nf-core/``
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable ``manifest.homePage`` began with https://github.com/nf-core/
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config ``dag.file`` ended with ``.html``
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config variable ``manifest.nextflowVersion`` started with >= or !>=
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config ``manifest.version`` ends in ``dev``: ``2.8.0dev``
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config `params.custom_config_version` is set to `master`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Config `params.custom_config_base` is set to `https://raw.githubusercontent.com/nf-core/configs/master`
* [nextflow_config](https://nf-co.re/tools-docs/lint_tests/nextflow_config.html) - Lines for loading custom profiles found
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.prettierrc.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `CODE_OF_CONDUCT.md` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `LICENSE` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/.dockstore.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/CONTRIBUTING.md` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/ISSUE_TEMPLATE/bug_report.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/ISSUE_TEMPLATE/config.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/ISSUE_TEMPLATE/feature_request.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/PULL_REQUEST_TEMPLATE.md` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/workflows/branch.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/workflows/linting_comment.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.github/workflows/linting.yml` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `assets/email_template.html` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `assets/email_template.txt` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `assets/sendmail_template.txt` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `assets/nf-core-ampliseq_logo_light.png` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `docs/images/nf-core-ampliseq_logo_light.png` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `docs/images/nf-core-ampliseq_logo_dark.png` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `docs/README.md` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `lib/nfcore_external_java_deps.jar` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `lib/NfcoreTemplate.groovy` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.gitignore` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `.prettierignore` matches the template
* [files_unchanged](https://nf-co.re/tools-docs/lint_tests/files_unchanged.html) - `pyproject.toml` matches the template
* [actions_awstest](https://nf-co.re/tools-docs/lint_tests/actions_awstest.html) - '.github/workflows/awstest.yml' is triggered correctly
* [actions_awsfulltest](https://nf-co.re/tools-docs/lint_tests/actions_awsfulltest.html) - `.github/workflows/awsfulltest.yml` is triggered correctly
* [actions_awsfulltest](https://nf-co.re/tools-docs/lint_tests/actions_awsfulltest.html) - `.github/workflows/awsfulltest.yml` does not use `-profile test`
* [readme](https://nf-co.re/tools-docs/lint_tests/readme.html) - README Zenodo placeholder was replaced with DOI.
* [pipeline_todos](https://nf-co.re/tools-docs/lint_tests/pipeline_todos.html) - No TODO strings found
* [pipeline_name_conventions](https://nf-co.re/tools-docs/lint_tests/pipeline_name_conventions.html) - Name adheres to nf-core convention
* [template_strings](https://nf-co.re/tools-docs/lint_tests/template_strings.html) - Did not find any Jinja template strings (266 files)
* [schema_lint](https://nf-co.re/tools-docs/lint_tests/schema_lint.html) - Schema lint passed
* [schema_lint](https://nf-co.re/tools-docs/lint_tests/schema_lint.html) - Schema title + description lint passed
* [schema_params](https://nf-co.re/tools-docs/lint_tests/schema_params.html) - Schema matched params returned from nextflow config
* [system_exit](https://nf-co.re/tools-docs/lint_tests/system_exit.html) - No `System.exit` calls found
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: clean-up.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: linting_comment.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: fix-linting.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: branch.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: linting.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: ci.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: awsfulltest.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: release-announcments.yml
* [actions_schema_validation](https://nf-co.re/tools-docs/lint_tests/actions_schema_validation.html) - Workflow validation passed: awstest.yml
* [merge_markers](https://nf-co.re/tools-docs/lint_tests/merge_markers.html) - No merge markers found in pipeline files
* [modules_json](https://nf-co.re/tools-docs/lint_tests/modules_json.html) - Only installed modules found in `modules.json`
* [multiqc_config](https://nf-co.re/tools-docs/lint_tests/multiqc_config.html) - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
* [multiqc_config](https://nf-co.re/tools-docs/lint_tests/multiqc_config.html) - 'assets/multiqc_config.yml' contains a matching 'report_comment'.
* [multiqc_config](https://nf-co.re/tools-docs/lint_tests/multiqc_config.html) - 'assets/multiqc_config.yml' contains 'export_plots: true'.
* [modules_structure](https://nf-co.re/tools-docs/lint_tests/modules_structure.html) - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

### Run details

* nf-core/tools version 2.10
* Run at `2023-12-19 09:22:05`

MatthewJM96 commented 11 months ago

> Thanks for that PR! Looks good to me, see comments below.
>
> I think a proper test file could be created with greengenes85, using the files at https://github.com/nf-core/ampliseq/blob/1067c7c2c88861635955f32adc7a2682885fe27a/conf/ref_databases.config#L305 and uploaded to https://github.com/nf-core/test-datasets/tree/ampliseq/testdata (maybe into a new folder such as "DB") if small enough. It could then be activated in e.g. https://github.com/nf-core/ampliseq/blob/dev/conf/test_reftaxcustom.config which would require an update of https://github.com/nf-core/ampliseq/blob/dev/tests/pipeline/reftaxcustom.nf.test and https://github.com/nf-core/ampliseq/blob/1067c7c2c88861635955f32adc7a2682885fe27a/tests/pipeline/reftaxcustom.nf.test.snap#L16

I've begun putting some testing in. Since a few input forms are supported, it might be good to create a couple of smaller tests like reftaxcustom just for QIIME2, especially as reftaxcustom currently disables downstream analysis after DADA2 and Kraken2. I could similarly disable some downstream steps for QIIME2, but right now that is controlled by run_qiime2.

In our use case, this work is a precursor to adding consensus BLAST processing in QIIME2, which I'll try to get into working order for a later PR. Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?

d4straub commented 11 months ago

> I've begun putting some testing in

Yes, separate tests are fine if it's not really possible to fit them into existing tests.

> Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?

run_qiime2 is used for taxonomic classification here and for downstream analysis here. I guess it could make sense to separate those (maybe here) into run_qiime2_downstreamanaylsis and run_qiime2_taxonomy, and potentially add another one for blast consensus (or keep blast consensus and scikit-learn in "taxonomy"). Just make it as intuitive and as easy to maintain as possible (keep checks to a minimum).

erikrikarddaniel commented 11 months ago

> [...]
>
> Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?
>
> run_qiime2 is used for taxonomic classification here and for downstream analysis here. I guess it could make sense to separate those (maybe here) into run_qiime2_downstreamanaylsis and run_qiime2_taxonomy, and potentially add another one for blast consensus (or keep blast consensus and scikit-learn in "taxonomy"). Just make it as intuitive and as easy to maintain as possible (keep checks to a minimum).

The idea would be that some users might want QIIME's taxonomy but not all the rest? If so, why not keep --skip_qiime but allow it to be combined with --qiime_taxonomy instead of the quite long params you suggest?

MatthewJM96 commented 11 months ago

> I've begun putting some testing in
>
> Yes, separate tests are fine if it's not really possible to fit them into existing tests.

I've put a test with a tarball into the existing reftaxcustom case, and added a qiimecustom test that uses a file pair. I think that's a reasonable balance for testing the different input patterns.

> Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?
>
> run_qiime2 is used for taxonomic classification here and for downstream analysis here. I guess it could make sense to separate those (maybe here) into run_qiime2_downstreamanaylsis and run_qiime2_taxonomy, and potentially add another one for blast consensus (or keep blast consensus and scikit-learn in "taxonomy"). Just make it as intuitive and as easy to maintain as possible (keep checks to a minimum).
>
> The idea would be that some users might want QIIME's taxonomy but not all the rest? If so, why not keep --skip_qiime but allow it to be combined with --qiime_taxonomy instead of the quite long params you suggest?

I've added a --skip_qiime_downstream flag and split the calculation into a run_qiime2 that applies to the downstream analysis and a run_qiime2_taxonomy that applies only to the taxonomy stage. The test cases make use of this.
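As a rough illustration of the split described above, the two switches could be derived along these lines. This is a sketch in Python, not the pipeline's Nextflow code: the `Params` class and the `qiime2_stages` function are hypothetical, and the exact conditions (for example, that taxonomy only runs when a custom reference is given) are assumptions rather than what the PR implements.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Params:
    """Stand-in for the relevant pipeline parameters (illustrative only)."""
    skip_qiime: bool = False                    # --skip_qiime
    skip_qiime_downstream: bool = False         # --skip_qiime_downstream
    qiime_ref_tax_custom: Optional[str] = None  # --qiime_ref_tax_custom


def qiime2_stages(params: Params) -> Dict[str, bool]:
    """Derive which QIIME2 stages run; the conditions are assumed, not confirmed."""
    # Assumption: taxonomy needs QIIME2 enabled and a reference database.
    run_qiime2_taxonomy = (
        not params.skip_qiime and params.qiime_ref_tax_custom is not None
    )
    # Assumption: downstream analysis can be switched off on its own.
    run_qiime2 = not params.skip_qiime and not params.skip_qiime_downstream
    return {
        "run_qiime2_taxonomy": run_qiime2_taxonomy,
        "run_qiime2": run_qiime2,
    }


if __name__ == "__main__":
    # Taxonomy only: classify against a custom database, skip downstream steps.
    print(qiime2_stages(Params(skip_qiime_downstream=True,
                               qiime_ref_tax_custom="custom_db.tar.gz")))
```

Keeping the two booleans independent means a test profile can exercise taxonomic classification without pulling in the downstream QIIME2 steps.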

MatthewJM96 commented 11 months ago

> Hi Matthew,
>
> looks great. Did you run your newly added tests, and the test itself, and make sure the files look fine? If not I'll have a look before I give my ok here.

I ran them manually and also added the new test to the CI pipeline, where it looks like it passed. I couldn't figure out a good way to snapshot the QIIME2 taxonomic classification, as I guess the algorithm isn't deterministic, so I just check that the classifier and the taxonomy TSV report are produced.
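The existence check mentioned above, as opposed to a content snapshot, might look roughly like this. It is a Python sketch of the idea only; the real test lives in an nf-test file, and the function name `missing_outputs` and the two relative paths in the demo are hypothetical placeholders rather than the pipeline's actual output names.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_outputs(outdir: str, expected: Iterable[str]) -> List[str]:
    """Return the expected relative paths that are absent or empty under outdir.

    Only presence and non-zero size are checked, so results from a
    non-deterministic step do not need to match a stored snapshot.
    """
    root = Path(outdir)
    missing = []
    for rel in expected:
        candidate = root / rel
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Hypothetical output names, used only to demonstrate the check.
    expected = ["qiime2/taxonomy/taxonomy.tsv", "qiime2/taxonomy/classifier.qza"]
    with tempfile.TemporaryDirectory() as outdir:
        report = Path(outdir) / expected[0]
        report.parent.mkdir(parents=True)
        report.write_text("Feature ID\tTaxon\n")
        # The classifier file was never written, so it is reported as missing.
        print(missing_outputs(outdir, expected))
```

The trade-off is that such a test catches a stage that silently produces nothing, but not a stage that produces wrong classifications.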

Edit: I caught one more bad assertion. One of the failing tests in this round (doubleprimers) looks spurious; it succeeds on my end, so hopefully it also succeeds in this rerun.

d4straub commented 11 months ago

Thanks. I have the feeling I should also run a few tests just to make sure. I have scheduled some time tomorrow, so I expect to approve the PR then.

d4straub commented 11 months ago

I tested, and found:

* (1) there is something wrong with phyloseq; I attempted to fix it in https://github.com/nf-core/ampliseq/pull/676
* (2) when running `nextflow run MatthewJM96/ampliseq -r qiime2_custom_db -profile test_qiimecustom,singularity --outdir result_test_qiimecustom_qiime2_custom_db_23-12-12`, I found that `result_test_qiimecustom_qiime2_custom_db_23-12-12/summary_report/summary_report.html` didn't contain the section about the taxonomy; https://github.com/nf-core/ampliseq/pull/673 should fix that.

I think the ideal sequence should be:

* ~(1) After https://github.com/nf-core/ampliseq/pull/676 is merged~
* ~(2) integrate dev into that PR~
* ~(3) revert https://github.com/nf-core/ampliseq/pull/667/commits/4464c38cef7be3e9309c3d036fda7172aba130a4~
* (4) all should be fine (check if summary_report.html is fine with -profile test_qiimecustom) and we merge.

Sorry that you ran into that phyloseq bug, I hope it works now.

edit: some points are solved above

edit2: summary_report.html with -profile test_qiimecustom does not contain the taxonomy section yet. Not sure what is preventing it...

d4straub commented 11 months ago

I found the problem and fixed the report. Once all tests have passed, I'll merge it if you do not have any objections.

MatthewJM96 commented 10 months ago

> I found the problem and fixed the report. Once all tests have passed, I'll merge it if you do not have any objections.

Sorry for no reply, was on leave! Thanks for looking at those last things and the advice along the way.