Closed MatthewJM96 closed 11 months ago
nf-core lint
overall result: Passed :white_check_mark: :warning:Posted for pipeline commit 6b71e4d
+| ✅ 154 tests passed |+
#| ❔ 3 tests were ignored |#
!| ❗ 2 tests had warnings |!
Thanks for that PR! Looks good to me, see comments below.
I think a proper test file could be created with greengenes85 with files
https://github.com/nf-core/ampliseq/blob/1067c7c2c88861635955f32adc7a2682885fe27a/conf/ref_databases.config#L305 and uploaded to https://github.com/nf-core/test-datasets/tree/ampliseq/testdata (maybe into a new folder such as "DB") if small enough and activated in e.g. https://github.com/nf-core/ampliseq/blob/dev/conf/test_reftaxcustom.config which would require an update of https://github.com/nf-core/ampliseq/blob/dev/tests/pipeline/reftaxcustom.nf.test and
I've begun putting some testing in, I guess based on supporting a few input forms it might be good to create a couple of smaller tests like reftaxcustom just for qiime2. Especially as right now reftaxcustom wants to disable downstream analysis after dada and kraken2. I could similarly disable some downstream things for qiime but right now that's controlled by run_qiime2.
In our usecase, this work is a predecessor to adding consensus blast processing in qiime2, which I'll try to get in working order for a PR in time. Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?
I've begun putting some testing in
Yes, separate tests are fine if its not really possible to fit into existing tests.
Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?
run_qiime2
is used for taxonomic classification here and for downstream analysis here. I guess it could make sense to separate those (maybe here) into run_qiime2_downstreamanaylsis
and run_qiime2_taxonomy
and potentially add another one for blast consensus (or keep blast consensus & scikit learn in "taxonomy"). Just do it as intuitive as possible and as easy to maintain as possible (keep checks to a minimum).
[...]
Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?
run_qiime2
is used for taxonomic classification here and for downstream analysis here. I guess it could make sense to separate those (maybe here) intorun_qiime2_downstreamanaylsis
andrun_qiime2_taxonomy
and potentially add another one for blast consensus (or keep blast consensus & scikit learn in "taxonomy"). Just do it as intuitive as possible and as easy to maintain as possible (keep checks to a minimum).
The idea would be that some users might want QIIME's taxonomy but not all the rest? If so, why not keep --skip_qiime
but allow it to be combined with --qiime_taxonomy
instead of the quite long params you suggest?
I've begun putting some testing in
Yes, separate tests are fine if its not really possible to fit into existing tests.
I've put a test with tarball into the existing reftaxcustom case, and added a qiimecustom that tests with a file pair. I think that's a reasonable balance to test different input patterns.
Does it therefore make sense to separate out the logic that sets run_qiime2 for differentiating between the downstream analysis in qiime and the taxonomic alignment in qiime?
run_qiime2
is used for taxonomic classification here and for downstream analysis here. I guess it could make sense to separate those (maybe here) intorun_qiime2_downstreamanaylsis
andrun_qiime2_taxonomy
and potentially add another one for blast consensus (or keep blast consensus & scikit learn in "taxonomy"). Just do it as intuitive as possible and as easy to maintain as possible (keep checks to a minimum).The idea would be that some users might want QIIME's taxonomy but not all the rest? If so, why not keep
--skip_qiime
but allow it to be combined with--qiime_taxonomy
instead of the quite long params you suggest?
I've added a --skip_qiime_downstream
flag and separated out calculation of a run_qiime2
that applies to downstream and a run_qiime2_taxonomy
that applies just to the taxonomy stage, using this is the test cases.
Hi Matthew,
looks great. Did you run your newly added tests and
test
itself and made sure files look fine? If not I'll have a look before I give my ok here.
I ran them manually but also added the new test to the CI pipeline and it looks like it passed. I couldn't figure a good way to snapshot the QIIME taxonomic classification as I guess the algorithm isn't deterministic, so I just checked that the classifier and taxonomy tsv report are produced.
Edit: caught one more assertion that was bad, one of the failing tests (doubleprimers) in this round looks spurious, and the test is succeeding on my own end so hopefully succeeds in this rerun.
Thanks, I have the feeling I should also run a few tests just to make sure, I scheduled some time tomorrow, so I expect to approve the PR then.
I tested, and found:
(1) there is something wrong with phyloseq, I attempted to fix it in https://github.com/nf-core/ampliseq/pull/676
(2) when running nextflow run MatthewJM96/ampliseq -r qiime2_custom_db -profile test_qiimecustom,singularity --outdir result_test_qiimecustom_qiime2_custom_db_23-12-12
I found that result_test_qiimecustom_qiime2_custom_db_23-12-12/summary_report/summary_report.html
didnt contain the section about the taxonomy, https://github.com/nf-core/ampliseq/pull/673 should fix that.
I think the ideal sequence should be:
~(1) After https://github.com/nf-core/ampliseq/pull/676 is merged~
~(2) integrate dev into that PR~
~(3) revert https://github.com/nf-core/ampliseq/pull/667/commits/4464c38cef7be3e9309c3d036fda7172aba130a4~
(4) all should be fine (check if summary_report.html
is fine with -profile test_qiimecustom
) and we merge.
Sorry that you run into that phyloseq bug, I hope it works now.
edit: some points are solved above
edit2: summary_report.html
with -profile test_qiimecustom
does not contain taxonomy section yet. Not sure what preventing it...
I found the problem and fixed the report. When all tests passed I'll merge it if you do not have any objections.
I found the problem and fixed the report. When all tests passed I'll merge it if you do not have any objections.
Sorry for no reply, was on leave! Thanks for looking at those last things and the advice along the way.
Added support for using custom reference databases in QIIME2 taxonomic classification via the
--qiime_ref_tax_custom
flag. This brings QIIME2 taxonomic classification into alignment with Kraken and Dada which allow the same.Testing should probably be added, I could do with some advice on how to make this possible with some reduced database that matches the requirement on what can be passed to the flag (must be a directory or tarball as in the Kraken implementation).
PR checklist
nf-core lint
).nextflow run . -profile test,docker --outdir <OUTDIR>
).CHANGELOG.md
is updated.