Closed · sminot closed this 1 year ago
Base: `master` :x: → `dev`
Hi @sminot,

It looks like this pull-request has been made against the sminot/ampliseq `master` branch. The `master` branch on nf-core repositories should always contain code from the latest release. Because of this, PRs to `master` are only allowed if they come from the sminot/ampliseq `dev` branch.

You do not need to close this PR; you can change the target branch to `dev` by clicking the "Edit" button at the top of this page. Note that even after this, the test will continue to show as failing until you push a new commit.

Thanks again for your contribution!
nf-core lint overall result: Failed :x:

Posted for pipeline commit d963baf

```diff
+| ✅ 143 tests passed |+
#| ❔   3 tests were ignored |#
!| ❗   3 tests had warnings |!
-| ❌   8 tests failed |-
```
Hi there,

thanks for providing a fix! I haven't come across that case yet. However, I have concerns with this PR. With `params.ignore_failed_trimming = true`, it just continues silently, right? That is a little concerning, but probably also fine. So the optimal solution to me seems to be that (1) there is a hint when an error occurs at that step to use a specific parameter, probably mentioning the samples that failed, and (2) when the parameter is used, a warning lists the samples that failed but the pipeline continues. Essentially how https://nf-co.re/ampliseq/2.6.1/parameters/#ignore_empty_input_files & https://nf-co.re/ampliseq/2.6.1/parameters/#ignore_failed_trimming work. My idea would be to create empty output files when none are produced by the filtntrim command, and then filter the outgoing channel with Nextflow's `.branch`. Would that make sense?
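If helpful, the empty-file-plus-`.branch` idea might look roughly like this. This is only a sketch: the channel name `DADA2_FILTNTRIM.out.reads` and the variable names are illustrative assumptions, not the pipeline's actual identifiers.

```nextflow
// Sketch: assume the filtering step touches an empty FASTQ when all
// reads are discarded; .branch then separates empty from non-empty
// outputs so the pipeline can warn and continue instead of crashing.
DADA2_FILTNTRIM.out.reads
    .branch { meta, reads ->
        failed: reads.countFastq() == 0
        passed: true
    }
    .set { ch_filtered }

// Report (but do not stop on) the failed samples
ch_filtered.failed.subscribe { meta, reads ->
    log.warn "No reads passed filtering for sample '${meta.id}'; it is excluded from downstream steps."
}

// Only samples that retained reads continue downstream
ch_filtered.passed.set { ch_reads_for_dada2 }
```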
As a note, this is easily reproducible by running

```bash
nextflow run nf-core/ampliseq -r 2.6.1 -profile test,singularity --outdir results --trunclenf 300 --trunclenr 300
```

where trunclenf/r will cause all reads to be discarded, because it's PE250 data.
I really appreciate your taking a look at this PR, @d4straub , and I must apologize for not putting it together as thoroughly as I should have. The nf-core/ampliseq workflow is really amazingly helpful, and I appreciate all the work that you all have put into it.
I'm trying to come up with a test case that recapitulates my experience, in which 0 reads are output by the `DADA2_FILTNTRIM` process, such that no files are created and the process ends with an error. I'm getting the test suite all staged and running on my local system, and I'll see if I can find the right kind of inputs which sufficiently resemble this particular (admittedly extremely low quality) dataset.
I like your idea of using empty outputs from the process, since that would ensure that the dropped samples are still logged as being missing in the same way that the current skipping behavior will document. Given the utility of this workflow, I think it will be worthwhile to add that particular feature.
I'll work on getting that running using a new test dataset which is appropriate for this edge case. I'll update you when I have something pushed that is ready for your inspection.
Thanks! I was testing a little bit, and actually the solution of https://nf-co.re/ampliseq/2.6.1/parameters/#ignore_empty_input_files & https://nf-co.re/ampliseq/2.6.1/parameters/#ignore_failed_trimming, i.e. having a verbose error, producing an empty file and filtering it with `.branch`, and reporting samples and warnings, went fine until it clashed in `.join`: the sample IDs in `meta.id` didn't match, because some samples were removed! The pipeline then failed in a subsequent step where `meta.id` was required (I don't have my notes with me, so I'm not sure exactly where it was; some sort of merging in R where `cbind` failed because of missing columns, i.e. samples...). At that point I found it got too complicated. So probably that approach is not perfect in that case. But I definitely didn't come up with an elegant solution. Probably you'll do it better; I just wanted to share my failed attempt so that you might benefit from it ;)
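For anyone following along: the clash described above comes from `.join`'s default behavior of matching tuples on their first element (the sample ID here) and silently dropping keys that appear in only one channel. A toy illustration (not pipeline code; channel contents invented):

```nextflow
// 's2' was dropped upstream (e.g. filtered out), so it exists only in ch_a
ch_a = Channel.of( ['s1', 'readsA1'], ['s2', 'readsA2'] )
ch_b = Channel.of( ['s1', 'statsB1'] )

// Default: 's2' vanishes silently, so later steps see fewer samples
ch_a.join(ch_b).view()

// Alternatives: 'remainder: true' keeps 's2' with a null placeholder,
// while 'failOnMismatch: true' raises an error early instead
// ch_a.join(ch_b, remainder: true).view()
```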
This is actually really relevant @d4straub, since I'm trying to debug a dataset which stops at basically the exact point you describe, but which I thought should still pass filtering for at least some samples.
I'm doing more local testing, and I'll see if I can figure this one out and propose a good fix.
Closing in favor of https://github.com/nf-core/ampliseq/pull/641
In my use of this workflow I found an edge case where the filtering step produces 0 reads. In those cases, when `params.ignore_failed_trimming = true`, those samples should be ignored. However, instead I observed that the workflow failed with an error. By adding the `optional` flag to the output, gated on `params.ignore_failed_trimming`, this error will be avoided only in those cases where it satisfies the request of the user.

My apologies for not adding tests, but this is such a small change that I thought it might be acceptable.
PR checklist

- [x] Make sure your code lints (`nf-core lint`) (this failed, but not in any part of the code that I touched)
- [x] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`)
- [ ] `docs/usage.md` is updated
- [ ] `docs/output.md` is updated
- [ ] `CHANGELOG.md` is updated
- [ ] `README.md` is updated (including new tool citations and authors/contributors)