Open ilight1542 opened 11 months ago
Hi @ilight1542,
It looks like this pull-request has been made against the nf-core/eager `master` branch.
The `master` branch on nf-core repositories should always contain code from the latest release.
Because of this, PRs to `master` are only allowed if they come from the nf-core/eager `dev` branch.
You do not need to close this PR, you can change the target branch to `dev` by clicking the "Edit" button at the top of this page.
Note that even after this, the test will continue to show as failing until you push a new commit.
Thanks again for your contribution!
@ilight1542 maltextract+AMPS works now. However, there are many optional parameters, so I'll do the comprehensive testing on Friday.
All tests except MetaPhlAn have passed today (see file attached).
ToDo for the next testing:
- [ ] optional parameters
- [ ] check the expected output
- [ ] update the manual_tests.md file

I'm positive that we will finish the metagenomics section in the next few weeks :)
We should also consider implementing this enhancement for bam filtering: https://github.com/nf-core/eager/issues/945
nf-core lint
overall result: Passed :white_check_mark: :warning:
Posted for pipeline commit 5414f06
+| ✅ 360 tests passed |+
#| ❔ 1 tests were ignored |#
!| ❗ 22 tests had warnings |!
RE: keeping strandedness. Since the only meta in malt-run is the meta with the list of read files, keeping info on which samples have single-stranded library prep must be done in multiple malt runs (unless we want to slightly rewrite the malt-run module).
Unless we can keep the meta info and then somehow re-merge it with the rma6 files channel that we get from MALT.out.rma6, I think we need to split the rma6 files by strandedness first and then send them into malt.
Possibility for maintaining strandedness info for downstream maltextract (within metagenomics_profiling.nf):

    reads
        .branch {
            doublestranded: it[0].strandedness == 'double'
            singlestranded: it[0].strandedness == 'single'
        }
        .set { strandedness_ch }
The only downstream process that relies on strandedness information is maltextract, so we should branch as late as possible (after MALT) and concat the channels directly afterwards.
Problem: the maltextract module doesn't take a meta map in its input channels. Solution: update the module.
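To make the "branch late, concat afterwards" idea concrete, here is a minimal sketch. It assumes the MALT output could be made to carry a meta map next to each rma6 file, which is exactly the module update proposed above and not the current behaviour. The channel contents, `ch_rma6`, and `ch_by_strand` are hypothetical names for illustration only.

```nextflow
workflow {
    // Hypothetical stand-in for MALT.out.rma6 once each file carries its meta map
    ch_rma6 = Channel.of(
        [ [ id: 'lib1', strandedness: 'double' ], 'lib1.rma6' ],
        [ [ id: 'lib2', strandedness: 'single' ], 'lib2.rma6' ]
    )

    // Branch as late as possible, i.e. only on the MALT output
    ch_rma6
        .branch { meta, rma6 ->
            doublestranded: meta.strandedness == 'double'
            singlestranded: meta.strandedness == 'single'
        }
        .set { ch_by_strand }

    // Each branch would feed its own maltextract call here (not shown);
    // afterwards the two channels are concatenated again
    ch_by_strand.doublestranded
        .concat( ch_by_strand.singlestranded )
        .view { meta, rma6 -> "${meta.id}\t${meta.strandedness}\t${rma6}" }
}
```

Whether two separate maltextract invocations or a single meta-aware module is the better fit depends on how the module ends up being rewritten.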
And finally...
I would bundle all documentation-related comments into a separate issue, so that we can merge the (working) branch into dev and then finish the documentation "on top". That way we avoid going through all the files again and again, and we do not diverge from the dev branch.
Open ToDos from code review (after test profiles)
If I missed anything, please correct me
@jfy133 -- I think it is all set for review once more. A quick update RE: strandedness going into metagenomics screening: with the way bamfiltering is currently done, the per-sample outputs (e.g. mapped R1, R2, singletons, unmapped...) are always concatenated into a single channel and run independently.
A major revision of the I/O parsing from bamfiltering into metagenomics would be required to get it working while also maintaining metadata for PE reads. Merlin and I feel this is more appropriate as a separate PR/extension.
(Not strandedness as in double-/single-stranded libraries, but sequencing mode: paired end vs. single read.)
Currently all channels coming into metagenomics have the `single_end=true` parameter in the meta.
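As a rough illustration of why the pairing information is lost, the sketch below flattens a per-sample list of FASTQ files into independent single-end entries. This is not the pipeline's actual code: the channel name `ch_filtered`, the file names, and the use of `transpose` are assumptions made only to show the effect described above.

```nextflow
workflow {
    // Hypothetical per-sample bamfiltering output: one meta map, several FASTQ files
    ch_filtered = Channel.of(
        [ [ id: 'sample1', single_end: false ],
          [ 'sample1_R1.fastq.gz', 'sample1_R2.fastq.gz', 'sample1_singletons.fastq.gz' ] ]
    )

    ch_filtered
        // Emit one [ meta, fastq ] tuple per file, so each file is run independently
        .transpose()
        // Every entry is then treated as single-end, regardless of its origin
        .map { meta, fastq -> [ meta + [ single_end: true ], fastq ] }
        .view()
}
```

Keeping R1 and R2 together for PE-aware profiling would mean not flattening at this point, which is the larger I/O revision proposed as a separate PR.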
TODOS:
PR checklist
- `scrape_software_versions.py`
- `nf-core lint .`
- `nextflow run . -profile test,docker`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).