nf-core / eager

A fully reproducible and state-of-the-art ancient DNA analysis pipeline
MIT License

DSL2: metagenomics #1019

Open ilight1542 opened 11 months ago

ilight1542 commented 11 months ago


PR checklist

github-actions[bot] commented 11 months ago

This PR is against the master branch :x:

Hi @ilight1542,

It looks like this pull-request has been made against the nf-core/eager master branch. The master branch on nf-core repositories should always contain code from the latest release. Because of this, PRs to master are only allowed if they come from the nf-core/eager dev branch.

You do not need to close this PR, you can change the target branch to dev by clicking the "Edit" button at the top of this page. Note that even after this, the test will continue to show as failing until you push a new commit.

Thanks again for your contribution!
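
The branch rule described in this comment can be sketched as a small check. This is only an illustration of the stated rule, not the actual code behind the nf-core branch workflow; the function name `pr_allowed` and the example fork and branch names are hypothetical.

```python
def pr_allowed(repo: str, base: str, head_repo: str, head: str) -> bool:
    """Return True if a PR may target `base` under the rule quoted above.

    Assumption: only PRs into `master` are restricted, and they must come
    from the `dev` branch of the same repository.
    """
    if base != "master":
        # PRs against any other branch (e.g. dev) are not restricted here.
        return True
    return head_repo == repo and head == "dev"


# Hypothetical fork and branch names, for illustration only.
release_pr = pr_allowed("nf-core/eager", "master", "nf-core/eager", "dev")
fork_pr_to_master = pr_allowed("nf-core/eager", "master", "ilight1542/eager", "metagenomics")
fork_pr_to_dev = pr_allowed("nf-core/eager", "dev", "ilight1542/eager", "metagenomics")
```

Under these assumptions, retargeting the PR from master to dev is what turns the failing case into a passing one.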

merszym commented 11 months ago

@ilight1542 maltextract+AMPS works now. However, there are many optional parameters, so I'll do the comprehensive testing on Friday.

merszym commented 9 months ago

All tests except MetaPhlAn passed today (see attached file).

ToDo for the next testing:

- [ ] optional parameters
- [ ] check the expected output
- [ ] update the file

I'm positive that we will finish the metagenomics section in the next few weeks :)

ilight1542 commented 9 months ago

We should also consider implementing this enhancement for BAM filtering.

github-actions[bot] commented 8 months ago

nf-core lint overall result: Passed :white_check_mark: :warning:

Posted for pipeline commit 5414f06

+| ✅ 360 tests passed       |+
#| ❔   1 tests were ignored |#
!| ❗  22 tests had warnings |!

### :heavy_exclamation_mark: Test warnings:

* [readme]( - README contains the placeholder `zenodo.XXXXXXX`. This should be replaced with the zenodo doi (after the first release).
* [pipeline_todos]( - TODO string in ``: _Remove this line if you don't need a FASTA file_
* [pipeline_todos]( - TODO string in `nextflow.config`: _Specify your pipeline's command line flags_
* [pipeline_todos]( - TODO string in ``: _Include a figure that guides the user through the major workflow steps. Many nf-core_
* [pipeline_todos]( - TODO string in ``: _Fill in short bullet-pointed list of the default steps in the pipeline_
* [pipeline_todos]( - TODO string in ``: _Optionally add in-text citation tools to this list._
* [pipeline_todos]( - TODO string in ``: _Optionally add bibliographic entries to this list._
* [pipeline_todos]( - TODO string in ``: _Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!_
* [pipeline_todos]( - TODO string in `methods_description_template.yml`: _#Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline_
* [pipeline_todos]( - TODO string in `awsfulltest.yml`: _You can customise AWS full pipeline tests as required_
* [pipeline_todos]( - TODO string in `ci.yml`: _You can customise CI pipeline run tests as required_
* [pipeline_todos]( - TODO string in ``: _Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website._
* [pipeline_todos]( - TODO string in `test.config`: _Specify the paths to your test data on nf-core/test-datasets_
* [pipeline_todos]( - TODO string in `test.config`: _Give any required params for the test so that command line flags are not needed_
* [pipeline_todos]( - TODO string in `test_nothing.config`: _Specify the paths to your test data on nf-core/test-datasets_
* [pipeline_todos]( - TODO string in `test_nothing.config`: _Give any required params for the test so that command line flags are not needed_
* [pipeline_todos]( - TODO string in `base.config`: _Check the defaults for all processes_
* [pipeline_todos]( - TODO string in `base.config`: _Customise requirements for specific processes._
* [pipeline_todos]( - TODO string in `test_full.config`: _Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)_
* [pipeline_todos]( - TODO string in `test_full.config`: _Give any required params for the test so that command line flags are not needed_
* [pipeline_todos]( - TODO string in `test_humanbam.config`: _Specify the paths to your test data on nf-core/test-datasets_
* [pipeline_todos]( - TODO string in `test_humanbam.config`: _Give any required params for the test so that command line flags are not needed_

### :grey_question: Tests ignored:

* [nextflow_config]( - Config default ignored: params.contamination_estimation_angsd_hapmap

### :white_check_mark: Tests passed:

* [files_exist]( - File found: `.gitattributes` * [files_exist]( - File found: `.gitignore` * [files_exist]( - File found: `.nf-core.yml` * [files_exist]( - File found: `.editorconfig` * [files_exist]( - File found: `.prettierignore` * [files_exist]( - File found: `.prettierrc.yml` * [files_exist]( - File found: `` * [files_exist]( - File found: `` * [files_exist]( - File found: `` * [files_exist]( - File found: `LICENSE` or `` or `LICENCE` or `` * [files_exist]( - File found: 
`nextflow_schema.json` * [files_exist]( - File found: `nextflow.config` * [files_exist]( - File found: `` * [files_exist]( - File found: `.github/.dockstore.yml` * [files_exist]( - File found: `.github/` * [files_exist]( - File found: `.github/ISSUE_TEMPLATE/bug_report.yml` * [files_exist]( - File found: `.github/ISSUE_TEMPLATE/config.yml` * [files_exist]( - File found: `.github/ISSUE_TEMPLATE/feature_request.yml` * [files_exist]( - File found: `.github/` * [files_exist]( - File found: `.github/workflows/branch.yml` * [files_exist]( - File found: `.github/workflows/ci.yml` * [files_exist]( - File found: `.github/workflows/linting_comment.yml` * [files_exist]( - File found: `.github/workflows/linting.yml` * [files_exist]( - File found: `assets/email_template.html` * [files_exist]( - File found: `assets/email_template.txt` * [files_exist]( - File found: `assets/sendmail_template.txt` * [files_exist]( - File found: `assets/nf-core-eager_logo_light.png` * [files_exist]( - File found: `conf/modules.config` * [files_exist]( - File found: `conf/test.config` * [files_exist]( - File found: `conf/test_full.config` * [files_exist]( - File found: `docs/images/nf-core-eager_logo_light.png` * [files_exist]( - File found: `docs/images/nf-core-eager_logo_dark.png` * [files_exist]( - File found: `docs/` * [files_exist]( - File found: `docs/` * [files_exist]( - File found: `docs/` * [files_exist]( - File found: `docs/` * [files_exist]( - File found: `` * [files_exist]( - File found: `assets/multiqc_config.yml` * [files_exist]( - File found: `conf/base.config` * [files_exist]( - File found: `conf/igenomes.config` * [files_exist]( - File found: `.github/workflows/awstest.yml` * [files_exist]( - File found: `.github/workflows/awsfulltest.yml` * [files_exist]( - File found: `modules.json` * [files_exist]( - File not found check: `.github/ISSUE_TEMPLATE/` * [files_exist]( - File not found check: `.github/ISSUE_TEMPLATE/` * [files_exist]( - File not found check: 
`.github/workflows/push_dockerhub.yml` * [files_exist]( - File not found check: `.markdownlint.yml` * [files_exist]( - File not found check: `.nf-core.yaml` * [files_exist]( - File not found check: `.yamllint.yml` * [files_exist]( - File not found check: `bin/markdown_to_html.r` * [files_exist]( - File not found check: `conf/aws.config` * [files_exist]( - File not found check: `docs/images/nf-core-eager_logo.png` * [files_exist]( - File not found check: `lib/Checks.groovy` * [files_exist]( - File not found check: `lib/Completion.groovy` * [files_exist]( - File not found check: `lib/NfcoreTemplate.groovy` * [files_exist]( - File not found check: `lib/Utils.groovy` * [files_exist]( - File not found check: `lib/Workflow.groovy` * [files_exist]( - File not found check: `lib/WorkflowMain.groovy` * [files_exist]( - File not found check: `lib/WorkflowEager.groovy` * [files_exist]( - File not found check: `parameters.settings.json` * [files_exist]( - File not found check: `pipeline_template.yml` * [files_exist]( - File not found check: `Singularity` * [files_exist]( - File not found check: `lib/nfcore_external_java_deps.jar` * [files_exist]( - File not found check: `.travis.yml` * [nextflow_config]( - Config variable found: `` * [nextflow_config]( - Config variable found: `manifest.nextflowVersion` * [nextflow_config]( - Config variable found: `manifest.description` * [nextflow_config]( - Config variable found: `manifest.version` * [nextflow_config]( - Config variable found: `manifest.homePage` * [nextflow_config]( - Config variable found: `timeline.enabled` * [nextflow_config]( - Config variable found: `trace.enabled` * [nextflow_config]( - Config variable found: `report.enabled` * [nextflow_config]( - Config variable found: `dag.enabled` * [nextflow_config]( - Config variable found: `process.cpus` * [nextflow_config]( - Config variable found: `process.memory` * [nextflow_config]( - Config variable found: `process.time` * [nextflow_config]( - Config variable found: 
`params.outdir` * [nextflow_config]( - Config variable found: `params.input` * [nextflow_config]( - Config variable found: `params.validationShowHiddenParams` * [nextflow_config]( - Config variable found: `params.validationSchemaIgnoreParams` * [nextflow_config]( - Config variable found: `manifest.mainScript` * [nextflow_config]( - Config variable found: `timeline.file` * [nextflow_config]( - Config variable found: `trace.file` * [nextflow_config]( - Config variable found: `report.file` * [nextflow_config]( - Config variable found: `dag.file` * [nextflow_config]( - Config variable (correctly) not found: `params.nf_required_version` * [nextflow_config]( - Config variable (correctly) not found: `params.container` * [nextflow_config]( - Config variable (correctly) not found: `params.singleEnd` * [nextflow_config]( - Config variable (correctly) not found: `params.igenomesIgnore` * [nextflow_config]( - Config variable (correctly) not found: `` * [nextflow_config]( - Config variable (correctly) not found: `params.enable_conda` * [nextflow_config]( - Config ``timeline.enabled`` had correct value: ``true`` * [nextflow_config]( - Config ``report.enabled`` had correct value: ``true`` * [nextflow_config]( - Config ``trace.enabled`` had correct value: ``true`` * [nextflow_config]( - Config ``dag.enabled`` had correct value: ``true`` * [nextflow_config]( - Config ```` began with ``nf-core/`` * [nextflow_config]( - Config variable ``manifest.homePage`` began with * [nextflow_config]( - Config ``dag.file`` ended with ``.html`` * [nextflow_config]( - Config variable ``manifest.nextflowVersion`` started with >= or !>= * [nextflow_config]( - Config ``manifest.version`` ends in ``dev``: ``3.0.0dev`` * [nextflow_config]( - Config `params.custom_config_version` is set to `master` * [nextflow_config]( - Config `params.custom_config_base` is set to `` * [nextflow_config]( - Lines for loading custom profiles found * [nextflow_config]( - nextflow.config contains configuration profile 
`test` * [nextflow_config]( - Config default value correct: params.igenomes_base= s3://ngi-igenomes/igenomes/ * [nextflow_config]( - Config default value correct: params.custom_config_version= master * [nextflow_config]( - Config default value correct: params.custom_config_base= * [nextflow_config]( - Config default value correct: params.max_cpus= 16 * [nextflow_config]( - Config default value correct: params.max_memory= 128.GB * [nextflow_config]( - Config default value correct: params.max_time= 240.h * [nextflow_config]( - Config default value correct: params.publish_dir_mode= copy * [nextflow_config]( - Config default value correct: params.max_multiqc_email_size= 25.MB * [nextflow_config]( - Config default value correct: params.validate_params= true * [nextflow_config]( - Config default value correct: params.pipelines_testdata_base_path= * [nextflow_config]( - Config default value correct: params.sequencing_qc_tool= fastqc * [nextflow_config]( - Config default value correct: params.preprocessing_tool= fastp * [nextflow_config]( - Config default value correct: params.preprocessing_minlength= 25 * [nextflow_config]( - Config default value correct: params.preprocessing_trim5p= 0 * [nextflow_config]( - Config default value correct: params.preprocessing_trim3p= 0 * [nextflow_config]( - Config default value correct: params.preprocessing_fastp_complexityfilter_threshold= 10 * [nextflow_config]( - Config default value correct: params.preprocessing_adapterremoval_trimbasequalitymin= 20 * [nextflow_config]( - Config default value correct: params.preprocessing_adapterremoval_adapteroverlap= 1 * [nextflow_config]( - Config default value correct: params.preprocessing_adapterremoval_qualitymax= 41 * [nextflow_config]( - Config default value correct: params.fastq_shard_size= 1000000 * [nextflow_config]( - Config default value correct: params.mapping_tool= bwaaln * [nextflow_config]( - Config default value correct: params.mapping_bwaaln_n= 0.01 * [nextflow_config]( - Config 
default value correct: params.mapping_bwaaln_k= 2 * [nextflow_config]( - Config default value correct: params.mapping_bwaaln_l= 1024 * [nextflow_config]( - Config default value correct: params.mapping_bwaaln_o= 2 * [nextflow_config]( - Config default value correct: params.mapping_bwamem_k= 19 * [nextflow_config]( - Config default value correct: params.mapping_bwamem_r= 1.5 * [nextflow_config]( - Config default value correct: params.mapping_bowtie2_alignmode= local * [nextflow_config]( - Config default value correct: params.mapping_bowtie2_sensitivity= sensitive * [nextflow_config]( - Config default value correct: params.mapping_bowtie2_n= 0 * [nextflow_config]( - Config default value correct: params.mapping_bowtie2_l= 20 * [nextflow_config]( - Config default value correct: params.mapping_bowtie2_trim5= 0 * [nextflow_config]( - Config default value correct: params.mapping_bowtie2_trim3= 0 * [nextflow_config]( - Config default value correct: params.mapping_bowtie2_maxins= 500 * [nextflow_config]( - Config default value correct: params.bamfiltering_minreadlength= 0 * [nextflow_config]( - Config default value correct: params.bamfiltering_mappingquality= 0 * [nextflow_config]( - Config default value correct: params.bamfilter_genomicbamfilterflag= 4 * [nextflow_config]( - Config default value correct: params.metagenomics_input= unmapped * [nextflow_config]( - Config default value correct: params.metagenomics_complexity_tool= bbduk * [nextflow_config]( - Config default value correct: params.metagenomics_complexity_entropy= 0.3 * [nextflow_config]( - Config default value correct: params.metagenomics_prinseq_mode= entropy * [nextflow_config]( - Config default value correct: params.metagenomics_prinseq_dustscore= 0.5 * [nextflow_config]( - Config default value correct: params.metagenomics_krakenuniq_ramchunksize= 16G * [nextflow_config]( - Config default value correct: params.metagenomics_malt_mode= BlastN * [nextflow_config]( - Config default value correct: 
params.metagenomics_malt_alignmentmode= SemiGlobal * [nextflow_config]( - Config default value correct: params.metagenomics_malt_minpercentidentity= 85 * [nextflow_config]( - Config default value correct: params.metagenomics_malt_toppercent= 1 * [nextflow_config]( - Config default value correct: params.metagenomics_malt_minsupportmode= percent * [nextflow_config]( - Config default value correct: params.metagenomics_malt_minsupportpercent= 0.01 * [nextflow_config]( - Config default value correct: params.metagenomics_minsupportreads= 1 * [nextflow_config]( - Config default value correct: params.metagenomics_malt_maxqueries= 100 * [nextflow_config]( - Config default value correct: params.metagenomics_malt_memorymode= load * [nextflow_config]( - Config default value correct: params.metagenomics_malt_group_size= 0 * [nextflow_config]( - Config default value correct: params.metagenomics_maltextract_filter= def_anc * [nextflow_config]( - Config default value correct: params.metagenomics_maltextract_toppercent= 0.01 * [nextflow_config]( - Config default value correct: params.metagenomics_maltextract_minpercentidentity= 85.0 * [nextflow_config]( - Config default value correct: params.deduplication_tool= markduplicates * [nextflow_config]( - Config default value correct: params.damage_manipulation_rescale_seqlength= 12 * [nextflow_config]( - Config default value correct: params.damage_manipulation_rescale_length_5p= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_rescale_length_3p= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_pmdtools_threshold= 3 * [nextflow_config]( - Config default value correct: params.damage_manipulation_bamutils_trim_double_stranded_none_udg_left= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_bamutils_trim_double_stranded_none_udg_right= 0 * [nextflow_config]( - Config default value correct: 
params.damage_manipulation_bamutils_trim_double_stranded_half_udg_left= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_bamutils_trim_double_stranded_half_udg_right= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_none_udg_left= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_none_udg_right= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_half_udg_left= 0 * [nextflow_config]( - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_half_udg_right= 0 * [nextflow_config]( - Config default value correct: params.genotyping_reference_ploidy= 2 * [nextflow_config]( - Config default value correct: params.genotyping_pileupcaller_min_base_quality= 30 * [nextflow_config]( - Config default value correct: params.genotyping_pileupcaller_min_map_quality= 30 * [nextflow_config]( - Config default value correct: params.genotyping_pileupcaller_method= randomHaploid * [nextflow_config]( - Config default value correct: params.genotyping_pileupcaller_transitions_mode= AllSites * [nextflow_config]( - Config default value correct: params.genotyping_gatk_call_conf= 30 * [nextflow_config]( - Config default value correct: params.genotyping_gatk_ug_downsample= 250 * [nextflow_config]( - Config default value correct: params.genotyping_gatk_ug_out_mode= EMIT_VARIANTS_ONLY * [nextflow_config]( - Config default value correct: params.genotyping_gatk_ug_genotype_mode= SNP * [nextflow_config]( - Config default value correct: params.genotyping_gatk_ug_defaultbasequalities= -1 * [nextflow_config]( - Config default value correct: params.genotyping_gatk_hc_out_mode= EMIT_VARIANTS_ONLY * [nextflow_config]( - Config default value correct: params.genotyping_gatk_hc_emitrefconf= GVCF * [nextflow_config]( - Config default value correct: 
params.genotyping_freebayes_min_alternate_count= 1 * [nextflow_config]( - Config default value correct: params.genotyping_freebayes_skip_coverage= 0 * [nextflow_config]( - Config default value correct: params.genotyping_angsd_glmodel= samtools * [nextflow_config]( - Config default value correct: params.genotyping_angsd_glformat= binary * [nextflow_config]( - Config default value correct: params.mitochondrion_header= MT * [nextflow_config]( - Config default value correct: params.mapstats_preseq_mode= c_curve * [nextflow_config]( - Config default value correct: params.mapstats_preseq_stepsize= 1000 * [nextflow_config]( - Config default value correct: params.mapstats_preseq_terms= 100 * [nextflow_config]( - Config default value correct: params.mapstats_preseq_maxextrap= 10000000000 * [nextflow_config]( - Config default value correct: params.mapstats_preseq_bootstrap= 100 * [nextflow_config]( - Config default value correct: params.mapstats_preseq_cval= 0.95 * [nextflow_config]( - Config default value correct: params.damagecalculation_tool= damageprofiler * [nextflow_config]( - Config default value correct: params.damagecalculation_yaxis= 0.3 * [nextflow_config]( - Config default value correct: params.damagecalculation_xaxis= 25 * [nextflow_config]( - Config default value correct: params.damagecalculation_damageprofiler_length= 100 * [nextflow_config]( - Config default value correct: params.damagecalculation_mapdamage_downsample= 0 * [nextflow_config]( - Config default value correct: params.host_removal_mode= remove * [nextflow_config]( - Config default value correct: params.contamination_estimation_angsd_chrom_name= X * [nextflow_config]( - Config default value correct: params.contamination_estimation_angsd_range_from= 5000000 * [nextflow_config]( - Config default value correct: params.contamination_estimation_angsd_range_to= 154900000 * [nextflow_config]( - Config default value correct: params.contamination_estimation_angsd_mapq= 30 * [nextflow_config]( - Config 
default value correct: params.contamination_estimation_angsd_minq= 30 * [files_unchanged]( - `.gitattributes` matches the template * [files_unchanged]( - `.prettierrc.yml` matches the template * [files_unchanged]( - `` matches the template * [files_unchanged]( - `LICENSE` matches the template * [files_unchanged]( - `.github/.dockstore.yml` matches the template * [files_unchanged]( - `.github/` matches the template * [files_unchanged]( - `.github/ISSUE_TEMPLATE/bug_report.yml` matches the template * [files_unchanged]( - `.github/ISSUE_TEMPLATE/config.yml` matches the template * [files_unchanged]( - `.github/ISSUE_TEMPLATE/feature_request.yml` matches the template * [files_unchanged]( - `.github/` matches the template * [files_unchanged]( - `.github/workflows/branch.yml` matches the template * [files_unchanged]( - `.github/workflows/linting_comment.yml` matches the template * [files_unchanged]( - `.github/workflows/linting.yml` matches the template * [files_unchanged]( - `assets/email_template.html` matches the template * [files_unchanged]( - `assets/email_template.txt` matches the template * [files_unchanged]( - `assets/sendmail_template.txt` matches the template * [files_unchanged]( - `assets/nf-core-eager_logo_light.png` matches the template * [files_unchanged]( - `docs/images/nf-core-eager_logo_light.png` matches the template * [files_unchanged]( - `docs/images/nf-core-eager_logo_dark.png` matches the template * [files_unchanged]( - `docs/` matches the template * [files_unchanged]( - `.gitignore` matches the template * [files_unchanged]( - `.prettierignore` matches the template * [actions_ci]( - '.github/workflows/ci.yml' is triggered on expected events * [actions_ci]( - '.github/workflows/ci.yml' checks minimum NF version * [actions_awstest]( - '.github/workflows/awstest.yml' is triggered correctly * [actions_awsfulltest]( - `.github/workflows/awsfulltest.yml` is triggered correctly * [actions_awsfulltest]( - `.github/workflows/awsfulltest.yml` does not use 
`-profile test` * [readme]( - README Nextflow minimum version badge matched config. Badge: `23.04.0`, Config: `23.04.0` * [pipeline_name_conventions]( - Name adheres to nf-core convention * [template_strings]( - Did not find any Jinja template strings (335 files) * [schema_lint]( - Schema lint passed * [schema_lint]( - Schema title + description lint passed * [schema_lint]( - Input mimetype lint passed: 'text/csv' * [schema_params]( - Schema matched params returned from nextflow config * [system_exit]( - No `System.exit` calls found * [actions_schema_validation]( - Workflow validation passed: awsfulltest.yml * [actions_schema_validation]( - Workflow validation passed: fix-linting.yml * [actions_schema_validation]( - Workflow validation passed: branch.yml * [actions_schema_validation]( - Workflow validation passed: linting_comment.yml * [actions_schema_validation]( - Workflow validation passed: awstest.yml * [actions_schema_validation]( - Workflow validation passed: linting.yml * [actions_schema_validation]( - Workflow validation passed: clean-up.yml * [actions_schema_validation]( - Workflow validation passed: release-announcements.yml * [actions_schema_validation]( - Workflow validation passed: download_pipeline.yml * [actions_schema_validation]( - Workflow validation passed: ci.yml * [merge_markers]( - No merge markers found in pipeline files * [modules_json]( - Only installed modules found in `modules.json` * [multiqc_config]( - `assets/multiqc_config.yml` found and not ignored. * [multiqc_config]( - `assets/multiqc_config.yml` contains `report_section_order` * [multiqc_config]( - `assets/multiqc_config.yml` contains `export_plots` * [multiqc_config]( - `assets/multiqc_config.yml` contains `report_comment` * [multiqc_config]( - `assets/multiqc_config.yml` follows the ordering scheme of the minimally required plugins. * [multiqc_config]( - `assets/multiqc_config.yml` contains a matching 'report_comment'. 
* [multiqc_config]( - `assets/multiqc_config.yml` contains 'export_plots: true'. * [modules_structure]( - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL' * [base_config]( - `conf/base.config` found and not ignored. * [modules_config]( - `conf/modules.config` found and not ignored. * [modules_config]( - `SAMTOOLS_CONVERT_BAM_INPUT` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `CAT_FASTQ_CONVERTED_BAM` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FASTQC` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FASTQC_PROCESSED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `MULTIQC` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FALCO` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FALCO_PROCESSED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FASTP_SINGLE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FASTP_PAIRED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `ADAPTERREMOVAL_SINGLE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `ADAPTERREMOVAL_PAIRED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `CAT_FASTQ_ADAPTERREMOVAL` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `GUNZIP_FASTA` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `GUNZIP_PMDFASTA` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FAIDX` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `PICARD_CREATESEQUENCEDICTIONARY` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BOWTIE2_BUILD` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BWA_INDEX` found in `conf/modules.config` and Nextflow scripts. 
* [modules_config]( - `SAMTOOLS_FLAGSTATS_BAM_INPUT` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_INDEX_BAM_INPUT` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `CAT_FASTQ_UNMAPPED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FILTER_BAM_FRAGMENT_LENGTH` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FASTQ_UNMAPPED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_VIEW_BAM_FILTERING` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_LENGTH_FILTER_INDEX` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FASTQ_MAPPED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FLAGSTAT_FILTERED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SEQKIT_SPLIT2` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BWA_ALN` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BWA_SAMSE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `ENDORSPY` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BWA_MEM` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BOWTIE2_ALIGN` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_INDEX_MEM` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_MERGE_LANES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_SORT_MERGED_LANES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_INDEX_MERGED_LANES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FLAGSTAT_MAPPED` found in `conf/modules.config` and Nextflow scripts. 
* [modules_config]( - `PICARD_MARKDUPLICATES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `DEDUP` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_MERGE_DEDUPPED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_SORT_DEDUPPED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_INDEX_DEDUPPED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FLAGSTAT_DEDUPPED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `HOST_REMOVAL` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `PRESEQ_CCURVE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `PRESEQ_LCEXTRAP` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_VIEW_GENOME` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BEDTOOLS_COVERAGE_DEPTH` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BEDTOOLS_COVERAGE_BREADTH` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BEDTOOLS_MASKFASTA` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `MAPDAMAGE2` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_INDEX_DAMAGE_RESCALED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `PMDTOOLS_FILTER` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_INDEX_DAMAGE_FILTERED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FLAGSTAT_DAMAGE_FILTERED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BAMUTIL_TRIMBAM` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_INDEX_DAMAGE_TRIMMED` found in `conf/modules.config` and Nextflow scripts. 
* [modules_config]( - `ANGSD_DOCOUNTS` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `ANGSD_CONTAMINATION` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `PRINT_CONTAMINATION_ANGSD` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `MTNUCRATIO` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `PRINSEQPLUSPLUS` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `MALT_RUN` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `CAT_CAT_MALT` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `KRAKEN2_KRAKEN2` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `METAPHLAN_METAPHLAN` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `MALTEXTRACT` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `MEGAN_RMA2INFO` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `AMPS` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `TAXPASTA_MERGE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `TAXPASTA_STANDARDISE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `QUALIMAP_BAMQC_WITHBED` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `DAMAGEPROFILER` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `CALCULATE_MAPDAMAGE2` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_DEPTH_SEXDETERRMINE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SEXDETERRMINE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_MERGE_LIBRARIES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_SORT_MERGED_LIBRARIES` found in `conf/modules.config` and Nextflow scripts. 
* [modules_config]( - `SAMTOOLS_INDEX_MERGED_LIBRARIES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_FLAGSTAT_MERGED_LIBRARIES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SAMTOOLS_MPILEUP_PILEUPCALLER` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `SEQUENCETOOLS_PILEUPCALLER` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `COLLECT_GENOTYPES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `EIGENSTRATDATABASETOOLS_EIGENSTRATSNPCOVERAGE` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `GATK_REALIGNERTARGETCREATOR` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `GATK_INDELREALIGNER` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `GATK_UNIFIEDGENOTYPER` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BCFTOOLS_INDEX_UG` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `GATK4_HAPLOTYPECALLER` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `FREEBAYES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BCFTOOLS_INDEX_FREEBAYES` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `BCFTOOLS_STATS_GENOTYPING` found in `conf/modules.config` and Nextflow scripts. * [modules_config]( - `ANGSD_GL` found in `conf/modules.config` and Nextflow scripts. * [nfcore_yml]( - Repository type in `.nf-core.yml` is valid: `pipeline` * [nfcore_yml]( - nf-core version in `.nf-core.yml` is set to the latest version: `2.14.1` ### Run details * nf-core/tools version 2.14.1 * Run at `2024-07-05 09:57:14`
ilight1542 commented 4 months ago

RE: keeping strandedness. Since the only meta in malt-run is the meta with the list of read files, keeping info on which samples have single-stranded library prep must be done in multiple malt runs (unless we want to slightly rewrite the malt-run module).

Unless we can keep the meta info and then somehow remerge it with the various rma6 files channel that we get from MALT.out.rma6, I think we need to split the rma6 files by strandedness first and then send them into malt

Possibility for maintaining strandedness info for downstream maltextract:

(within reads) `.branch { doublestranded: it[0].strandedness == 'double' singlestranded: it[0].strandedness == 'single' }.set { strandedness_ch }`

merszym commented 3 months ago

> RE: keeping strandedness. Since the only meta in malt-run is the meta with the list of read files, keeping info on which samples have single-stranded library prep must be done in multiple malt runs (unless we want to slightly rewrite the malt-run module).
>
> Unless we can keep the meta info and then somehow remerge it with the various rma6 files channel that we get from MALT.out.rma6, I think we need to split the rma6 files by strandedness first and then send them into malt
>
> Possibility for maintaining strandedness info for downstream maltextract:
>
> (within reads) `.branch { doublestranded: it[0].strandedness == 'double' singlestranded: it[0].strandedness == 'single' }.set { strandedness_ch }`

The only downstream process that relies on strandedness information is maltextract, so we should branch as late as possible (after MALT) and concat the channels directly afterwards.

Problem: The maltextract module doesn't take a meta map in its input channels... Solution: Update the module.
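
A minimal sketch of the "branch late, then concat" idea, not the pipeline's actual code. It assumes each rma6 file has already been re-paired with a meta map carrying a `strandedness` key; the channel names, sample IDs and file names are hypothetical, and plain `map` calls stand in for the two strand-specific maltextract invocations.

```nextflow
workflow {
    // Assumption: rma6 files are already paired with their meta map (made-up values).
    ch_rma6 = Channel.of(
        [ [ id: 'libA', strandedness: 'double' ], file('libA.rma6') ],
        [ [ id: 'libB', strandedness: 'single' ], file('libB.rma6') ]
    )

    // Branch as late as possible, i.e. only once MALT has finished.
    ch_rma6
        .branch { meta, rma6 ->
            doublestranded: meta.strandedness == 'double'
            singlestranded: meta.strandedness == 'single'
        }
        .set { ch_by_strand }

    // Stand-ins for the strand-specific maltextract calls (no real module is invoked here).
    ch_double_done = ch_by_strand.doublestranded.map { meta, rma6 -> [ meta, rma6, 'double-stranded settings' ] }
    ch_single_done = ch_by_strand.singlestranded.map { meta, rma6 -> [ meta, rma6, 'single-stranded settings' ] }

    // Concatenate the two branches again directly afterwards.
    ch_double_done
        .concat(ch_single_done)
        .view()
}
```

This only works if the meta map survives up to the branching point, which is why the module would need to accept a meta map in its inputs.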

merszym commented 3 months ago

> > RE: keeping strandedness. Since the only meta in malt-run is the meta with the list of read files, keeping info on which samples have single-stranded library prep must be done in multiple malt runs (unless we want to slightly rewrite the malt-run module).
> >
> > Unless we can keep the meta info and then somehow remerge it with the various rma6 files channel that we get from MALT.out.rma6, I think we need to split the rma6 files by strandedness first and then send them into malt
> >
> > Possibility for maintaining strandedness info for downstream maltextract:
> >
> > (within reads) `.branch { doublestranded: it[0].strandedness == 'double' singlestranded: it[0].strandedness == 'single' }.set { strandedness_ch }`
>
> The only downstream process that relies on strandedness information is maltextract, so we should branch as late as possible (after MALT) and concat the channels directly afterwards.
>
> Problem: The maltextract module doesn't take a meta map in its input channels... Solution: Update the module.

And finally...

merszym commented 3 months ago

I would bundle all documentation-related comments into a separate issue, so that we can merge the (working) branch into dev and then finish the documentation "on top". That way we don't have to go through all the files again and again, and we avoid diverging from the dev branch.

merszym commented 2 months ago

Open ToDos from code review (after test profiles)

If I missed anything, please correct me

ilight1542 commented 2 weeks ago

@jfy133 -- I think it is all set for review once more. A quick update RE: strandedness going into metagenomics screening: with the way bamfiltering is currently done, the per-sample outputs (e.g. mapped R1, R2, singletons, unmapped...) are always concatenated into a single channel and run independently.

A major revision of how I/O is parsed from bamfiltering into metagenomics would be required to get it working while also maintaining metadata for PE reads. Merlin and I feel this is more appropriate as a separate PR/extension.

merszym commented 2 weeks ago

> @jfy133 -- I think it is all set for review once more. A quick update RE: strandedness going into metagenomics screening: with the way bamfiltering is currently done, the per-sample outputs (e.g. mapped R1, R2, singletons, unmapped...) are always concatenated into a single channel and run independently.
>
> A major revision of how I/O is parsed from bamfiltering into metagenomics would be required to get it working while also maintaining metadata for PE reads. Merlin and I feel this is more appropriate as a separate PR/extension.

(Here "strandedness" does not mean double- vs. single-stranded libraries, but the sequencing mode: paired end vs. single read.)

Currently all channels coming into metagenomics have the `single_end=true` parameter in the meta.
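
To illustrate what that means in practice, here is a small hypothetical sketch (the channel name, the `id` key and the file names are made up, and this is not taken from the pipeline code). Every bamfiltering output arrives as its own item with `single_end: true`, so branching on the sequencing mode never routes anything to a paired-end branch.

```nextflow
workflow {
    // Hypothetical items: one library whose R1, R2 and singleton outputs arrive separately.
    ch_metagenomics_input = Channel.of(
        [ [ id: 'libA', single_end: true ], file('libA_R1.fastq.gz') ],
        [ [ id: 'libA', single_end: true ], file('libA_R2.fastq.gz') ],
        [ [ id: 'libA', single_end: true ], file('libA_singletons.fastq.gz') ]
    )

    ch_metagenomics_input
        .branch { meta, reads ->
            single: meta.single_end
            paired: !meta.single_end
        }
        .set { ch_by_mode }

    // With single_end always true, only the 'single' branch ever emits items.
    ch_by_mode.single.view { it -> "single-end: ${it[1].name}" }
    ch_by_mode.paired.view { it -> "paired-end: ${it[1].name}" }
}
```

Keeping R1 and R2 together as a pair would require changing how these items are built upstream, which is the larger revision proposed as a separate PR.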