nf-core / raredisease

Call and score variants from WGS/WES of rare disease patients.
MIT License

Refactoring + other issues with cadd and samtools merge #538

Status: Closed (ramprasadn closed this 4 months ago)

ramprasadn commented 5 months ago

PR checklist

github-actions[bot] commented 5 months ago

nf-core lint overall result: Passed :white_check_mark: :warning:

Posted for pipeline commit 9da1d8d

* ✅ 183 tests passed
* ❔ 7 tests were ignored
* ❗ 1 tests had warnings

### :heavy_exclamation_mark: Test warnings:

* [pipeline_todos] - TODO string in `awsfulltest.yml`: _You can customise AWS full pipeline tests as required_

### :grey_question: Tests ignored:

* [files_exist] - File is ignored: `conf/modules.config`
* [files_unchanged] - File ignored due to lint config: `.github/`
* [files_unchanged] - File ignored due to lint config: `.github/`
* [files_unchanged] - File ignored due to lint config: `assets/nf-core-raredisease_logo_light.png`
* [files_unchanged] - File ignored due to lint config: `docs/images/nf-core-raredisease_logo_light.png`
* [files_unchanged] - File ignored due to lint config: `docs/images/nf-core-raredisease_logo_dark.png`
* [modules_config] - modules_config

### :white_check_mark: Tests passed:

* [files_exist] - File found: `.gitattributes`
* [files_exist] - File found: `.gitignore`
* [files_exist] - File found: `.nf-core.yml`
* [files_exist] - File found: `.editorconfig`
* [files_exist] - File found: `.prettierignore`
* [files_exist] - File found: `.prettierrc.yml`
* [files_exist] - File found: ``
* [files_exist] - File found: ``
* [files_exist] - File found: ``
* [files_exist] - File found: `LICENSE` or `` or `LICENCE` or ``
* [files_exist] - File found: `nextflow_schema.json`
* [files_exist] - File found: `nextflow.config`
* [files_exist] - File found: ``
* [files_exist] - File found: `.github/.dockstore.yml`
* [files_exist] - File found: `.github/`
* [files_exist] - File found: `.github/ISSUE_TEMPLATE/bug_report.yml`
* [files_exist] - File found: `.github/ISSUE_TEMPLATE/config.yml`
* [files_exist] - File found: `.github/ISSUE_TEMPLATE/feature_request.yml`
* [files_exist] - File found: `.github/`
* [files_exist] - File found: `.github/workflows/branch.yml`
* [files_exist] - File found: `.github/workflows/ci.yml`
* [files_exist] - File found: `.github/workflows/linting_comment.yml`
* [files_exist] - File found: `.github/workflows/linting.yml`
* [files_exist] - File found: `assets/email_template.html`
* [files_exist] - File found: `assets/email_template.txt`
* [files_exist] - File found: `assets/sendmail_template.txt`
* [files_exist] - File found: `assets/nf-core-raredisease_logo_light.png`
* [files_exist] - File found: `conf/test.config`
* [files_exist] - File found: `conf/test_full.config`
* [files_exist] - File found: `docs/images/nf-core-raredisease_logo_light.png`
* [files_exist] - File found: `docs/images/nf-core-raredisease_logo_dark.png`
* [files_exist] - File found: `docs/`
* [files_exist] - File found: `docs/`
* [files_exist] - File found: `docs/`
* [files_exist] - File found: `docs/`
* [files_exist] - File found: ``
* [files_exist] - File found: `assets/multiqc_config.yml`
* [files_exist] - File found: `conf/base.config`
* [files_exist] - File found: `conf/igenomes.config`
* [files_exist] - File found: `.github/workflows/awstest.yml`
* [files_exist] - File found: `.github/workflows/awsfulltest.yml`
* [files_exist] - File found: `modules.json`
* [files_exist] - File not found check: `.github/ISSUE_TEMPLATE/`
* [files_exist] - File not found check: `.github/ISSUE_TEMPLATE/`
* [files_exist] - File not found check: `.github/workflows/push_dockerhub.yml`
* [files_exist] - File not found check: `.markdownlint.yml`
* [files_exist] - File not found check: `.nf-core.yaml`
* [files_exist] - File not found check: `.yamllint.yml`
* [files_exist] - File not found check: `bin/markdown_to_html.r`
* [files_exist] - File not found check: `conf/aws.config`
* [files_exist] - File not found check: `docs/images/nf-core-raredisease_logo.png`
* [files_exist] - File not found check: `lib/Checks.groovy`
* [files_exist] - File not found check: `lib/Completion.groovy`
* [files_exist] - File not found check: `lib/NfcoreTemplate.groovy`
* [files_exist] - File not found check: `lib/Utils.groovy`
* [files_exist] - File not found check: `lib/Workflow.groovy`
* [files_exist] - File not found check: `lib/WorkflowMain.groovy`
* [files_exist] - File not found check: `lib/WorkflowRaredisease.groovy`
* [files_exist] - File not found check: `parameters.settings.json`
* [files_exist] - File not found check: `pipeline_template.yml`
* [files_exist] - File not found check: `Singularity`
* [files_exist] - File not found check: `lib/nfcore_external_java_deps.jar`
* [files_exist] - File not found check: `.travis.yml`
* [nextflow_config] - Config variable found: ``
* [nextflow_config] - Config variable found: `manifest.nextflowVersion`
* [nextflow_config] - Config variable found: `manifest.description`
* [nextflow_config] - Config variable found: `manifest.version`
* [nextflow_config] - Config variable found: `manifest.homePage`
* [nextflow_config] - Config variable found: `timeline.enabled`
* [nextflow_config] - Config variable found: `trace.enabled`
* [nextflow_config] - Config variable found: `report.enabled`
* [nextflow_config] - Config variable found: `dag.enabled`
* [nextflow_config] - Config variable found: `process.cpus`
* [nextflow_config] - Config variable found: `process.memory`
* [nextflow_config] - Config variable found: `process.time`
* [nextflow_config] - Config variable found: `params.outdir`
* [nextflow_config] - Config variable found: `params.input`
* [nextflow_config] - Config variable found: `params.validationShowHiddenParams`
* [nextflow_config] - Config variable found: `params.validationSchemaIgnoreParams`
* [nextflow_config] - Config variable found: `manifest.mainScript`
* [nextflow_config] - Config variable found: `timeline.file`
* [nextflow_config] - Config variable found: `trace.file`
* [nextflow_config] - Config variable found: `report.file`
* [nextflow_config] - Config variable found: `dag.file`
* [nextflow_config] - Config variable (correctly) not found: `params.nf_required_version`
* [nextflow_config] - Config variable (correctly) not found: `params.container`
* [nextflow_config] - Config variable (correctly) not found: `params.singleEnd`
* [nextflow_config] - Config variable (correctly) not found: `params.igenomesIgnore`
* [nextflow_config] - Config variable (correctly) not found: ``
* [nextflow_config] - Config variable (correctly) not found: `params.enable_conda`
* [nextflow_config] - Config ``timeline.enabled`` had correct value: ``true``
* [nextflow_config] - Config ``report.enabled`` had correct value: ``true``
* [nextflow_config] - Config ``trace.enabled`` had correct value: ``true``
* [nextflow_config] - Config ``dag.enabled`` had correct value: ``true``
* [nextflow_config] - Config ```` began with ``nf-core/``
* [nextflow_config] - Config variable ``manifest.homePage`` began with
* [nextflow_config] - Config ``dag.file`` ended with ``.html``
* [nextflow_config] - Config variable ``manifest.nextflowVersion`` started with >= or !>=
* [nextflow_config] - Config ``manifest.version`` ends in ``dev``: ``2.1.0dev``
* [nextflow_config] - Config `params.custom_config_version` is set to `master`
* [nextflow_config] - Config `params.custom_config_base` is set to ``
* [nextflow_config] - Lines for loading custom profiles found
* [nextflow_config] - nextflow.config contains configuration profile `test`
* [nextflow_config] - Config default value correct: params.bait_padding= 100
* [nextflow_config] - Config default value correct: params.genome= GRCh38
* [nextflow_config] - Config default value correct: params.mito_name= chrM
* [nextflow_config] - Config default value correct: params.analysis_type= wgs
* [nextflow_config] - Config default value correct: params.platform= illumina
* [nextflow_config] - Config default value correct: params.ngsbits_samplegender_method= xy
* [nextflow_config] - Config default value correct: params.skip_vcf2cytosure= true
* [nextflow_config] - Config default value correct: params.aligner= bwamem2
* [nextflow_config] - Config default value correct: params.min_trimmed_length= 40
* [nextflow_config] - Config default value correct: params.mt_subsample_rd= 150
* [nextflow_config] - Config default value correct: params.mt_subsample_seed= 30
* [nextflow_config] - Config default value correct: params.cnvnator_binsize= 1000
* [nextflow_config] - Config default value correct: params.sentieon_dnascope_pcr_indel_model= CONSERVATIVE
* [nextflow_config] - Config default value correct: params.variant_caller= deepvariant
* [nextflow_config] - Config default value correct: params.variant_type= snp,indel
* [nextflow_config] - Config default value correct: params.vep_cache_version= 110
* [nextflow_config] - Config default value correct: params.custom_config_version= master
* [nextflow_config] - Config default value correct: params.custom_config_base=
* [nextflow_config] - Config default value correct: params.max_cpus= 16
* [nextflow_config] - Config default value correct: params.max_memory= 128.GB
* [nextflow_config] - Config default value correct: params.max_time= 240.h
* [nextflow_config] - Config default value correct: params.publish_dir_mode= copy
* [nextflow_config] - Config default value correct: params.max_multiqc_email_size= 25.MB
* [nextflow_config] - Config default value correct: params.validate_params= true
* [nextflow_config] - Config default value correct: params.pipelines_testdata_base_path=
* [files_unchanged] - `.gitattributes` matches the template
* [files_unchanged] - `.prettierrc.yml` matches the template
* [files_unchanged] - `` matches the template
* [files_unchanged] - `LICENSE` matches the template
* [files_unchanged] - `.github/.dockstore.yml` matches the template
* [files_unchanged] - `.github/ISSUE_TEMPLATE/bug_report.yml` matches the template
* [files_unchanged] - `.github/ISSUE_TEMPLATE/config.yml` matches the template
* [files_unchanged] - `.github/ISSUE_TEMPLATE/feature_request.yml` matches the template
* [files_unchanged] - `.github/workflows/branch.yml` matches the template
* [files_unchanged] - `.github/workflows/linting_comment.yml` matches the template
* [files_unchanged] - `.github/workflows/linting.yml` matches the template
* [files_unchanged] - `assets/email_template.html` matches the template
* [files_unchanged] - `assets/email_template.txt` matches the template
* [files_unchanged] - `assets/sendmail_template.txt` matches the template
* [files_unchanged] - `docs/` matches the template
* [files_unchanged] - `.gitignore` matches the template
* [files_unchanged] - `.prettierignore` matches the template
* [actions_ci] - '.github/workflows/ci.yml' is triggered on expected events
* [actions_ci] - '.github/workflows/ci.yml' checks minimum NF version
* [actions_awstest] - '.github/workflows/awstest.yml' is triggered correctly
* [actions_awsfulltest] - `.github/workflows/awsfulltest.yml` is triggered correctly
* [actions_awsfulltest] - `.github/workflows/awsfulltest.yml` does not use `-profile test`
* [readme] - README Nextflow minimum version badge matched config. Badge: `23.04.0`, Config: `23.04.0`
* [readme] - README Zenodo placeholder was replaced with DOI.
* [pipeline_name_conventions] - Name adheres to nf-core convention
* [template_strings] - Did not find any Jinja template strings (643 files)
* [schema_lint] - Schema lint passed
* [schema_lint] - Schema title + description lint passed
* [schema_lint] - Input mimetype lint passed: 'text/csv'
* [schema_params] - Schema matched params returned from nextflow config
* [system_exit] - No `System.exit` calls found
* [actions_schema_validation] - Workflow validation passed: branch.yml
* [actions_schema_validation] - Workflow validation passed: ci.yml
* [actions_schema_validation] - Workflow validation passed: awsfulltest.yml
* [actions_schema_validation] - Workflow validation passed: fix-linting.yml
* [actions_schema_validation] - Workflow validation passed: linting.yml
* [actions_schema_validation] - Workflow validation passed: download_pipeline.yml
* [actions_schema_validation] - Workflow validation passed: release-announcements.yml
* [actions_schema_validation] - Workflow validation passed: clean-up.yml
* [actions_schema_validation] - Workflow validation passed: awstest.yml
* [actions_schema_validation] - Workflow validation passed: linting_comment.yml
* [merge_markers] - No merge markers found in pipeline files
* [modules_json] - Only installed modules found in `modules.json`
* [multiqc_config] - `assets/multiqc_config.yml` found and not ignored.
* [multiqc_config] - `assets/multiqc_config.yml` contains `report_section_order`
* [multiqc_config] - `assets/multiqc_config.yml` contains `export_plots`
* [multiqc_config] - `assets/multiqc_config.yml` contains `report_comment`
* [multiqc_config] - `assets/multiqc_config.yml` follows the ordering scheme of the minimally required plugins.
* [multiqc_config] - `assets/multiqc_config.yml` contains a matching 'report_comment'.
* [multiqc_config] - `assets/multiqc_config.yml` contains 'export_plots: true'.
* [modules_structure] - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
* [base_config] - `conf/base.config` found and not ignored.
* [base_config] - `NFCORE_RAREDISEASE` found in `conf/base.config` and Nextflow scripts.
* [nfcore_yml] - Repository type in `.nf-core.yml` is valid: `pipeline`
* [nfcore_yml] - nf-core version in `.nf-core.yml` is set to the latest version: `2.14.1`

### Run details

* nf-core/tools version 2.14.1
* Run at `2024-05-17 08:44:15`

fa2k commented 5 months ago

The test profile works perfectly for this patch branch on RHEL 9 and Fedora 39 with singularity and local executor (no executor/cluster setting). [I am testing 2.0.1 with a full-scale dataset and will also test the patch branch, if it can complete in time]

Jakob37 commented 5 months ago

Hi! I am trying out this patch PR on the dataset I have been running before (i.e. GIAB on our server).

I ran into errors and am trying to work out whether they are on my side or in the code.

ERROR ~ Cannot get property 'case_id' on null object

 -- Check script '<path>/raredisease/./workflows/' at line: 166 or see '.nextflow.log' file for more details
ERROR ~ No such variable: Exception evaluating property 'gtbi' for nextflow.script.ChannelOut, Reason: groovy.lang.MissingPropertyException: No such property: gtbi for class: groovyx.gpars.dataflow.DataflowBroadcast

 -- Check script '<path>/raredisease/./workflows/../subworkflows/local/' at line: 77 or see '.nextflow.log' file for more details

At a quick look, there does seem to be a mismatch in the Sentieon workflow.

Here is what is asked for from the "parent" processes using the Sentieon SNV calling subprocess.

Here are the emitted arguments. Maybe a typo for gvcf_tbi?

Need to run to a meeting now; I will continue testing (and checking the case error) later today.

jemten commented 5 months ago

> Here are the emitted arguments. Maybe a typo for gvcf_tbi?

Indeed, that looks like a typo. Thanks for testing and reporting.
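
The kind of mismatch discussed above, where a caller asks for an output name that the producer never emits, can be caught by comparing the two sets of names. The Python sketch below is only an illustration and not pipeline code: `missing_outputs` is a made-up helper, and the `emitted` and `requested` lists are hypothetical stand-ins (only `gtbi` and `gvcf_tbi` appear in this thread).

```python
def missing_outputs(requested, emitted):
    """Return the requested output names that the producer does not emit."""
    emitted_names = set(emitted)
    # Preserve the caller's order so the report is easy to read.
    return [name for name in requested if name not in emitted_names]


# Hypothetical names for illustration; only `gtbi` and `gvcf_tbi` come from the thread.
emitted = ["vcf", "tbi", "gvcf", "gvcf_tbi"]
requested = ["vcf", "tbi", "gvcf", "gtbi"]

print(missing_outputs(requested, emitted))  # → ['gtbi']
```

A check like this only reports names that do not line up; deciding which side holds the typo still needs a human look at both files.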

Jakob37 commented 5 months ago

Quick update. I am running into more downstream issues, but I think these are on my side. I will continue working through the GIAB run and raise any issues I find, but I will probably only finish after you are done with this PR.

I would like to run the test data, but we have an offline-only server, and it is not feasible to pull all containers to my local computer.

fa2k commented 4 months ago

I've tested the most recent version of this PR and I do get an error. I opened a separate issue (#542) because I also get the error in pipeline version 2.0.x. The error is this:


Join mismatch for the following entries:
- key=[id:NA12878, sample:NA12878, lane:1, sex:2, phenotype:2, paternal:0, maternal:0, case_id:NA12878, num_lanes:1, read_group:'@RG\tID:NA12878\tPL:illumina\tSM:NA12878', single_end:false, interval:chr13, nr_of_intervals:25] values=[/data0/paalmbj/na12878_med_pipeline/work/83/20f04adedf5dc4bb5ee4b6b85e262c/NA12878_chr13.bam, /data0/paalmbj/na12878_med_pipeline/work/d0/156dd6f94e30b5c404ec292aad5b8c/NA12878_chr13.bam.bai]
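
The message above has the shape of a keyed join that refuses unmatched keys (Nextflow's `join` operator has a `failOnMismatch` option for this). Whether that is the cause here is not established in this thread. As a rough Python model, with hypothetical keys and file names, a strict join pairs items by key and raises when a key appears on only one side:

```python
def strict_join(left, right):
    """Join two lists of (key, value) pairs by key; fail on unmatched keys."""
    left_map = dict(left)
    right_map = dict(right)
    # Keys present on exactly one side are the "mismatches".
    unmatched = set(left_map) ^ set(right_map)
    if unmatched:
        raise ValueError(f"Join mismatch for keys: {sorted(unmatched)}")
    return [(key, left_map[key], right_map[key]) for key in left_map]


# Hypothetical data: keys are (sample, interval) tuples.
bams = [(("NA12878", "chr12"), "a.bam"), (("NA12878", "chr13"), "b.bam")]
calls = [(("NA12878", "chr12"), "a.vcf")]  # nothing was produced for chr13

try:
    strict_join(bams, calls)
except ValueError as err:
    print(err)
```

In this model the `chr13` entry has a BAM but no partner on the other side, so the join fails instead of silently dropping it.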