Closed elisabethgoldman closed 4 months ago
It looks like the tuple is being treated as a single parameter, as opposed to 5 separate parameters.
We can update the `input` section to handle this with the `tuple` qualifier:
// Define the Mutect2 process
process mutect2 {
...
- input:
- val chrom
- path mutect_idx
- path tumor_bam_sorted
- path normal_bam_sorted
- path pon
+ tuple val(chrom),
+ path(mutect_idx),
+ path(tumor_bam_sorted),
+ path(normal_bam_sorted),
+ path(pon)
Can you try re-running the command with this change (included in the 'Full Script' below)?
```nf
#!/usr/bin/env nextflow
nextflow.enable.dsl=2

// Define the list of chromosomes
chromosomes = (1..22).collect { it.toString() } + ['X']

// Create a channel emitting each chromosome
chrom_channel = Channel.from(chromosomes)

// Define the Mutect2 process
process mutect2 {
    memory '80 G'
    publishDir "${params.outdir}/svc", mode: 'copy'

    input:
    // Deconstruct the tuple into separate inputs
    tuple val(chrom), path(mutect_idx), path(tumor_bam_sorted), path(normal_bam_sorted), path(pon)

    output:
    path("${params.id}_chr${chrom}_unfiltered.vcf")
    path("${params.id}_chr${chrom}_f1r2.tar.gz")
    path("${params.id}_chr${chrom}_unfiltered.vcf.stats")

    script:
    """
    gatk Mutect2 \\
        -R ${mutect_idx} \\
        -I ${tumor_bam_sorted} \\
        -I ${normal_bam_sorted} \\
        --panel-of-normals ${pon} \\
        -normal ${normal_bam_sorted.baseName} \\
        -L ${chrom} \\
        -O ${params.id}_chr${chrom}_unfiltered.vcf \\
        --f1r2-tar-gz ${params.id}_chr${chrom}_f1r2.tar.gz \\
        -stats ${params.id}_chr${chrom}_unfiltered.vcf.stats
    """
}

// Run the process for each chromosome
workflow {
    chrom_channel
        .map { chrom -> tuple(chrom, params.mutect_idx, params.tumor_bam_sorted, params.normal_bam_sorted, params.pon) }
        | mutect2
}
```
Thanks for your help @lbeckman314! That seems to have gotten me past the initial hurdle, but a new one has arisen: `-stats is not a recognized option`.
I removed `-stats` from the Mutect2 script command and reran, but encountered the same error. The full output is below.
(nextflow) [goldmael@exanode-09-8 data_files]$ nextflow run new_test6.nf -params-file params.json
N E X T F L O W ~ version 23.10.1
Launching new_test6.nf [fervent_volta] DSL2 - revision: 121c2d58d7
executor > local (1)
[b9/74de23] process > mutect2 (2) [ 0%] 0 of 23
executor > local (2)
[21/f03435] process > mutect2 (1) [ 4%] 1 of 22, failed: 1
ERROR ~ Error executing process > 'mutect2 (2)'
Caused by:
Process `mutect2 (2)` terminated with an error exit status (1)
Command executed:
gatk Mutect2 \
-R GRCh38.d1.vd1.fa \
-I DNX230201PS_T_H_51768_B1R1_S27_aligned_sorted_markdup_rg_fixed_rg_fixed_sorted.bam \
-I DNX230201PS_G_H_51768_B1R1_S5_aligned_sorted_markdup_rg_fixed_rg_fixed_sorted.bam \
--panel-of-normals gatk4_mutect2_4136_pon.vcf \
-normal DNX230201PS_G_H_51768_B1R1_S5_aligned_sorted_markdup_rg_fixed_rg_fixed_sorted \
-L 2 \
-O H_51768_chr2_unfiltered.vcf \
--f1r2-tar-gz H_51768_chr2_f1r2.tar.gz \
-stats H_51768_chr2_unfiltered.vcf.stats
Command exit status:
1
Command output:
(empty)
Command error:
Valid only if "ReadLengthReadFilter" is specified:
--max-read-length <Integer> Keep only reads with length at most equal to the specified value Default value: 2147483647.
--min-read-length <Integer> Keep only reads with length at least equal to the specified value Default value: 30.
Valid only if "ReadNameReadFilter" is specified:
--read-name <String> Keep only reads with this read name Required.
Valid only if "ReadStrandFilter" is specified:
--keep-reverse-strand-only <Boolean>
Keep only reads on the reverse strand Required. Possible values: {true, false}
Valid only if "ReadTagValueFilter" is specified:
--read-filter-tag <String> Look for this tag in read Required.
--read-filter-tag-comp <Float> Compare value in tag to this value Default value: 0.0.
--read-filter-tag-op <Operator>
Compare value in tag to value with this operator. If T is the value in the tag, OP is the
operation provided, and V is the value in read-filter-tag, then the read will pass the filter iff T OP V is true. Default value: EQUAL. Possible values: {LESS, LESS_OR_EQUAL,
GREATER, GREATER_OR_EQUAL, EQUAL, NOT_EQUAL}
Valid only if "SampleReadFilter" is specified:
--sample <String> The name of the sample(s) to keep, filtering out all others This argument must be
specified at least once. Required.
Valid only if "SoftClippedReadFilter" is specified:
--invert-soft-clip-ratio-filter <Boolean>
Inverts the results from this filter, causing all variants that would pass to fail and
visa-versa. Default value: false. Possible values: {true, false}
--soft-clipped-leading-trailing-ratio <Double>
Threshold ratio of soft clipped bases (leading / trailing the cigar string) to total bases
in read for read to be filtered. Default value: null. Cannot be used in conjunction with
argument(s) minimumSoftClippedRatio
--soft-clipped-ratio-threshold <Double>
Threshold ratio of soft clipped bases (anywhere in the cigar string) to total bases in
read for read to be filtered. Default value: null. Cannot be used in conjunction with
argument(s) minimumLeadingTrailingSoftClippedRatio
***********************************************************************
A USER ERROR has occurred: -stats is not a recognized option
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Work dir:
/home/groups/CEDAR/goldmael/projects/wgs_test_files/data_files/work/b9/74de23995e5b914a1cb6096d33da6d
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
(nextflow) [goldmael@exanode-09-8 data_files]$ gatk -version
Using GATK jar /home/users/goldmael/miniconda3/envs/nextflow/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/users/goldmael/miniconda3/envs/nextflow/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar -version
The Genome Analysis Toolkit (GATK) v4.4.0.0
HTSJDK Version: 3.0.5
Picard Version: 3.0.0
(nextflow) [goldmael@exanode-09-8 data_files]$ conda info
active environment : nextflow
active env location : /home/users/goldmael/miniconda3/envs/nextflow
shell level : 1
user config file : /home/users/goldmael/.condarc
populated config files : /home/users/goldmael/.condarc
conda version : 23.10.0
conda-build version : not installed
python version : 3.11.6.final.0
virtual packages : __archspec=1=broadwell
__glibc=2.17=0
__linux=3.10.0=0
__unix=0=0
base environment : /home/users/goldmael/miniconda3 (writable)
conda av data dir : /home/users/goldmael/miniconda3/etc/conda
conda av metadata url : None
channel URLs : https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
https://conda.anaconda.org/bioconda/linux-64
https://conda.anaconda.org/bioconda/noarch
https://conda.anaconda.org/r/linux-64
https://conda.anaconda.org/r/noarch
package cache : /home/users/goldmael/miniconda3/pkgs
/home/users/goldmael/.conda/pkgs
envs directories : /home/users/goldmael/miniconda3/envs
/home/users/goldmael/.conda/envs
platform : linux-64
user-agent : conda/23.10.0 requests/2.31.0 CPython/3.11.6 Linux/3.10.0-1160.66.1.el7.x86_64 centos/7.9.2009 glibc/2.17 solver/libmamba conda-libmamba-solver/23.11.0 libmambapy/1.5.3
UID:GID : 4733:3010
netrc file : None
offline mode : False
Update:
- Can't get around the same error using either of the attempts detailed below
Tried:
- Removing `-stats` from the Mutect2 call to see if the stats file would be output automatically; produces the same error
- Using `--stats`; same error
I ran into this too! `--stats` is a parameter for FilterMutectCalls only, not Mutect2. I think it was committed to that script in error and should only have been in the FilterMutectCalls step. I remember fixing this error during my run-through; the mutect2.nf script in the main branch is currently fixed.
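For anyone landing here with the same error, here is a minimal sketch of where the stats file fits, assuming GATK 4.x behavior: Mutect2 writes `<output>.stats` next to the VCF named by `-O` on its own, and FilterMutectCalls is the tool that accepts `--stats`. The `filter_mutect_calls` process, the `emit` labels, and the `_filtered.vcf` file name are hypothetical and not taken from the repo; everything else follows the script posted earlier in this thread.

```nf
#!/usr/bin/env nextflow
nextflow.enable.dsl=2

process mutect2 {
    input:
    tuple val(chrom), path(mutect_idx), path(tumor_bam_sorted), path(normal_bam_sorted), path(pon)

    output:
    // The .stats file is declared as an output but never passed as an option:
    // Mutect2 creates it automatically as "<value of -O>.stats".
    tuple val(chrom), path("${params.id}_chr${chrom}_unfiltered.vcf"), path("${params.id}_chr${chrom}_unfiltered.vcf.stats"), emit: calls
    path("${params.id}_chr${chrom}_f1r2.tar.gz"), emit: f1r2

    script:
    // Same command as in the thread, minus the unrecognized -stats option
    """
    gatk Mutect2 \\
        -R ${mutect_idx} \\
        -I ${tumor_bam_sorted} \\
        -I ${normal_bam_sorted} \\
        --panel-of-normals ${pon} \\
        -normal ${normal_bam_sorted.baseName} \\
        -L ${chrom} \\
        -O ${params.id}_chr${chrom}_unfiltered.vcf \\
        --f1r2-tar-gz ${params.id}_chr${chrom}_f1r2.tar.gz
    """
}

// Hypothetical downstream step: this is where --stats belongs
process filter_mutect_calls {
    input:
    tuple val(chrom), path(unfiltered_vcf), path(stats)
    path mutect_idx

    output:
    path("${params.id}_chr${chrom}_filtered.vcf")

    script:
    """
    gatk FilterMutectCalls \\
        -R ${mutect_idx} \\
        -V ${unfiltered_vcf} \\
        --stats ${stats} \\
        -O ${params.id}_chr${chrom}_filtered.vcf
    """
}

workflow {
    chromosomes = (1..22).collect { it.toString() } + ['X']

    // One input tuple per chromosome, as in the original workflow.
    // Assumption: the reference's .fai and .dict indexes can be found by GATK
    // alongside the file given in params.mutect_idx.
    inputs = Channel.fromList(chromosomes).map { chrom ->
        tuple(chrom, file(params.mutect_idx), file(params.tumor_bam_sorted),
              file(params.normal_bam_sorted), file(params.pon))
    }

    mutect2(inputs)
    filter_mutect_calls(mutect2.out.calls, file(params.mutect_idx))
}
```

Keeping the VCF and its stats file in one output tuple means each chromosome's pair travels together, so the filtering step cannot pick up a stats file from a different chromosome.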
Problem:
There is a problem in the way I'm using input and/or output channels to pass multiple chromosomes to Mutect2 for per-chromosome processing; it leads to `Error: Process mutect2 declares 5 input channels but 1 were specified`. I suspect both the channel and workflow setup have errors; there are several ways to set up multiple dynamically-named output files, but none I have tried so far have succeeded. The plan is to merge the 23 files with bcftools and GATK4 commands subsequently (those are largely figured out). See the following for output file options: https://www.nextflow.io/docs/latest/process.html#multiple-output-files
Desired Behavior:
Split Mutect2 processing by chromosome for speed, and thus output the 3 output file types by chromosome.
Desired Output:
Output 23 files for each of three file types (unfiltered.vcf, f1r2.tar.gz, stats):
- sample1_chr1.unfiltered.vcf, ..., sample1_chrX.unfiltered.vcf
- sample1_chr1.f1r2.tar.gz, ...
- sample1_chr1.stats, ...
Nextflow process and subworkflow (pasted together for issue but modularized in repo)
params-file
Error log: