nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

Mutect2 error with Sarek 3.4.0 #1369

Open · mimifp opened this issue 8 months ago

mimifp commented 8 months ago

Description of the bug

After 5 h 56 min of runtime, the Sarek pipeline failed with an error saying it could not find standard Linux commands such as sed and cat. I checked that those commands are installed on the HPC where I am launching the tool, and they are. Mutect2 via Sarek also worked for another sample using the same command line.

This same sample had previously failed with a different error (a problem writing a file; nothing about memory). I tried to solve that by increasing the RAM, and now I get the error described here.
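For reference, one way to raise memory for a single Sarek process is a small custom config passed to Nextflow with `-c`; the selector and value below are assumptions (just a sketch, not necessarily exactly what I changed):

```bash
# Sketch: bump memory for the pileup-summaries step via a custom Nextflow
# config; the process selector and the value are assumptions.
cat > custom_mem.config <<'EOF'
process {
    withName: '.*GETPILEUPSUMMARIES.*' {
        memory = 40.GB
    }
}
EOF
# Then add `-c custom_mem.config` to the `nextflow run` command below.
```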

Any ideas?

Command used and terminal output

Command:

```bash
#!/bin/bash
#SBATCH -c 16
#SBATCH -t 08:00:00
#SBATCH --mem=40G
#SBATCH --mail-type=begin,end,fail
#SBATCH --mail-user=xxx@xxx.xxx

# $1 sample ID [e.g. P01M2]
# sbatch -J P01M2_mutect2 launch_mutect2.sh P01M2

module load cesga/2020 nextflow/23.04.2
# Nextflow reads these settings from the environment, so they must be exported
export NXF_SINGULARITY_CACHEDIR=/mnt/lustre/scratch/nlsas/home/usc/mg/translational_oncology/5_tmp/singularity_cache
export NXF_WORK=/mnt/lustre/scratch/nlsas/home/usc/mg/translational_oncology/5_tmp/nextflow_cache

outdir=/mnt/lustre/scratch/nlsas/home/usc/mg/translational_oncology/2_projects/7_ENDEVO/noalt
samplesheet=/mnt/lustre/scratch/nlsas/home/usc/mg/translational_oncology/2_projects/7_ENDEVO/2_data/samplesheets/samplesheet_$1.csv
reference=/mnt/lustre/scratch/nlsas/home/usc/mg/translational_oncology/0_reference

nextflow run nf-core/sarek -r 3.4.0 \
    -profile singularity \
    --step variant_calling \
    --max_cpus 16 \
    --wes \
    --input $samplesheet \
    --fasta ${reference}/2_GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
    --fasta_fai ${reference}/2_GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fai \
    --dict ${reference}/2_GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.dict \
    --intervals ${reference}/2_GRCh38/SureSelect_v6cosmic_macrogen_short.bed \
    --tools mutect2 \
    --known_snps ${reference}/4_others/dbsnp_146.hg38.vcf.gz \
    --known_snps_tbi ${reference}/4_others/dbsnp_146.hg38.vcf.gz.tbi \
    --outdir $outdir \
    --email xxx@xxx.xxx
```

Output:
```bash
nf-core/sarek execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_SOMATIC_MUTECT2:GETPILEUPSUMMARIES_TUMOR (P01B2R1)'

Caused by:
  Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_SOMATIC_MUTECT2:GETPILEUPSUMMARIES_TUMOR (P01B2R1)` terminated with an error exit status (1)

Command executed:

  gatk --java-options "-Xmx9830M -XX:-UsePerfData" \
      GetPileupSummaries \
      --input P01B2R1.converted.cram \
      --variant af-only-gnomad.hg38.vcf.gz \
      --output P01B2R1.mutect2.pileups.table \
      --reference GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
      --intervals chr1_12081-12251.bed \
      --tmp-dir . \

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_SOMATIC_MUTECT2:GETPILEUPSUMMARIES_TUMOR":
      gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
  END_VERSIONS

Command exit status:
  1

Command output:
  [0.001s][warning][os,container] Duplicate cpuset controllers detected. Picking /sys/fs/cgroup/cpuset, skipping /mnt/netapp1/Optcesga_FT2_RHEL7/2020/gentoo/22072020/var/singularity/mnt/session/final/sys/fs/cgroup/cpuset.
  Tool returned:
  SUCCESS

Command error:
  23:50:48.892 INFO  ProgressMeter -       chr18:54941098             26.4              53954000        2040779.8
  23:50:58.903 INFO  ProgressMeter -       chr18:79710373             26.6              54207000        2037490.8
  23:51:08.907 INFO  ProgressMeter -        chr19:4267861             26.8              54608000        2039779.8
  23:51:18.965 INFO  ProgressMeter -       chr19:10004118             26.9              55099000        2045313.2
  23:51:29.020 INFO  ProgressMeter -       chr19:14941611             27.1              55535000        2048754.2
  23:51:39.076 INFO  ProgressMeter -       chr19:20646443             27.3              55979000        2052442.5
  23:51:49.090 INFO  ProgressMeter -       chr19:34445005             27.4              56148000        2046117.9
  23:51:59.093 INFO  ProgressMeter -       chr19:40577025             27.6              56661000        2052343.6
  23:52:09.133 INFO  ProgressMeter -       chr19:46675279             27.8              57129000        2056828.7
  23:52:19.173 INFO  ProgressMeter -       chr19:52034473             27.9              57650000        2063156.8
  23:52:29.178 INFO  ProgressMeter -       chr19:57491540             28.1              58072000        2065931.8
  23:52:39.226 INFO  ProgressMeter -       chr20:11923047             28.3              58472000        2067841.2
  23:52:49.268 INFO  ProgressMeter -       chr20:33367038             28.4              58795000        2067029.5
  23:52:59.273 INFO  ProgressMeter -       chr20:45802029             28.6              59229000        2070151.5
  23:53:09.288 INFO  ProgressMeter -       chr20:62546277             28.8              59632000        2072148.0
  23:53:19.300 INFO  ProgressMeter -       chr21:29555308             28.9              59893000        2069219.3
  23:53:29.322 INFO  ProgressMeter -       chr21:43985141             29.1              60248000        2069541.2
  23:53:39.325 INFO  ProgressMeter -       chr22:20564354             29.3              60597000        2069676.9
  23:53:49.335 INFO  ProgressMeter -       chr22:31262697             29.4              61059000        2073640.5
  23:53:59.345 INFO  ProgressMeter -       chr22:41758344             29.6              61499000        2076816.4
  23:54:09.359 INFO  ProgressMeter -         chrX:3320041             29.8              61892000        2078373.9
  23:54:19.384 INFO  ProgressMeter -        chrX:24060153             29.9              62192000        2076795.7
  23:54:29.416 INFO  ProgressMeter -        chrX:44969044             30.1              62474000        2074629.2
  23:54:39.466 INFO  ProgressMeter -        chrX:53984790             30.3              62903000        2077320.6
  23:54:49.498 INFO  ProgressMeter -        chrX:77584251             30.4              63286000        2078492.2
  23:54:59.500 INFO  ProgressMeter -       chrX:103077681             30.6              63591000        2077137.2
  23:55:09.503 INFO  ProgressMeter -       chrX:123888804             30.8              63909000        2076218.0
  23:55:19.507 INFO  ProgressMeter -       chrX:150771813             30.9              64212000        2074822.9
  23:55:25.217 INFO  GetPileupSummaries - 0 read(s) filtered by: MappingQualityAvailableReadFilter 
  5145854 read(s) filtered by: MappingQualityNotZeroReadFilter 
  0 read(s) filtered by: MappedReadFilter 
  3879470 read(s) filtered by: PrimaryLineReadFilter 
  98716086 read(s) filtered by: NotDuplicateReadFilter 
  0 read(s) filtered by: PassesVendorQualityCheckReadFilter 
  0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter 
  495025 read(s) filtered by: MateOnSameContigOrNoMappedMateReadFilter 
  0 read(s) filtered by: GoodCigarReadFilter 
  0 read(s) filtered by: WellformedReadFilter 
  108236435 total reads filtered out of 148946276 reads processed
  23:55:25.217 INFO  ProgressMeter -        chrY:19732575             31.0              64456547        2076339.9
  23:55:25.217 INFO  ProgressMeter - Traversal complete. Processed 64456547 total loci in 31.0 minutes.
  23:55:25.218 INFO  GetPileupSummaries - Shutting down engine
  [January 2, 2024 at 11:55:25 PM GMT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 31.08 minutes.
  Runtime.totalMemory()=1048576000
  Tool returned:
  SUCCESS
  .command.sh: line 12: sed: command not found
  .command.sh: line 12: cat: command not found
  .command.run: line 155: kill: (33) - No such process
  INFO:    Cleaning up image...

Work dir:
  /mnt/lustre/scratch/nlsas/home/usc/mg/translational_oncology/2_projects/7_ENDEVO/1_src/work/83/a77fbdf266bed78ccb2bd97c2957d2

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
```
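
To dig further, the failing task can be inspected directly in its work directory; a sketch, using the work dir path from the error above:

```bash
# Work directory of the failing GETPILEUPSUMMARIES_TUMOR task (from the error above)
cd /mnt/lustre/scratch/nlsas/home/usc/mg/translational_oncology/2_projects/7_ENDEVO/1_src/work/83/a77fbdf266bed78ccb2bd97c2957d2

# The exact script Nextflow ran, plus its captured stdout and stderr
cat .command.sh
cat .command.out .command.err

# Re-run the task in isolation to check whether 'command not found' reproduces
bash .command.run
```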


Relevant files

[nextflow.log](https://github.com/nf-core/sarek/files/13817457/nextflow.log)

System information

- Nextflow version - 23.04.2
- Hardware - HPC
- Executor - slurm
- Container engine - Singularity
- OS - Linux
- nf-core/sarek version - 3.4.0
asp8200 commented 8 months ago

Since the failing job should be running inside a Singularity container, the presence of sed and cat on your HPC host should be irrelevant. I suspect there was some glitch on your HPC, and I would try re-running the Nextflow command with the option -resume added.
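
For example, something like this (a sketch; everything except `-resume` is unchanged from your submission script above):

```bash
# Same invocation as before; -resume lets Nextflow reuse cached results and
# re-execute only the failed GetPileupSummaries task.
nextflow run nf-core/sarek -r 3.4.0 \
    -profile singularity \
    -resume \
    --step variant_calling \
    --input $samplesheet \
    --outdir $outdir
    # ...keep the remaining options from the original script unchanged
```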

Also, I recommend that you join us on Slack for help with handling this kind of error.

https://nfcore.slack.com/channels/sarek

Cheers

mimifp commented 8 months ago

OK, I'll try what you suggest.

Thanks!

FriederikeHanssen commented 8 months ago

Hi @mimifp, did the rerun work?

mimifp commented 8 months ago

Hi @FriederikeHanssen, and sorry for the delay in replying. The rerun did not work, so I chose to run Mutect2 separately with a pipeline that we have prepared. I wrote to the HPC administrators to investigate what might be going on; so far they have not found an answer.
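
For anyone hitting the same error, the failing step can also be reproduced by hand outside Nextflow (a sketch; the GATK container image and tag are assumptions, and the file names are taken from the log above):

```bash
# Re-run the failing GetPileupSummaries call manually inside a GATK container.
# Image tag is an assumption; inputs and outputs are the ones from the log above.
singularity exec docker://broadinstitute/gatk:4.4.0.0 \
    gatk --java-options "-Xmx9830M -XX:-UsePerfData" \
        GetPileupSummaries \
        --input P01B2R1.converted.cram \
        --variant af-only-gnomad.hg38.vcf.gz \
        --output P01B2R1.mutect2.pileups.table \
        --reference GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
        --intervals chr1_12081-12251.bed \
        --tmp-dir .
```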