nf-core / rnavar

gatk4 RNA variant calling pipeline
https://nf-co.re/rnavar
MIT License

SplitNCigarReads: sample cross-contamination #119

Open dbatrakou opened 8 months ago

dbatrakou commented 8 months ago

Description of the bug

This might be a Nextflow bug. While investigating a workflow error caused by a sample name collision (similar to #108), I tracked the issue down to this .command.sh:

#!/bin/bash -euo pipefail
gatk --java-options "-Xmx36g" SplitNCigarReads \
    --input PS036.markdup.sorted.bam \
    --output PS044.bam \
    --reference Homo_sapiens.GRCh38.dna.primary_assembly.fa \
    --intervals 3scattered.interval_list \
    --tmp-dir . \
    --create-output-bam-index false

cat <<-END_VERSIONS > versions.yml
"NFCORE_RNAVAR:RNAVAR:SPLITNCIGAR:GATK4_SPLITNCIGARREADS":
    gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
END_VERSIONS

Either the --output parameter should be PS036_3scattered.bam, or the --input parameter should be PS044.markdup.sorted.bam and the --output should then be PS044_3scattered.bam. Nextflow listed this task under the PS044 sample, so the latter case is more likely (wrong staging?).
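To find other tasks affected by the same mismatch, a small check along these lines could be run over the generated .command.sh files. This is only a sketch, not part of the pipeline: `check_sample_mismatch` is a hypothetical helper, and it assumes the sample ID is everything before the first "." in the input BAM name and that a correct output name starts with that ID followed by "." or "_".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-core/rnavar): flag a .command.sh whose
# --output file name does not carry the same sample ID as its --input.
# Assumption: the sample ID is everything before the first "." in the
# input BAM file name (e.g. PS036 in PS036.markdup.sorted.bam).
check_sample_mismatch() {
    local script="$1" input output sample
    # Take the value that follows the first --input / --output flag.
    input=$(awk '{for (i = 1; i < NF; i++) if ($i == "--input") {print $(i + 1); exit}}' "$script")
    output=$(awk '{for (i = 1; i < NF; i++) if ($i == "--output") {print $(i + 1); exit}}' "$script")
    # Scripts without both flags are not SplitNCigarReads-style tasks; skip them.
    if [ -z "$input" ] || [ -z "$output" ]; then
        return 0
    fi
    # Drop any directory part, then derive the sample ID from the input name.
    input="${input##*/}"
    output="${output##*/}"
    sample="${input%%.*}"
    case "$output" in
        "$sample".*|"$sample"_*)
            return 0
            ;;
        *)
            echo "MISMATCH $script: input=$input output=$output"
            return 1
            ;;
    esac
}

# Example (assumes the default Nextflow work directory, ./work):
#   find work -name .command.sh | while read -r f; do
#       check_sample_mismatch "$f"
#   done
```

On the script above, this would report input=PS036.markdup.sorted.bam against output=PS044.bam. Sample IDs that themselves contain a "." would need a different rule for deriving the ID.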

Command used and terminal output

nextflow run -resume nf-core/rnavar -r 1.0.0 \
    -profile docker,dt02 \
    --input /data/rnaseq/tissues/batch2/tissues_batch2.csv \
    --fasta /data/reference/genomes/ensembl/release-110/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
    --gtf /data/reference/genomes/ensembl/release-110/Homo_sapiens.GRCh38.110.gtf \
    --known_indels /data/reference/Homo_sapiens_assembly38.known_indels.vcf.gz \
    --known_indels_tbi /data/reference/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi \
    --skip_variantannotation \
    --star_index /data/reference/genomes/ensembl/release-110/STAR_sjdb149

Relevant files

tissues_batch2.csv

System information

Nextflow version 23.10.1 build 5891 (conda), local executor on desktop Ubuntu 22.04.3, rnavar v1.0.0, Docker containers