UCSF-Costello-Lab / LG3_Pipeline

The original LG3 pipeline
https://github.com/UCSF-Costello-Lab/LG3_Pipeline

ERROR MESSAGE: Invalid argument value '-I ...' #7

Closed HenrikBengtsson closed 6 years ago

HenrikBengtsson commented 6 years ago
$ cat _Recal_Patient157.err
R scripting front-end version 3.4.2 (2017-09-28)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 1.6-5-g557da77): 
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Invalid argument value '-I /costellolab/data1/jocostello/LG3/exomes_recal/Patient157/Patient157.merged.bam -I /costellolab/data1/jocostello/LG3/exomes_recal/Patient157/Patient157.merged.realigned.bam ' at position 6.
##### ERROR ------------------------------------------------------------------------------------------

$ cat _Recal_Patient157.out
Started  on 2018-08-29 21:27:46
Using node(s): n27 n27 n27 n27 n27 n27 n27 n27 n27 n27 n27 n27 
------------------------------------------------------
[Recal] Base quality recalibration (bigmem version)
Wed Aug 29 21:27:48 PDT 2018
------------------------------------------------------
[Recal] Recalibration Group: Patient157
[Recal] Exome:/costellolab/data1/jocostello/LG3/exomes/Z00599/Z00599.trim.bwa.sorted.bam
[Recal] Exome:/costellolab/data1/jocostello/LG3/exomes/Z00600/Z00600.trim.bwa.sorted.bam
[Recal] Exome:/costellolab/data1/jocostello/LG3/exomes/Z00601/Z00601.trim.bwa.sorted.bam
[...]
-------------------------------------------------
-I /costellolab/data1/jocostello/LG3/exomes_recal/Patient157/Patient157.merged.bam -I /costellolab/data1/jocostello/LG3/exomes_recal/Patient157/Patient157.merged.realigned.bam 
[Germline] Running Unified Genotyper...
--------------------------------------------------------------------------------
[...]
Unified Genotyper SNP calling failed
End of script! Thu Aug 30 22:07:59 PDT 2018

The final failure message comes from one of the following files:

$ grep -F "Unified Genotyper SNP calling failed" scripts/*.sh
scripts/Germline.sh:        --out "${patientID}.UG.snps.raw.vcf" || { echo "Unified Genotyper SNP calling failed"; exit 1; }
scripts/MutDet.sh:      --out "${patientID}.${prefix}.UG.snps.raw.vcf" || { echo "Unified Genotyper SNP calling failed"; exit 1; }
scripts/Rob-MutDet-hg18.sh:     --out "${patientID}.${prefix}.UG.snps.raw.vcf" || { echo "Unified Genotyper SNP calling failed"; exit 1; }

HenrikBengtsson commented 6 years ago

This is most likely because of https://github.com/UCSF-Costello-Lab/LG3_Pipeline/blob/38405cf4b8352aa82c1e87c727cf3d412c7eb113/scripts/Germline.sh#L36-L51:

INPUTS=$(for i in ${bamdir}/*.bam
do
    echo -n "-I $i "
done)
echo "$INPUTS"

    ### $JAVA -Xmx16g \
    ### -nct 3 -nt 8 \
if [ ! -e "${patientID}.UG.snps.raw.vcf" ]; then
    echo "[Germline] Running Unified Genotyper..."
    $JAVA -Xmx64g \
        -jar $GATK \
        --analysis_type UnifiedGenotyper \
        --genotype_likelihoods_model SNP \
        --genotyping_mode DISCOVERY \
        "$INPUTS" \  <===
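
To illustrate why the quoted "$INPUTS" fails, here is a minimal sketch. The helper count_args and the file names a.bam and b.bam are hypothetical and not part of the pipeline. It builds the string the same way as the loop above (using printf instead of echo -n for portability) and shows that quoting hands the whole string over as one argument, which matches the GATK complaint about a single invalid argument value.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Build the option string as in the loop above, with made-up file names.
INPUTS=$(for i in a.bam b.bam; do printf '%s %s ' -I "$i"; done)

# Quoted: the whole string '-I a.bam -I b.bam ' arrives as one argument.
quoted=$(count_args "$INPUTS")

# Unquoted: the shell splits on whitespace into -I, a.bam, -I, b.bam.
# shellcheck disable=SC2086  # word splitting is intended here
unquoted=$(count_args $INPUTS)

echo "quoted: $quoted, unquoted: $unquoted"   # prints "quoted: 1, unquoted: 4"
```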

HenrikBengtsson commented 6 years ago

I've fixed scripts/Germline.sh and scripts/UG.sh in the develop branch.

@ivan108, please do git pull and relaunch tests.

HenrikBengtsson commented 6 years ago

Found yet more cases of this from manual inspection of the remaining scripts/*.sh files. Do another git pull.

ivan108 commented 6 years ago

Great, thanks for fixing this so quickly. I guess I introduced that bug when I quoted "INPUTS" ...

However, this is not the only problem we have. Before that, the Recal pipeline (scripts/Recal_bigmem.sh) quietly failed on line 214, right after the "Indel realignment" step but before deleting intermediate files (rm -f "${patientID}.merged.bam") ...

I also noticed "${inputs}" on line 46 of scripts/Recal_bigmem.sh. It seems we don't need the quotes there, but shellcheck insists on them?

ivan108 commented 6 years ago

... we don't need quotes?

HenrikBengtsson commented 6 years ago

I did remove those quotes in scripts/Recal_bigmem.sh as well, cf. commit 16c4c7d5.

> ... we don't need quotes?

We should not use quotes around ${inputs} and $INPUTS, but ideally we should probably quote each individual -I <file> as -I "<file>" when we construct those two variables/strings. I ignored that for now, but we could add it to our todo list.
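
One possible way to get per-file quoting, sketched under the assumption that the scripts run under bash (arrays are not available in plain POSIX sh), is to collect the options in an array instead of a string. The helper show_args is a hypothetical stand-in for the GATK call, and the temporary directory and file names are made up for illustration. This is not the change made in the commits mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the GATK call: prints each argument on its own line.
show_args() { printf '<%s>\n' "$@"; }

# Made-up directory containing one file name with a space in it.
bamdir=$(mktemp -d)
touch "$bamdir/a.bam" "$bamdir/b c.bam"

# Collect "-I" and each file path as separate array elements.
inputs=()
for f in "$bamdir"/*.bam; do
    inputs+=(-I "$f")
done

# "${inputs[@]}" expands to exactly one word per element, so the expansion
# can stay quoted and paths with spaces remain intact.
show_args "${inputs[@]}"

n=${#inputs[@]}   # 4 elements: -I, path, -I, path
rm -r "$bamdir"
```

With this form the expansion is quoted, which should also address the shellcheck warning about unquoted variables, and each -I <file> pair still reaches the program as two separate arguments.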

HenrikBengtsson commented 6 years ago

Moved the remaining part of this to Issue #9.