Open HenrikBengtsson opened 3 years ago
I tried to replace the original muTect-1.0.27783.jar with muTect-1.1.4.jar. The new muTect itself seems to work, but I get an error downstream in the filtering step, FilterMutations/Filter.py, which calls MuTector:
Warning: MuTector version (## muTector v1.0.47986) not what we are expecting (## muTector v1.0.27200)...
MuTectorc olumns not the expected columns.
#Col Actual Expected
Traceback (most recent call last):
File "/c4/home/jocostello/repos/LG3_Pipeline/FilterMutations/Filter.py", line 254, in <module>
sys.exit(main())
File "/c4/home/jocostello/repos/LG3_Pipeline/FilterMutations/Filter.py", line 82, in main
filterPointMutations(pointMutFn, mutations)
File "/c4/home/jocostello/repos/LG3_Pipeline/FilterMutations/Filter.py", line 135, in filterPointMutations
numCols = validateMutectorFile(version, rawHeader.replace('\t', ' '))
File "/c4/home/jocostello/repos/LG3_Pipeline/FilterMutations/MuTector.py", line 165, in validateMutectorFile
for i in xrange(max(actualLen, NumMutectorColumns)):
NameError: global name 'NumMutectorColumns' is not defined
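Reading the traceback, the failure is not the version or column mismatch itself but the code path that reports it: validateMutectorFile reaches the loop that should print the Actual/Expected table and trips over NumMutectorColumns, which is not defined at that point, so nothing is printed after the "#Col Actual Expected" line. Below is a minimal sketch of a column comparison that does not need a separate length constant. It is not the code in MuTector.py: EXPECTED_COLUMNS and compare_columns are hypothetical names, the column list is only illustrative, and the sketch is written for Python 3 (the pipeline's use of xrange suggests Python 2, where zip_longest is called izip_longest).

```python
from itertools import zip_longest

# Hypothetical expected header. The real list lives in MuTector.py and is
# not shown in this thread, so these names are for illustration only.
EXPECTED_COLUMNS = ["contig", "position", "ref_allele", "alt_allele"]


def compare_columns(actual_header, expected=EXPECTED_COLUMNS):
    """Return (index, actual, expected) for every column position that differs.

    zip_longest pads the shorter list, so no separate length constant
    (such as NumMutectorColumns) is needed to drive the loop.
    """
    actual = actual_header.split()
    mismatches = []
    pairs = zip_longest(actual, expected, fillvalue="<missing>")
    for i, (got, want) in enumerate(pairs):
        if got != want:
            mismatches.append((i, got, want))
    return mismatches
```

Calling compare_columns(rawHeader.replace('\t', ' ')) on a header with an extra or renamed column would then return the rows for the Actual/Expected table instead of raising.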
Thanks. Unfortunately, attempting to replace muTect v1.0.27783 with muTect v1.1.1 (sic!) also failed with the same error:
$ cat _MutDet_Z00601t10.out
Sourced: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/lg3.conf
Sourced: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/lg3.conf (63 bytes)
[2021-10-05 19:03:30 PDT] BEGIN: /var/spool/torque/mom_priv/jobs/2064182.cclc01.som.ucsf.edu.SC
Call: /var/spool/torque/mom_priv/jobs/2064182.cclc01.som.ucsf.edu.SC
Script: /var/spool/torque/mom_priv/jobs/2064182.cclc01.som.ucsf.edu.SC
Arguments:
Settings:
- LG3_HOME=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release
- LG3_INPUT_ROOT=output
- LG3_OUTPUT_ROOT=output
- EMAIL=henrik.bengtsson-gmail@fwd.braju.com
- PROJECT=LG3
- LG3_SCRATCH_ROOT=/scratch/henrik/2064182.cclc01.som.ucsf.edu
- PWD=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release
- USER=henrik
- PBS_NUM_PPN=4
- hostname=n27
Input:
- PATIENT=Patient157t10
- TUMOR=Z00601t10
- NORMAL=Z00599t10
- TYPE=REC1
- CONFIG=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/mutationConfig.cfg
- INTERVAL=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/All_exome_targets.extended_200bp.interval_list
- XMX=Xmx8g
- WORKDIR=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/mutations/Patient157t10_mutect
New working directory: '/scratch/henrik/2064182.cclc01.som.ucsf.edu/Patient157t10_mutect' (was '/cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release')
Starting MutDet job on Tue Oct 5 19:03:30 PDT 2021
Patient = Patient157t10
Normal = /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00599t10.bwa.realigned.rmDups.recal.bam
Tumor = /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00601t10.bwa.realigned.rmDups.recal.bam
Tum. Type = REC1
Config = /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/mutationConfig.cfg
Interval = /home/jocostello/shared/LG3_Pipeline_HIDE/resources/All_exome_targets.extended_200bp.interval_list
Java Memory = Xmx8g
WORKDIR=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/mutations/Patient157t10_mutect
SCRATCH=/scratch/henrik/2064182.cclc01.som.ucsf.edu/Patient157t10_mutect
Sourced: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/lg3.conf
[2021-10-05 19:03:30 PDT] BEGIN: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/scripts/MutDet.sh
Call: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/scripts/MutDet.sh
Script: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/scripts/MutDet.sh
Arguments: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00599t10.bwa.realigned.rmDups.recal.bam /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00601t10.bwa.realigned.rmDups.recal.bam NOR-Z00599t10__REC1-Z00601t10 Patient157t10 /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/mutationConfig.cfg /home/jocostello/shared/LG3_Pipeline_HIDE/resources/All_exome_targets.extended_200bp.interval_list Xmx8g
Settings:
- LG3_HOME=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release
- LG3_OUTPUT_ROOT=output
- LG3_SCRATCH_ROOT=/scratch/henrik/2064182.cclc01.som.ucsf.edu
- PWD=/scratch/henrik/2064182.cclc01.som.ucsf.edu/Patient157t10_mutect
- USER=henrik
- PBS_NUM_PPN=4
- hostname=n27
- ncores=4
Input:
- nbamfile=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00599t10.bwa.realigned.rmDups.recal.bam
- tbamfile=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00601t10.bwa.realigned.rmDups.recal.bam
- prefix=NOR-Z00599t10__REC1-Z00601t10
- patientID=Patient157t10
- CONFIG=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/mutationConfig.cfg
- ILIST=/home/jocostello/shared/LG3_Pipeline_HIDE/resources/All_exome_targets.extended_200bp.interval_list
- XMX=Xmx8g
Software:
- JAVA=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/tools/java/jre1.6.0_27/bin/java
- PYTHON=/usr/bin/python
- MUTECT=/home/shared/cbc/software_cbc/mutect-1.1.1/muTect-1.1.1.jar
- FILTER=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/Filter.py
- REORDER=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/scripts/vcf_reorder.py
References:
- REF=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/UCSC_HG19_Feb_2009/hg19.fa
- DBSNP=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/dbsnp_132.hg19.sorted.vcf
- REORDER=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/scripts/vcf_reorder.py
- CONVERT=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/RefSeq.Entrez.txt
- KINASEDATA=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/all_human_kinases.txt
- COSMICDATA=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/CosmicMutantExport_v58_150312.tsv
- CANCERDATA=/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/SangerCancerGeneCensus_2012-03-15.txt
-------------------------------------------------
[MutDet] Mutation detection Tue Oct 5 19:03:30 PDT 2021
-------------------------------------------------
[MutDet] Patient ID: Patient157t10
[MutDet] Normal Sample: Z00599t10
[MutDet] Tumor Sample: Z00601t10
[MutDet] Prefix: NOR-Z00599t10__REC1-Z00601t10
-------------------------------------------------
[MutDet] Normal bam file: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00599t10.bwa.realigned.rmDups.recal.bam
[MutDet] Tumor bam file: /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00601t10.bwa.realigned.rmDups.recal.bam
[MutDet] Java Memory Xmx value: Xmx8g
[MutDet] Working directory: /scratch/henrik/2064182.cclc01.som.ucsf.edu/Patient157t10_mutect
-------------------------------------------------
[MutDet] Running muTect...
WARN 19:39:16,450 RestStorageService - Error Response: PUT '/GATK_Run_Reports/lDhttZCwmz5Ruf4shIpKQIfUdQ8uEHja.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 419, Content-MD5: rnYUhg6epuz4I4rUF7Wj1w==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: ae7614860e9ea6ecf8238ad417b5a3d7, Date: Wed, 06 Oct 2021 02:39:15 GMT, Authorization: AWS AKIAJXU7VIHBPDW4TDSQ:ivz9hbrp8Dt2DTF4b1oUiuR1j0I=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-504.12.2.el6.664g0000.x86_64; amd64; en; JVM 1.6.0_27), Host: s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 8AS8NRNHAEQGE5ZY, x-amz-id-2: 1Vf6WrvfAYoGn/p0GPVxqVSrNF38/m4IPGH1b7Gw2MJmG1qbliI2TohWu1gYkSlwXEwVhRXPu7A=, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Wed, 06 Oct 2021 02:39:15 GMT, Server: AmazonS3, Connection: close]
real 35m45.976s
user 140m35.359s
sys 1m52.980s
Done
15200 NOR-Z00599t10__REC1-Z00601t10.snvs.raw.mutect.txt
[MutDet] Running Somatic Indel Detector...
INFO 19:39:18,393 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:39:18,395 HelpFormatter - The Genome Analysis Toolkit (GATK) v1.6-5-g557da77, Compiled 2012/05/03 17:30:26
INFO 19:39:18,395 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 19:39:18,396 HelpFormatter - Please view our documentation at http://www.broadinstitute.org/gsa/wiki
INFO 19:39:18,396 HelpFormatter - For support, please view our support site at http://getsatisfaction.com/gsa
INFO 19:39:18,396 HelpFormatter - Program Args: --analysis_type SomaticIndelDetector -I:normal /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00599t10.bwa.realigned.rmDups.recal.bam -I:tumor /cbc2/data2/henrik/repositories/UCSF-CostelloLab/test-next-release/output/LG3/exomes_recal/Patient157t10/Z00601t10.bwa.realigned.rmDups.recal.bam --logging_level INFO --reference_sequence /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/resources/UCSC_HG19_Feb_2009/hg19.fa --intervals /home/jocostello/shared/LG3_Pipeline_HIDE/resources/All_exome_targets.extended_200bp.interval_list -baq CALCULATE_AS_NECESSARY --maxNumberOfReads 10000 --window_size 350 --filter_expressions N_COV<8||T_COV<14||T_INDEL_F<0.1||T_INDEL_CF<0.7 --out NOR-Z00599t10__REC1-Z00601t10.indels.raw.vcf
INFO 19:39:18,397 HelpFormatter - Date/Time: 2021/10/05 19:39:18
INFO 19:39:18,397 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:39:18,397 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:39:18,418 GenomeAnalysisEngine - Strictness is SILENT
INFO 19:39:18,471 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 19:39:18,495 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 19:39:19,809 SomaticIndelDetectorWalker - No gene annotations available
INFO 19:39:23,893 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING]
INFO 19:39:23,893 TraversalEngine - Location processed.reads runtime per.1M.reads completed total.runtime remaining
INFO 19:40:40,874 TraversalEngine - chr17:97210 1.00e+03 80.2 s 22.3 h 78.2% 102.6 s 22.4 s
INFO 19:41:10,934 TraversalEngine - chr17:7839985 1.81e+05 110.3 s 10.2 m 79.0% 2.3 m 29.4 s
INFO 19:41:41,227 TraversalEngine - chr17:18855823 3.31e+05 2.3 m 7.1 m 79.6% 2.9 m 36.1 s
INFO 19:42:11,274 TraversalEngine - chr17:34106174 4.97e+05 2.8 m 5.7 m 80.3% 3.5 m 41.9 s
INFO 19:42:41,289 TraversalEngine - chr17:42154106 6.81e+05 3.3 m 4.9 m 81.2% 4.1 m 46.6 s
INFO 19:43:11,484 TraversalEngine - chr17:56233486 8.48e+05 3.8 m 4.5 m 81.9% 4.7 m 51.1 s
INFO 19:43:41,511 TraversalEngine - chr17:66925656 1.00e+06 4.3 m 4.3 m 82.5% 5.3 m 55.2 s
INFO 19:44:11,538 TraversalEngine - chr17:79655939 1.20e+06 4.8 m 4.0 m 83.4% 5.8 m 58.0 s
INFO 19:44:41,540 TraversalEngine - chr19:5668357 1.45e+06 5.3 m 3.7 m 85.7% 6.2 m 53.3 s
INFO 19:45:11,572 TraversalEngine - chr19:13925364 1.71e+06 5.8 m 3.4 m 86.7% 6.7 m 54.1 s
INFO 19:45:41,796 TraversalEngine - chr19:23578160 1.89e+06 6.4 m 3.4 m 87.4% 7.3 m 54.8 s
INFO 19:46:11,865 TraversalEngine - chr19:40886445 2.07e+06 6.9 m 3.3 m 88.2% 7.8 m 54.9 s
INFO 19:46:41,896 TraversalEngine - chr19:49006234 2.25e+06 7.4 m 3.3 m 89.0% 8.3 m 54.3 s
INFO 19:47:12,028 TraversalEngine - chr19:56249473 2.46e+06 7.9 m 3.2 m 89.9% 8.7 m 52.7 s
INFO 19:47:26,003 Walker - [REDUCE RESULT] Traversal result is: 2511833
INFO 19:47:26,003 TraversalEngine - Total runtime 485.36 secs, 8.09 min, 0.13 hours
INFO 19:47:26,055 TraversalEngine - 116043 reads were filtered out during traversal out of 2627951 total (4.42%)
INFO 19:47:26,055 TraversalEngine - -> 116043 reads (4.42% of total) failing MappingQualityZeroFilter
WARN 19:47:27,078 RestStorageService - Error Response: PUT '/GATK_Run_Reports/OghmfKohj3Xndyb8pmIl1LEIynZTUkJd.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 334, Content-MD5: j/moAkl4naH5ZEOaaXl5gQ==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: 8ff9a80249789da1f964439a69797981, Date: Wed, 06 Oct 2021 02:47:26 GMT, Authorization: AWS AKIAJXU7VIHBPDW4TDSQ:lc7M3ga8zwS/nPqL/6Egx2dTjKo=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-504.12.2.el6.664g0000.x86_64; amd64; en; JVM 1.6.0_27), Host: s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: A9DPSXZPEV4G3RN1, x-amz-id-2: nAE+0r6Xkyw1WsgNqRyc1AYIM+gxQ+OmSrC19b3vZq9CQyO0DdYzNAd5woqh2eX6c+FmDoUa+Js=, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Wed, 06 Oct 2021 02:47:26 GMT, Server: AmazonS3, Connection: close]
real 8m10.609s
user 9m33.513s
sys 0m7.670s
Done
667 NOR-Z00599t10__REC1-Z00601t10.indels.raw.vcf
[MutDet] Annotating raw indel calls...
INFO 19:47:28,968 RodBindingArgumentTypeDescriptor - Dynamically determined type of NOR-Z00599t10__REC1-Z00601t10.indels.raw.vcf to be VCF
WARN 19:47:36,850 RestStorageService - Error Response: PUT '/GATK_Run_Reports/mm8JLMSL491v4MQnd4t2xJ4G7UjxdCVl.report.xml.gz' -- ResponseCode: 403, ResponseStatus: Forbidden, Request Headers: [Content-Length: 323, Content-MD5: oOoD53mdt1wjLP0PA+bzMg==, Content-Type: application/octet-stream, x-amz-meta-md5-hash: a0ea03e7799db75c232cfd0f03e6f332, Date: Wed, 06 Oct 2021 02:47:36 GMT, Authorization: AWS AKIAJXU7VIHBPDW4TDSQ:UrricL2kbsxwy5JbnHBSN1fSow0=, User-Agent: JetS3t/0.8.1 (Linux/2.6.32-504.12.2.el6.664g0000.x86_64; amd64; en; JVM 1.6.0_27), Host: s3.amazonaws.com, Expect: 100-continue], Response Headers: [x-amz-request-id: 5DBJVMRAC93EV3BA, x-amz-id-2: PvQn/mNQk0/5+SjBtgC5uDA/IWs/wgeZOXBgJFuJ6Yi3rTvk4yA20i3OMuvlchxrnlQuLNoq8no=, Content-Type: application/xml, Transfer-Encoding: chunked, Date: Wed, 06 Oct 2021 02:47:36 GMT, Server: AmazonS3, Connection: close]
real 0m9.761s
user 0m17.581s
sys 0m2.042s
Done
684 NOR-Z00599t10__REC1-Z00601t10.indels.annotated.vcf
[MutDet] Reordering indel vcf...
real 0m0.321s
user 0m0.049s
sys 0m0.028s
Done
684 NOR-Z00599t10__REC1-Z00601t10.indels.annotated.temp.vcf
[MutDet] Filtering mutect and indel output...
Warning: MuTector version (## muTector v1.0.44829) not what we are expecting (## muTector v1.0.27200)...
MuTectorc olumns not the expected columns.
#Col Actual Expected
Traceback (most recent call last):
File "/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/Filter.py", line 254, in <module>
sys.exit(main())
File "/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/Filter.py", line 82, in main
filterPointMutations(pointMutFn, mutations)
File "/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/Filter.py", line 135, in filterPointMutations
numCols = validateMutectorFile(version, rawHeader.replace('\t', ' '))
File "/cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/FilterMutations/MuTector.py", line 165, in validateMutectorFile
for i in xrange(max(actualLen, NumMutectorColumns)):
NameError: global name 'NumMutectorColumns' is not defined
real 0m0.416s
user 0m0.060s
sys 0m0.128s
ERROR: Filtering failed
Traceback:
1: main() on line #219 in /cbc2/data2/henrik/repositories/UCSF-CostelloLab/LG3_Pipeline-next-release/scripts/MutDet.sh
Exiting (exit 1)
ERROR: MutDet failed
Traceback:
1: main() on line #101 in /var/spool/torque/mom_priv/jobs/2064182.cclc01.som.ucsf.edu.SC
Exiting (exit 1)
[henrik@cclc01 ~/repositories/UCSF-CostelloLab/test-next-release]$
This will make it possible to get 100% reproducible runs by using the CLI option --disableRandomization (https://github.com/UCSF-Costello-Lab/LG3_Pipeline/issues/141#issuecomment-634226939). With 100% reproducible runs, we can move forward and replace other things. If we break something, our reproducibility tests will catch it.
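To illustrate what such a reproducibility test could look like, here is a rough sketch that compares two output files by checksum. It is not the pipeline's actual test code: content_digest and same_calls are hypothetical helpers, and skipping lines that start with "##" is an assumption, made because header lines of that form (like the "## muTector v..." line in the log above) can carry version or run metadata rather than calls.

```python
import hashlib


def content_digest(path, skip_prefix=b"##"):
    """SHA-256 of a file's contents, ignoring lines that start with skip_prefix."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for line in handle:
            # Assumption: "##" lines are metadata, not mutation calls.
            if skip_prefix and line.startswith(skip_prefix):
                continue
            digest.update(line)
    return digest.hexdigest()


def same_calls(path_a, path_b):
    """True if the two files are byte-identical apart from skipped header lines."""
    return content_digest(path_a) == content_digest(path_b)
```

Running same_calls on the same output file from two runs of the same input (for example the .snvs.raw.mutect.txt file) would return True only when the runs agree line for line.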