dmiller903 / trioPhaser

An application to phase a trio when gVCFs are available
MIT License

/tmp/combined.vcf.gz and /tmp/genotyped.vcf.gz are missing? #1

Closed rupabose closed 2 years ago

rupabose commented 2 years ago

I've tried running trioPhaser on the example data and on my own. In all instances, I get the same error and traceback:

A USER ERROR has occurred: Couldn't read file file:///tmp/combined.vcf.gz. Error was: It doesn't exist.

Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Trio has been join-genotyped. Time elapsed: 3.52 minutes.
Traceback (most recent call last):
  File "/trio_phaser.py", line 346, in <module>
    with gzip.open(temp_genotyped_name, "rt") as vcf:
  File "/usr/lib/python3.6/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/usr/lib/python3.6/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/genotyped.vcf.gz'

I've tried pulling again to make sure I have the latest version of the code, which I do; the issue remains the same.

dmiller903 commented 2 years ago

Can you please share everything that is in the log, along with the command you ran? The genotyped file is not being written, so something is failing at an earlier step. My first suspicion is that it is memory related. How much memory does your machine have, and how much is your Docker engine allowed to use? If Docker is limited to less than 4GB of memory, that will most likely cause a failure during the joint-genotyping step, which requires 4GB. I think Docker defaults to about 2GB.
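
One way to check the memory question from the host is sketched below. It assumes the docker CLI is on the PATH and reads the MemTotal field that docker info reports; the function names and the 4 GiB threshold constant are illustrative and are not part of trioPhaser.

```python
import subprocess

REQUIRED_GIB = 4  # figure quoted above for the joint-genotyping step


def has_enough_memory(mem_total_bytes, required_gib=REQUIRED_GIB):
    """Return True if the reported memory is at least `required_gib` GiB."""
    return mem_total_bytes >= required_gib * 1024 ** 3


def docker_mem_total_bytes():
    """Ask the Docker daemon how much memory it can see (needs the docker CLI)."""
    out = subprocess.run(
        ["docker", "info", "--format", "{{.MemTotal}}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return int(out.strip())
```

Calling has_enough_memory(docker_mem_total_bytes()) on the machine that runs the container gives a quick yes/no before starting a long run.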

rupabose commented 2 years ago

Thanks! There is no memory limit set for Docker on my machine right now, so I don't think that's the issue. I think it might be having trouble creating the /tmp/genotyped.vcf.gz file to start with.

Here is what I was running:

sudo docker run -v /home/rupa/phasing_truth_sets_data:/proj -w /proj -t dmill903/triophaser:latest python3 /trio_phaser.py \
        -r /proj/haplotype_references \
        -t /proj/triophaserfile.tsv

And here is the entire log:

Archive:  /fasta_references.zip
  inflating: /fasta_references/readme  
  inflating: /fasta_references/convert_reference.py  
  inflating: /fasta_references/reference_table.tsv  
 extracting: /fasta_references/human_g1k_v37_modified.fasta.gz  
  inflating: /fasta_references/human_g1k_v37_modified.dict  
  inflating: /fasta_references/human_g1k_v37_modified.fasta.fai  
 extracting: /fasta_references/Homo_sapiens_assembly38.dict.gz  
 extracting: /fasta_references/Homo_sapiens_assembly38.fasta.fai.gz  
 extracting: /fasta_references/Homo_sapiens_assembly38.fasta.gz  
 extracting: /fasta_references/hg38ToHg19.over.chain.gz  
Variant-only positions of the child have been written to a temporary file. Time elapsed: 0.29 minutes
Positions in HG00404.vcf.gz that correspond to variant-only positions of child have been output to temporary file.
Positions in HG00403.vcf.gz that correspond to variant-only positions of child have been output to temporary file.
Positions of each parent that correspond to variant-only positions of child have been output to temporary file. Time elapsed: 0.27 minutes.
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:117: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 1 and args[0] == "--list":
/root/miniconda3/bin//gatk:301: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "-q", "stat", gcsjar]) is 0:
/root/miniconda3/bin//gatk:305: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "cp", jar, gcsjar]) is 0:
/root/miniconda3/bin//gatk:458: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(properties) is 0:
/root/miniconda3/bin//gatk:462: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(filesToAdd) is 0:
Using GATK jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar IndexFeatureFile -F /tmp/child_parsed.vcf.gz
21:58:05.131 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:58:05.217 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:05.217 INFO  IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.0.5.1
21:58:05.217 INFO  IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
21:58:05.217 INFO  IndexFeatureFile - Executing as root@0e764f05c628 on Linux v5.11.0-1027-aws amd64
21:58:05.217 INFO  IndexFeatureFile - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:58:05.217 INFO  IndexFeatureFile - Start Date/Time: February 15, 2022 9:58:05 PM GMT
21:58:05.217 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:05.218 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:05.218 INFO  IndexFeatureFile - HTSJDK Version: 2.15.1
21:58:05.218 INFO  IndexFeatureFile - Picard Version: 2.18.2
21:58:05.218 INFO  IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:58:05.218 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:58:05.218 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:58:05.218 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:58:05.218 INFO  IndexFeatureFile - Deflater: IntelDeflater
21:58:05.218 INFO  IndexFeatureFile - Inflater: IntelInflater
21:58:05.218 INFO  IndexFeatureFile - GCS max retries/reopens: 20
21:58:05.218 INFO  IndexFeatureFile - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:58:05.218 INFO  IndexFeatureFile - Initializing engine
21:58:05.218 INFO  IndexFeatureFile - Done initializing engine
21:58:05.478 INFO  FeatureManager - Using codec VCFCodec to read file file:///tmp/child_parsed.vcf.gz
21:58:05.488 INFO  ProgressMeter - Starting traversal
21:58:05.489 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Records Processed   Records/Minute
21:58:08.810 INFO  IndexFeatureFile - Shutting down engine
[February 15, 2022 9:58:08 PM GMT] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 0.06 minutes.
Runtime.totalMemory()=1684537344
***********************************************************************

A USER ERROR has occurred: Error while trying to create index for /tmp/child_parsed.vcf.gz. Error was: htsjdk.tribble.TribbleException: Line 155503: there aren't enough columns for line chr20 5854872 20:5854877:C:CTTTTTTTT  C   CTTTTTTTT   .   PASS    AC=6;AC_AFR=0;AC_AMR=0;AC_EAS=0;AC_EUR=0;AC_Het=18;AC_Het_AFR=0;AC_Het_AMR=0;AC_Het_EAS=0;AC_Het_EUR=0;AC_Het_SAS=18;AC_Hom=0;AC_Hom_AFR=0;AC_Hom_AMR=0;AC_Hom_EAS=0;AC_Hom_EUR=0;AC_Hom_SAS=0;AC_SAS=18;AF=0.00281074;AF_AFR=0;AF_AMR=0;AF_EAS=0;AF_EUR=0;AF_SAS=0.014975;AN=3586;AN_AFR=1786;AN_AMR=980;AN_EAS=1170;AN_EUR=1266;AN_SAS=1202;BaseQRankSum=0.133;ClippingRankSum=0.028;DP=80308;FS=0;HWE=1;HWE_AFR=1;HWE_AMR=1;HWE_EAS=1;HWE_EUR=1;HWE_SAS=1;Inbreedi (we expected 9 tokens, and saw 8 )

***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:117: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 1 and args[0] == "--list":
/root/miniconda3/bin//gatk:301: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "-q", "stat", gcsjar]) is 0:
/root/miniconda3/bin//gatk:305: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "cp", jar, gcsjar]) is 0:
/root/miniconda3/bin//gatk:458: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(properties) is 0:
/root/miniconda3/bin//gatk:462: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(filesToAdd) is 0:
Using GATK jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar IndexFeatureFile -F /tmp/paternal_parsed.vcf.gz
21:58:11.036 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:58:11.111 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:11.111 INFO  IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.0.5.1
21:58:11.111 INFO  IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
21:58:11.111 INFO  IndexFeatureFile - Executing as root@0e764f05c628 on Linux v5.11.0-1027-aws amd64
21:58:11.111 INFO  IndexFeatureFile - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:58:11.112 INFO  IndexFeatureFile - Start Date/Time: February 15, 2022 9:58:11 PM GMT
21:58:11.112 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:11.112 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:11.112 INFO  IndexFeatureFile - HTSJDK Version: 2.15.1
21:58:11.112 INFO  IndexFeatureFile - Picard Version: 2.18.2
21:58:11.112 INFO  IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:58:11.112 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:58:11.112 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:58:11.112 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:58:11.112 INFO  IndexFeatureFile - Deflater: IntelDeflater
21:58:11.112 INFO  IndexFeatureFile - Inflater: IntelInflater
21:58:11.113 INFO  IndexFeatureFile - GCS max retries/reopens: 20
21:58:11.113 INFO  IndexFeatureFile - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:58:11.113 INFO  IndexFeatureFile - Initializing engine
21:58:11.113 INFO  IndexFeatureFile - Done initializing engine
21:58:11.375 INFO  FeatureManager - Using codec VCFCodec to read file file:///tmp/paternal_parsed.vcf.gz
21:58:11.386 INFO  ProgressMeter - Starting traversal
21:58:11.386 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Records Processed   Records/Minute
21:58:14.688 INFO  IndexFeatureFile - Shutting down engine
[February 15, 2022 9:58:14 PM GMT] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 0.06 minutes.
Runtime.totalMemory()=1819279360
***********************************************************************

A USER ERROR has occurred: Error while trying to create index for /tmp/paternal_parsed.vcf.gz. Error was: htsjdk.tribble.TribbleException: Line 155503: there aren't enough columns for line chr20  5854872 20:5854877:C:CTTTTTTTT  C   CTTTTTTTT   .   PASS    AC=6;AC_AFR=0;AC_AMR=0;AC_EAS=0;AC_EUR=0;AC_Het=18;AC_Het_AFR=0;AC_Het_AMR=0;AC_Het_EAS=0;AC_Het_EUR=0;AC_Het_SAS=18;AC_Hom=0;AC_Hom_AFR=0;AC_Hom_AMR=0;AC_Hom_EAS=0;AC_Hom_EUR=0;AC_Hom_SAS=0;AC_SAS=18;AF=0.00281074;AF_AFR=0;AF_AMR=0;AF_EAS=0;AF_EUR=0;AF_SAS=0.014975;AN=3586;AN_AFR=1786;AN_AMR=980;AN_EAS=1170;AN_EUR=1266;AN_SAS=1202;BaseQRankSum=0.133;ClippingRankSum=0.028;DP=80308;FS=0;HWE=1;HWE_AFR=1;HWE_AMR=1;HWE_EAS=1;HWE_EUR=1;HWE_SAS=1;Inbreedi (we expected 9 tokens, and saw 8 )

***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:117: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 1 and args[0] == "--list":
/root/miniconda3/bin//gatk:301: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "-q", "stat", gcsjar]) is 0:
/root/miniconda3/bin//gatk:305: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "cp", jar, gcsjar]) is 0:
/root/miniconda3/bin//gatk:458: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(properties) is 0:
/root/miniconda3/bin//gatk:462: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(filesToAdd) is 0:
Using GATK jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar IndexFeatureFile -F /tmp/maternal_parsed.vcf.gz
21:58:16.505 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:58:16.582 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:16.582 INFO  IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.0.5.1
21:58:16.582 INFO  IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
21:58:16.582 INFO  IndexFeatureFile - Executing as root@0e764f05c628 on Linux v5.11.0-1027-aws amd64
21:58:16.582 INFO  IndexFeatureFile - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:58:16.583 INFO  IndexFeatureFile - Start Date/Time: February 15, 2022 9:58:16 PM GMT
21:58:16.583 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:16.583 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:16.583 INFO  IndexFeatureFile - HTSJDK Version: 2.15.1
21:58:16.583 INFO  IndexFeatureFile - Picard Version: 2.18.2
21:58:16.583 INFO  IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:58:16.583 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:58:16.583 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:58:16.583 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:58:16.583 INFO  IndexFeatureFile - Deflater: IntelDeflater
21:58:16.584 INFO  IndexFeatureFile - Inflater: IntelInflater
21:58:16.584 INFO  IndexFeatureFile - GCS max retries/reopens: 20
21:58:16.584 INFO  IndexFeatureFile - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:58:16.584 INFO  IndexFeatureFile - Initializing engine
21:58:16.584 INFO  IndexFeatureFile - Done initializing engine
21:58:16.841 INFO  FeatureManager - Using codec VCFCodec to read file file:///tmp/maternal_parsed.vcf.gz
21:58:16.852 INFO  ProgressMeter - Starting traversal
21:58:16.852 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Records Processed   Records/Minute
21:58:20.112 INFO  IndexFeatureFile - Shutting down engine
[February 15, 2022 9:58:20 PM GMT] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 0.06 minutes.
Runtime.totalMemory()=1684013056
***********************************************************************

A USER ERROR has occurred: Error while trying to create index for /tmp/maternal_parsed.vcf.gz. Error was: htsjdk.tribble.TribbleException: Line 155503: there aren't enough columns for line chr20  5854872 20:5854877:C:CTTTTTTTT  C   CTTTTTTTT   .   PASS    AC=6;AC_AFR=0;AC_AMR=0;AC_EAS=0;AC_EUR=0;AC_Het=18;AC_Het_AFR=0;AC_Het_AMR=0;AC_Het_EAS=0;AC_Het_EUR=0;AC_Het_SAS=18;AC_Hom=0;AC_Hom_AFR=0;AC_Hom_AMR=0;AC_Hom_EAS=0;AC_Hom_EUR=0;AC_Hom_SAS=0;AC_SAS=18;AF=0.00281074;AF_AFR=0;AF_AMR=0;AF_EAS=0;AF_EUR=0;AF_SAS=0.014975;AN=3586;AN_AFR=1786;AN_AMR=980;AN_EAS=1170;AN_EUR=1266;AN_SAS=1202;BaseQRankSum=0.133;ClippingRankSum=0.028;DP=80308;FS=0;HWE=1;HWE_AFR=1;HWE_AMR=1;HWE_EAS=1;HWE_EUR=1;HWE_SAS=1;Inbreedi (we expected 9 tokens, and saw 8 )

***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:117: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 1 and args[0] == "--list":
/root/miniconda3/bin//gatk:301: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "-q", "stat", gcsjar]) is 0:
/root/miniconda3/bin//gatk:305: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "cp", jar, gcsjar]) is 0:
/root/miniconda3/bin//gatk:458: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(properties) is 0:
/root/miniconda3/bin//gatk:462: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(filesToAdd) is 0:
Using GATK jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar CombineGVCFs -R /fasta_references/Homo_sapiens_assembly38.fasta -V /tmp/child_parsed.vcf.gz -V /tmp/paternal_parsed.vcf.gz -V /tmp/maternal_parsed.vcf.gz -O /tmp/combined.vcf.gz
21:58:22.164 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:58:22.242 INFO  CombineGVCFs - ------------------------------------------------------------
21:58:22.243 INFO  CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.0.5.1
21:58:22.243 INFO  CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
21:58:22.243 INFO  CombineGVCFs - Executing as root@0e764f05c628 on Linux v5.11.0-1027-aws amd64
21:58:22.243 INFO  CombineGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:58:22.243 INFO  CombineGVCFs - Start Date/Time: February 15, 2022 9:58:22 PM GMT
21:58:22.243 INFO  CombineGVCFs - ------------------------------------------------------------
21:58:22.243 INFO  CombineGVCFs - ------------------------------------------------------------
21:58:22.244 INFO  CombineGVCFs - HTSJDK Version: 2.15.1
21:58:22.244 INFO  CombineGVCFs - Picard Version: 2.18.2
21:58:22.244 INFO  CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:58:22.244 INFO  CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:58:22.244 INFO  CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:58:22.244 INFO  CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:58:22.244 INFO  CombineGVCFs - Deflater: IntelDeflater
21:58:22.244 INFO  CombineGVCFs - Inflater: IntelInflater
21:58:22.244 INFO  CombineGVCFs - GCS max retries/reopens: 20
21:58:22.244 INFO  CombineGVCFs - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:58:22.244 INFO  CombineGVCFs - Initializing engine
21:58:22.602 INFO  FeatureManager - Using codec VCFCodec to read file file:///tmp/child_parsed.vcf.gz
21:58:22.610 INFO  CombineGVCFs - Shutting down engine
[February 15, 2022 9:58:22 PM GMT] org.broadinstitute.hellbender.tools.walkers.CombineGVCFs done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=1212678144
***********************************************************************

A USER ERROR has occurred: An index is required but was not found for file /tmp/child_parsed.vcf.gz. Support for unindexed block-compressed files has been temporarily disabled. Try running IndexFeatureFile on the input.

***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Trio has been combined and written to a temporary file. Time elapsed: 0.32 minutes.
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:117: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 1 and args[0] == "--list":
/root/miniconda3/bin//gatk:301: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "-q", "stat", gcsjar]) is 0:
/root/miniconda3/bin//gatk:305: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "cp", jar, gcsjar]) is 0:
/root/miniconda3/bin//gatk:458: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(properties) is 0:
/root/miniconda3/bin//gatk:462: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(filesToAdd) is 0:
Using GATK jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar IndexFeatureFile -F /tmp/combined.vcf.gz
21:58:24.413 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:58:24.490 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:24.491 INFO  IndexFeatureFile - The Genome Analysis Toolkit (GATK) v4.0.5.1
21:58:24.491 INFO  IndexFeatureFile - For support and documentation go to https://software.broadinstitute.org/gatk/
21:58:24.491 INFO  IndexFeatureFile - Executing as root@0e764f05c628 on Linux v5.11.0-1027-aws amd64
21:58:24.491 INFO  IndexFeatureFile - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:58:24.491 INFO  IndexFeatureFile - Start Date/Time: February 15, 2022 9:58:24 PM GMT
21:58:24.491 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:24.491 INFO  IndexFeatureFile - ------------------------------------------------------------
21:58:24.492 INFO  IndexFeatureFile - HTSJDK Version: 2.15.1
21:58:24.492 INFO  IndexFeatureFile - Picard Version: 2.18.2
21:58:24.492 INFO  IndexFeatureFile - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:58:24.492 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:58:24.492 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:58:24.492 INFO  IndexFeatureFile - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:58:24.492 INFO  IndexFeatureFile - Deflater: IntelDeflater
21:58:24.492 INFO  IndexFeatureFile - Inflater: IntelInflater
21:58:24.492 INFO  IndexFeatureFile - GCS max retries/reopens: 20
21:58:24.492 INFO  IndexFeatureFile - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:58:24.492 INFO  IndexFeatureFile - Initializing engine
21:58:24.492 INFO  IndexFeatureFile - Done initializing engine
21:58:24.493 INFO  IndexFeatureFile - Shutting down engine
[February 15, 2022 9:58:24 PM GMT] org.broadinstitute.hellbender.tools.IndexFeatureFile done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=819462144
***********************************************************************

A USER ERROR has occurred: Couldn't read file /tmp/combined.vcf.gz

***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:80: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 0 or (len(args) is 1 and (args[0] == "--help" or args[0] == "-h")):
/root/miniconda3/bin//gatk:117: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(args) is 1 and args[0] == "--list":
/root/miniconda3/bin//gatk:301: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "-q", "stat", gcsjar]) is 0:
/root/miniconda3/bin//gatk:305: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if call(["gsutil", "cp", jar, gcsjar]) is 0:
/root/miniconda3/bin//gatk:458: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(properties) is 0:
/root/miniconda3/bin//gatk:462: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if not len(filesToAdd) is 0:
Using GATK jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -jar /root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar GenotypeGVCFs -R /fasta_references/Homo_sapiens_assembly38.fasta -V /tmp/combined.vcf.gz -O /tmp/genotyped.vcf.gz
21:58:26.445 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/root/miniconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
21:58:26.520 INFO  GenotypeGVCFs - ------------------------------------------------------------
21:58:26.520 INFO  GenotypeGVCFs - The Genome Analysis Toolkit (GATK) v4.0.5.1
21:58:26.520 INFO  GenotypeGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
21:58:26.520 INFO  GenotypeGVCFs - Executing as root@0e764f05c628 on Linux v5.11.0-1027-aws amd64
21:58:26.520 INFO  GenotypeGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:58:26.521 INFO  GenotypeGVCFs - Start Date/Time: February 15, 2022 9:58:26 PM GMT
21:58:26.521 INFO  GenotypeGVCFs - ------------------------------------------------------------
21:58:26.521 INFO  GenotypeGVCFs - ------------------------------------------------------------
21:58:26.521 INFO  GenotypeGVCFs - HTSJDK Version: 2.15.1
21:58:26.521 INFO  GenotypeGVCFs - Picard Version: 2.18.2
21:58:26.521 INFO  GenotypeGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:58:26.521 INFO  GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:58:26.521 INFO  GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:58:26.522 INFO  GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:58:26.522 INFO  GenotypeGVCFs - Deflater: IntelDeflater
21:58:26.522 INFO  GenotypeGVCFs - Inflater: IntelInflater
21:58:26.522 INFO  GenotypeGVCFs - GCS max retries/reopens: 20
21:58:26.522 INFO  GenotypeGVCFs - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
21:58:26.522 INFO  GenotypeGVCFs - Initializing engine
21:58:26.858 INFO  GenotypeGVCFs - Shutting down engine
[February 15, 2022 9:58:26 PM GMT] org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFs done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=1225261056
***********************************************************************

A USER ERROR has occurred: Couldn't read file file:///tmp/combined.vcf.gz. Error was: It doesn't exist.

***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Trio has been join-genotyped. Time elapsed: 0.07 minutes.
Traceback (most recent call last):
  File "/trio_phaser.py", line 346, in <module>
    with gzip.open(temp_genotyped_name, "rt") as vcf:
  File "/usr/lib/python3.6/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/usr/lib/python3.6/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/genotyped.vcf.gz'

dmiller903 commented 2 years ago

As far as I can tell from the error GATK is outputting, the files you are using as input are not gVCF files. GATK reports that line 155503 has only 8 tab-separated fields where it expected 9. The 8 fields shown run from CHROM through INFO (the record appears to be cut off partway through INFO, at "Inbreedi"), so it is the FORMAT column and the per-sample genotype columns that seem to be missing. gVCF files carry per-sample genotype data, and their reference-block records include an END tag in the INFO column that tells GATK where each block ends.

trioPhaser relies on GATK’s CombineGVCFs and GenotypeGVCFs tools, both of which require gVCF input. Because indexing failed on all three parsed files, CombineGVCFs never wrote /tmp/combined.vcf.gz, GenotypeGVCFs then had nothing to read, and no /tmp/genotyped.vcf.gz was produced. That missing file is what triggers the FileNotFoundError at the end of the log.
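
A rough way to screen an input file before running the pipeline is sketched below. It is a heuristic under stated assumptions and not something trioPhaser ships: the function name and the max_records cutoff are made up for illustration, and it only checks that records have FORMAT and sample columns and that at least one record carries a gVCF marker (a <NON_REF> allele or an END= tag in INFO).

```python
import gzip


def looks_like_gvcf(path, max_records=1000):
    """Heuristic check: inspect the first records of a VCF for gVCF features."""
    opener = gzip.open if path.endswith(".gz") else open
    saw_marker = False
    checked = 0
    with opener(path, "rt") as vcf:
        for line in vcf:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            # CHROM..INFO are 8 columns; FORMAT plus one sample makes 10.
            if len(fields) < 10:
                return False
            alt, info = fields[4], fields[7]
            has_end = any(item.startswith("END=") for item in info.split(";"))
            if "<NON_REF>" in alt or has_end:
                saw_marker = True
            checked += 1
            if checked >= max_records:
                break
    return saw_marker
```

A sites-only record like the one in the error above (8 columns, no FORMAT) would make this return False on the first data line.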

dmiller903 commented 2 years ago

Haven’t heard back, so I’m assuming this was in fact an input issue. gVCF files must be used as input.
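
For anyone reading a log like the one above: the first failure (indexing) was followed by several more steps that could not succeed, so the final traceback pointed at a symptom and not the cause. A minimal sketch of a wrapper that stops at the first failed step is shown below. The run_step helper is hypothetical and is not how trio_phaser.py is written; it assumes a failed tool either exits with a non-zero status or leaves its output file missing.

```python
import os
import subprocess


def run_step(cmd, expected_output):
    """Run one pipeline command; stop if it fails or writes no output file."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        raise RuntimeError(
            "step failed (exit {}): {}".format(result.returncode, " ".join(cmd))
        )
    if not os.path.exists(expected_output):
        raise FileNotFoundError(
            "step finished but {} was not created".format(expected_output)
        )
    return expected_output
```

With a guard like this around each external command, a run on non-gVCF input would halt at the first indexing error and never reach the later steps.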