icbi-lab / nextNEOpi

nextNEOpi: a comprehensive pipeline for computational neoantigen prediction

CNNscoreVariant terminated with an error exit status (2) #46

Closed rashidma closed 10 months ago

rashidma commented 10 months ago

I have installed the latest version of nextNEOpi from GitHub, along with all its requirements. I am using Ubuntu 18.04.

  1. nextflow -version
     N E X T F L O W
     version 23.04.3 build 5875
     created 11-08-2023 18:37 UTC (21:37 ADT)
     cite doi:10.1038/nbt.3820
     http://nextflow.io/
  2. java -version
     openjdk version "11.0.19" 2023-04-18
     OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu118.04.1)
     OpenJDK 64-Bit Server VM (build 11.0.19+7-post-Ubuntu-0ubuntu118.04.1, mixed mode, sharing)
  3. singularity --version
     singularity-ce version 3.11.3

I got the error below, which I think is related to TensorFlow. Please help me troubleshoot.

Error executing process > 'CNNScoreVariants (test)'

Caused by: Process CNNScoreVariants (test) terminated with an error exit status (2)

Command executed:

mkdir -p /tmp/mamoon/nextNEOpi

gatk CNNScoreVariants \
    --tmp-dir /tmp/mamoon/nextNEOpi \
    -R GRCh38.d1.vd1.fa \
    -I test_normal_DNA_recalibrated.bam \
    -V test_germline_0007-scattered.interval_list.vcf.gz \
    -tensor-type read_tensor \
    --inter-op-threads 2 \
    --intra-op-threads 2 \
    --transfer-batch-size 256 \
    --inference-batch-size 128 \
    -O test_germline_0007-scattered.interval_list.vcf_CNNScored.vcf.gz

Command exit status: 2

Command output: (empty)

Command error:
10:22:37.702 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/gatk/gatk-package-4.4.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
10:22:37.832 INFO CNNScoreVariants - ------------------------------------------------------------
10:22:37.874 INFO CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.4.0.0
10:22:37.874 INFO CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
10:22:37.875 INFO CNNScoreVariants - Executing as mambauser@mamoon-T7500 on Linux v5.4.0-105-generic amd64
10:22:37.875 INFO CNNScoreVariants - Java runtime: OpenJDK 64-Bit Server VM v17.0.7+7-Debian-1deb11u1
10:22:37.875 INFO CNNScoreVariants - Start Date/Time: September 18, 2023 at 10:22:37 AM UTC
10:22:37.875 INFO CNNScoreVariants - ------------------------------------------------------------
10:22:37.875 INFO CNNScoreVariants - ------------------------------------------------------------
10:22:37.876 INFO CNNScoreVariants - HTSJDK Version: 3.0.5
10:22:37.877 INFO CNNScoreVariants - Picard Version: 3.0.0
10:22:37.877 INFO CNNScoreVariants - Built for Spark Version: 3.3.1
10:22:37.877 INFO CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:22:37.877 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:22:37.878 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:22:37.878 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:22:37.878 INFO CNNScoreVariants - Deflater: IntelDeflater
10:22:37.878 INFO CNNScoreVariants - Inflater: IntelInflater
10:22:37.879 INFO CNNScoreVariants - GCS max retries/reopens: 20
10:22:37.879 INFO CNNScoreVariants - Requester pays: disabled
10:22:37.879 INFO CNNScoreVariants - Initializing engine
10:22:38.427 INFO FeatureManager - Using codec VCFCodec to read file file://test_germline_0007-scattered.interval_list.vcf.gz
10:22:38.444 WARN IntelInflater - Zero Bytes Written : 0
10:22:38.496 WARN IntelInflater - Zero Bytes Written : 0
10:22:38.565 INFO CNNScoreVariants - Done initializing engine
10:22:38.566 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/opt/gatk/gatk-package-4.4.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
10:22:38.646 INFO CNNScoreVariants - Done scoring variants with CNN.
10:22:38.646 INFO CNNScoreVariants - Shutting down engine
[September 18, 2023 at 10:22:38 AM UTC] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=260046848


A USER ERROR has occurred: This tool requires AVX instruction set support by default due to its dependency on recent versions of the TensorFlow library. If you have an older (pre-1.6) version of TensorFlow installed that does not require AVX you may attempt to re-run the tool with the disable-avx-check argument to bypass this check. Note that such configurations are not officially supported.


Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Using GATK jar /opt/gatk/gatk-package-4.4.0.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /opt/gatk/gatk-package-4.4.0.0-local.jar CNNScoreVariants --tmp-dir /tmp/mamoon/nextNEOpi -R GRCh38.d1.vd1.fa -I test_normal_DNA_recalibrated.bam -V test_germline_0007-scattered.interval_list.vcf.gz -tensor-type read_tensor --inter-op-threads 2 --intra-op-threads 2 --transfer-batch-size 256 --inference-batch-size 128 -O test_germline_0007-scattered.interval_list.vcf_CNNScored.vcf.gz

Thank you for your effort

riederd commented 10 months ago

Hi,

it seems that the CPU you are using does not have the AVX instruction set required by GATK CNNScoreVariants. Which type of CPU is this?

What is the output of: cat /proc/cpuinfo | grep "model name" and cat /proc/cpuinfo | grep avx
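The two checks above can be wrapped into a small script. This is only a sketch: the function names `cpu_models` and `has_cpu_flag` and the optional file argument are illustrative assumptions, not part of nextNEOpi or GATK. It summarises the CPU model(s) and tests for `avx` as a whole word, so that `avx2` alone does not count as a match.

```shell
#!/usr/bin/env bash
# Sketch: report the CPU model(s) and whether an instruction-set flag is listed.
# Function names and the optional file argument are illustrative assumptions.

# Print each distinct CPU model once, prefixed by the number of logical cores reporting it.
cpu_models() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    grep 'model name' "$cpuinfo" 2>/dev/null | sort | uniq -c
}

# Succeed (exit status 0) if FLAG appears as a whole word in the first "flags" line.
has_cpu_flag() {
    local flag="$1"
    local cpuinfo="${2:-/proc/cpuinfo}"
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

cpu_models
if has_cpu_flag avx; then
    echo "avx flag found in /proc/cpuinfo"
else
    echo "avx flag NOT found in /proc/cpuinfo"
fi
```

On an HPC system it is probably worth running this on the node where the job actually executes, since a login node may have a different CPU.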

rashidma commented 10 months ago

Hello, yes, I think so. The information is below:

cat /proc/cpuinfo | grep "model name"
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz

cat /proc/cpuinfo | grep avx

No output. I need to run this pipeline as soon as possible. Is it available as a conda package or environment? Thanks

riederd commented 10 months ago

Unfortunately, your CPU type is not supported by GATK CNNScoreVariants since it lacks the AVX instruction set. The Intel(R) Xeon(R) CPU X5650 is about 13 years old; maybe you have access to a newer system with a more up-to-date CPU.

rashidma commented 10 months ago

Thank you for this information. What about the CPU below?
model name : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz

I have access to an HPC with 16 such CPUs.

Best regards

rashidma commented 10 months ago

I checked on the Intel website. The E5-2620 v4 was launched in 2016 and supports the AVX instruction set, as listed below:
Instruction Set Extensions: Intel® AVX2

Thanks

riederd commented 10 months ago

This model should work!