HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling

Clair3 docker numpy compatibility #313

Closed DNA-Dave closed 4 months ago

DNA-Dave commented 5 months ago

Hello,

I am trying to run Clair3 from the Docker image under Singularity. This has worked in the past with no issues, but today when I ran my script (unchanged since the last successful run some time ago), it failed with errors saying the NumPy version is incompatible. The error log is below. I hope this bug can be patched soon. Many thanks.

+ singularity pull docker://hkubal/clair3:latest
INFO:    Using cached SIF image
+ singularity exec -B /mnt/isilon/xing_lab/aspera/wud3/BDB clair3_latest.sif /opt/bin/run_clair3.sh --bam_fn=/mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332.bam --ref_fn=/mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --threads=18 --platform=ont --model_path=/mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/rerio/clair3_models/r1041_e82_400bps_sup_v500 --output=/mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output --vcf_fn=/mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/clair3.vcf
INFO:    Converting SIF file to temporary sandbox...
[INFO] CLAIR3 VERSION: v1.0.9
[INFO] BAM FILE PATH: /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332.bam
[INFO] REFERENCE FILE PATH: /mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
[INFO] MODEL PATH: /mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/rerio/clair3_models/r1041_e82_400bps_sup_v500
[INFO] OUTPUT FOLDER: /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output
[INFO] PLATFORM: ont
[INFO] THREADS: 18
[INFO] BED FILE PATH: EMPTY
[INFO] VCF FILE PATH: /mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/clair3.vcf
[INFO] CONTIGS: EMPTY
[INFO] CONDA PREFIX: /home/wud3/anaconda3/envs/main/envs/singularity-env
[INFO] SAMTOOLS PATH: samtools
[INFO] PYTHON PATH: python3
[INFO] PYPY PATH: pypy3
[INFO] PARALLEL PATH: parallel
[INFO] WHATSHAP PATH: whatshap
[INFO] LONGPHASE PATH: EMPTY
[INFO] CHUNK SIZE: 5000000
[INFO] FULL ALIGN PROPORTION: 0.7
[INFO] FULL ALIGN REFERENCE PROPORTION: 0.1
[INFO] PHASING PROPORTION: 0.7
[INFO] MINIMUM MQ: 5
[INFO] MINIMUM COVERAGE: 2
[INFO] SNP AF THRESHOLD: 0.08
[INFO] INDEL AF THRESHOLD: 0.15
[INFO] BASE ERROR IN GVCF: 0.001
[INFO] GQ BIN SIZE IN GVCF: 5
[INFO] ENABLE FILEUP ONLY CALLING: False
[INFO] ENABLE FAST MODE CALLING: False
[INFO] ENABLE CALLING SNP CANDIDATES ONLY: False
[INFO] ENABLE PRINTING REFERENCE CALLS: False
[INFO] ENABLE OUTPUT GVCF: False
[INFO] ENABLE HAPLOID PRECISE MODE: False
[INFO] ENABLE HAPLOID SENSITIVE MODE: False
[INFO] ENABLE INCLUDE ALL CTGS CALLING: False
[INFO] ENABLE NO PHASING FOR FULL ALIGNMENT: False
[INFO] ENABLE REMOVING INTERMEDIATE FILES: False
[INFO] ENABLE LONGPHASE FOR INTERMEDIATE VCF PHASING: False
[INFO] ENABLE PHASING FINAL VCF OUTPUT USING WHATSHAP: False
[INFO] ENABLE PHASING FINAL VCF OUTPUT USING LONGPHASE: False
[INFO] ENABLE HAPLOTAGGING FINAL BAM: False
[INFO] ENABLE LONG INDEL CALLING: False
[INFO] ENABLE C_IMPLEMENT: True

+ /opt/bin/scripts/clair3_c_impl.sh --bam_fn /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332.bam --ref_fn /mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --threads 18 --model_path /mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/rerio/clair3_models/r1041_e82_400bps_sup_v500 --platform ont --output /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output --bed_fn=EMPTY --vcf_fn=/mnt/isilon/xing_lab/aspera/wud3/BDB/clair3/inputs/clair3.vcf --ctg_name=EMPTY --sample_name=SAMPLE --chunk_num=0 --chunk_size=5000000 --samtools=samtools --python=python3 --pypy=pypy3 --parallel=parallel --whatshap=whatshap --qual=2 --var_pct_full=0.7 --ref_pct_full=0.1 --var_pct_phasing=0.7 --snp_min_af=0.0 --indel_min_af=0.0 --min_mq=5 --min_coverage=2 --min_contig_size=0 --pileup_only=False --gvcf=False --base_err=0.001 --gq_bin_size=5 --fast_mode=False --call_snp_only=False --print_ref_calls=False --haploid_precise=False --haploid_sensitive=False --include_all_ctgs=False --no_phasing_for_fa=False --pileup_model_prefix=pileup --fa_model_prefix=full_alignment --remove_intermediate_dir=False --enable_phasing=False --enable_long_indel=False --keep_iupac_bases=False --use_gpu=False --longphase_for_phasing=False --longphase=EMPTY --use_whatshap_for_intermediate_phasing=True --use_longphase_for_intermediate_phasing=False --use_whatshap_for_final_output_phasing=False --use_longphase_for_final_output_phasing=False --use_whatshap_for_final_output_haplotagging=False

[INFO] Check environment variables
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/log
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/split_beds
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/pileup_output
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/merge_output
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/phase_output
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/gvcf_tmp_output
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/full_alignment_output
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/phase_output/phase_vcf
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/phase_output/phase_bam
[INFO] Create folder /mnt/isilon/xing_lab/aspera/wud3/BDB/BDB_91_cohort/clair3/BDB_PBMC/1017332/output/tmp/full_alignment_output/candidate_bed
[INFO] --include_all_ctgs not enabled, use chr{1..22,X,Y} and {1..22,X,Y} by default
[INFO] Call variant in contigs: chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22 chrX chrY
[INFO] Chunk number for each contig: 50 49 40 39 37 35 32 30 28 27 28 27 23 22 21 19 17 17 12 13 10 11 32 12
[INFO] 1/7 Call variants using pileup model

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/opt/bin/scripts/../clair3.py", line 105, in <module>
    main()
  File "/opt/bin/scripts/../clair3.py", line 92, in main
    submodule = import_module("%s.%s" % (directory, submodule_name))
  File "/opt/conda/envs/clair3/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/opt/bin/clair3/CallVariantsFromCffi.py", line 3, in <module>
    import tensorflow as tf
  File "/opt/conda/envs/clair3/lib/python3.9/site-packages/tensorflow/__init__.py", line 37, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/opt/conda/envs/clair3/lib/python3.9/site-packages/tensorflow/python/__init__.py", line 37, in <module>
    from tensorflow.python.eager import context
  File "/opt/conda/envs/clair3/lib/python3.9/site-packages/tensorflow/python/eager/context.py", line 35, in <module>
    from tensorflow.python.client import pywrap_tf_session
  File "/opt/conda/envs/clair3/lib/python3.9/site-packages/tensorflow/python/client/pywrap_tf_session.py", line 19, in <module>
    from tensorflow.python.client._pywrap_tf_session import *
AttributeError: _ARRAY_API not found

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

aquaskyline commented 5 months ago

We tried on a few machines on our side but could not reproduce the problem. We would appreciate any further hints you can provide. In the meantime, to get it running on your side, you could try downgrading NumPy in the Docker image with pip install numpy==1.24.3.

DNA-Dave commented 4 months ago

@aquaskyline It seems that running pip install numpy==1.24.3 fixed the issue. Thanks!
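
For anyone hitting the same error and wanting to give the maintainers more hints, the following is a minimal diagnostic sketch, not part of Clair3. It reports which NumPy the container's Python would import and from where, without importing NumPy itself. The function names `numpy_report` and `is_numpy2` and the file name `check_numpy.py` are made up for illustration. Checking the user site-packages directory rests on an assumption that a NumPy 2.x installed in the bound home directory could shadow the image's copy; nothing in this thread confirms that as the cause.

```python
import importlib.util
import site
import sys
from importlib import metadata


def numpy_report():
    """Collect details about the NumPy that would be imported, without importing it."""
    spec = importlib.util.find_spec("numpy")
    try:
        version = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        version = None
    return {
        "python": sys.executable,
        "numpy_version": version,
        "numpy_location": spec.origin if spec else None,
        # Assumption: a user-level install could take precedence over the image's NumPy.
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site_dir": site.getusersitepackages(),
    }


def is_numpy2(version):
    """Return True when the version string has a major component of 2 or higher."""
    return version is not None and int(version.split(".")[0]) >= 2


if __name__ == "__main__":
    report = numpy_report()
    for key, value in report.items():
        print(f"{key}: {value}")
    if is_numpy2(report["numpy_version"]):
        print("NumPy 2.x detected; modules built against NumPy 1.x may fail to import.")
```

Saved as the hypothetical `check_numpy.py`, it could be run with something like `singularity exec clair3_latest.sif python3 check_numpy.py`. If `numpy_location` points outside `/opt/conda/envs/clair3`, that would suggest the NumPy being picked up is not the one shipped in the image.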