google / deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
BSD 3-Clause "New" or "Revised" License

Installing deepvariant using Docker Desktop on a Mac (apple silicon - M1/M2) #657

Closed heznanda closed 1 year ago

heznanda commented 1 year ago

Goal: Installing deepvariant using Docker Desktop on a Mac (apple silicon - M1/M2).

I have been troubleshooting for days and trying to build from source, without success. I ended up installing Ubuntu 20.04 using UTM on the Mac, but I am still facing a lot of problems.

Is there a detailed step-by-step instruction on how to install on a mac (apple silicon)?

You mentioned: "It can likely be built and run on other unix-based systems with some minimal modifications to these scripts." (from https://github.com/google/deepvariant/blob/r1.5/docs/deepvariant-build-test.md). What are the "minimal modifications" here? Changing everything in build-prereq.sh, settings.sh, tools/build_clif.sh, and the other .sh files is proving to be a hard task.

Otherwise, I can try to explain the problems I am having with Ubuntu 20.04 under UTM on the Mac.

Thank you for your help!

pgrosu commented 1 year ago

Out of curiosity, if you are using Docker Desktop for Apple Silicon, have you tried the --platform flag using the regular docker image for DeepVariant 1.5 based on linux/amd64? The flag used with docker run would be like this:

docker run --platform linux/amd64 google/deepvariant:1.5.0
heznanda commented 1 year ago

@pgrosu Thank you for your reply. This is the error (which started the whole troubleshooting):

The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.
qemu: uncaught target signal 6 (Aborted) - core dumped
/opt/deepvariant/bin/run_deepvariant: line 2:     8 Aborted                 python3 -u /opt/deepvariant/bin/run_deepvariant.py "$@"
pgrosu commented 1 year ago

@heznanda Try the following to see if it fixes that requirement:

1) Install qemu and colima like this -- colima will temporarily change the Docker runtime context to colima (see below):

brew install qemu
brew install colima

2) Enable colima to be your runtime for Docker via colima start like this with this configuration:

colima start --arch x86_64 --cpu 2 --memory 2 --disk 12 --cpu-type Broadwell-v4

The --memory and --disk values are in gigabytes (--cpu is the number of CPU cores), so adjust them to what you have available on your machine.

3) Now try running the docker container for DeepVariant again.

4) After you are done, then run colima stop to change Docker back to its default configurations.
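As a small sketch of step 2, the colima command can be assembled from the resources you want to give the VM. The helper name colima_start_cmd is made up for illustration; the flags are the ones shown above, and the function only prints the command so that you can review it before running it:

```shell
#!/bin/bash
# colima_start_cmd is a hypothetical helper (not part of colima).
# It prints the "colima start" command from step 2 with adjustable sizes.
# Arguments: CPU count, memory in GB, disk in GB (defaults match step 2).
colima_start_cmd() {
  local cpus="${1:-2}"
  local memory_gb="${2:-2}"
  local disk_gb="${3:-12}"
  echo "colima start --arch x86_64 --cpu ${cpus} --memory ${memory_gb} --disk ${disk_gb} --cpu-type Broadwell-v4"
}

# Print the command for 4 CPUs, 8 GB of memory and a 20 GB disk.
colima_start_cmd 4 8 20
```

Once the printed command looks right, it can be pasted into the terminal, followed by the docker run from step 3 and colima stop from step 4.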

heznanda commented 1 year ago

@pgrosu Thank you for the reply! I wouldn't have thought to install qemu and colima before.

I tried your instructions and added brew install docker as well.

Successfully started colima.

> colima start --arch x86_64 --cpu 4 --memory 8 --disk 20 --cpu-type Broadwell-v4
INFO[0000] starting colima                              
INFO[0000] runtime: docker                              
INFO[0000] preparing network ...                         context=vm
INFO[0000] starting ...                                  context=vm
INFO[0073] provisioning ...                              context=docker
INFO[0074] starting ...                                  context=docker
INFO[0092] done            

The command docker run --platform linux/amd64 google/deepvariant:1.5.0 does not seem to complete. On the first run it printed Unable to find image 'google/deepvariant:1.5.0' locally and 1.5.0: Pulling from google/deepvariant; after the pull complete messages and Status: Downloaded newer image for google/deepvariant:1.5.0, it got stuck:

2023-06-04 16:15:06.784293: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

and finally gave ERRO[1898] error waiting for container:. I then stopped colima and reran docker run --platform linux/amd64 google/deepvariant:1.5.0. It has been stuck like this for over 30 minutes with the same message:

2023-06-04 16:36:19.453896: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

ERRO[3173] error waiting for container:

Do you have any suggestions?

heznanda commented 1 year ago

Actually, those messages might not be warnings: "The above is not a warning and is just a point of information." (https://discuss.tensorflow.org/t/tensorflow-with-proper-compiler-flag-error-message/12393/3)

Now I will try with the test quickstart run.

pgrosu commented 1 year ago

The reason it is taking so long is the emulation component in qemu. Docker Desktop performs the qemu emulation of those instructions when you run it with --platform. The reason I wanted to try colima is that it will use the qemu installed on the Mac rather than the one shipped with Docker. The newest qemu has the AVX instructions.
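One way to see which instructions the emulated CPU actually exposes is to look at the flags line of /proc/cpuinfo from inside the Linux guest or container. The following is a minimal sketch; cpu_has_flag is a hypothetical helper name, and the flag names checked are the ones TensorFlow mentions in the messages above:

```shell
#!/bin/bash
# cpu_has_flag is a hypothetical helper: it succeeds if FLAG appears as a
# whole word on a "flags" line of a cpuinfo-style file.
# Arguments: FLAG, optional file (default: /proc/cpuinfo).
cpu_has_flag() {
  local flag="$1"
  local cpuinfo="${2:-/proc/cpuinfo}"
  # -w matches whole words only, so "avx" does not match "avx2".
  grep -E '^flags[[:space:]]*:' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

# Report the flags that the TensorFlow messages refer to.
for flag in avx avx2 fma; do
  if cpu_has_flag "$flag"; then
    echo "$flag: available"
  else
    echo "$flag: missing"
  fi
done
```

This only works on an x86_64 Linux system (real or emulated); on macOS itself there is no /proc/cpuinfo, so every flag is reported as missing.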

How come you don't want to run it on the Cloud directly? DeepVariant is suboptimal on a laptop with large datasets.

Digging through my notes, I noticed I had to write my own customizations to build DeepVariant from scratch for non-AVX platforms, but that's TensorFlow-specific. It's a lot of work. What specific issues are you seeing under Ubuntu when you're trying to install it?

heznanda commented 1 year ago

@pgrosu Thank you for your response. When you say "the cloud", do you mean running it on a server or a supercomputer? I expect that I would need root access (sudo), which I don't have.

  1. Is there a way to do this? Docker is not even installed there.

  2. So are you suggesting that we should not use mac (apple silicon)?

The ubuntu 20.04 that was installed on my mac (using UTM), should this work?

Thank you for your time and guidance!

pgrosu commented 1 year ago

1) Cloud has Docker and sudo access, but you have to pay for it for the time you use it (it's basically VM created instances with storage that is pay-per-usage):

https://cloud.google.com/life-sciences/docs/tutorials/deepvariant

2) If you have a cluster that you have access to, that would be perfect. You can use Singularity, or, if you don't have Docker, there are ways to run Docker in some limited way without root access.

3) Regarding Ubuntu, tell me what you see when you run the following script under your DeepVariant folder (the one git cloned):

#!/bin/bash
source settings.sh
./run-prereq.sh
heznanda commented 1 year ago

@pgrosu From the choices above, I want to pursue (2) singularity the most.

  1. Is there an instruction that you can point me to for this?

  2. I want to show the messages I got from running Ubuntu 20.04 on Mac's UTM. settings.sh has been modified:

    export CUDNN_INSTALL_PATH="/usr/lib/aarch64-linux-gnu" #instead of /usr/lib/x86_64-linux-gnu
    # orig: export DV_USE_GCP_OPTIMIZED_TF_WHL="${DV_USE_GCP_OPTIMIZED_TF_WHL:-1}"
    export DV_USE_GCP_OPTIMIZED_TF_WHL="0" #force using CPU only tensorflow
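Instead of hard-coding the path, the library directory could be derived from the machine architecture. This is only a sketch of that idea and is not part of settings.sh; multiarch_lib_dir is a hypothetical helper, and the two paths are the ones from the modified line above:

```shell
#!/bin/bash
# multiarch_lib_dir is a hypothetical helper, not part of settings.sh.
# It maps a machine name (as printed by `uname -m`) to the matching
# library directory used in the settings.sh line above.
multiarch_lib_dir() {
  case "$1" in
    x86_64)        echo "/usr/lib/x86_64-linux-gnu" ;;
    aarch64|arm64) echo "/usr/lib/aarch64-linux-gnu" ;;
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Export the path only if the architecture was recognised.
CUDNN_INSTALL_PATH="$(multiarch_lib_dir "$(uname -m)")" && export CUDNN_INSTALL_PATH
echo "CUDNN_INSTALL_PATH=${CUDNN_INSTALL_PATH}"
```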

The message when running:

> sudo su
> source settings.sh
> ./run-prereq.sh
========== This script is only maintained for Ubuntu 20.04.
========== Load config settings.
========== [Sun 04 Jun 2023 11:11:08 PM UTC] Stage 'Misc setup' starting
========== [Sun 04 Jun 2023 11:11:09 PM UTC] Stage 'Update package list' starting
========== [Sun 04 Jun 2023 11:11:10 PM UTC] Stage 'run-prereq.sh: Install development packages' starting
Calling wait_for_dpkg_lock.
========== [Sun 04 Jun 2023 11:11:10 PM UTC] Stage 'Install python3 packaging infrastructure' starting
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2518k  100 2518k    0     0  5671k      0 --:--:-- --:--:-- --:--:-- 5671k
Collecting pip
  Using cached pip-23.1.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.1.2
    Uninstalling pip-23.1.2:
      Successfully uninstalled pip-23.1.2
  WARNING: The scripts pip, pip3 and pip3.8 are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.1.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Python 3.8.10
pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
========== [Sun 04 Jun 2023 11:11:13 PM UTC] Stage 'Install python3 packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorboard 2.11.2 requires protobuf<4,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
========== [Sun 04 Jun 2023 11:11:24 PM UTC] Stage 'Install TensorFlow pip package' starting
Installing standard CPU-only TensorFlow 2.11.0 wheel
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
werkzeug 2.3.4 requires MarkupSafe>=2.1.1, but you have markupsafe 2.0.1 which is incompatible.
========== [Sun 04 Jun 2023 11:11:26 PM UTC] Stage 'Install CUDA' starting
========== [Sun 04 Jun 2023 11:11:26 PM UTC] Stage 'Install TensorRT' starting
========== [Sun 04 Jun 2023 11:11:26 PM UTC] Stage 'Install other packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-api-core 2.11.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
googleapis-common-protos 1.59.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
========== [Sun 04 Jun 2023 11:11:27 PM UTC] Stage 'run-prereq.sh complete' starting

The last time I tried running ./build-prereq.sh, I got an error while building CLIF: llvm-11-linker-tools not available. Now the error is:

+ echo -n 'Using Python interpreter: /usr/local/bin/python3'
Using Python interpreter: /usr/local/bin/python3+ [[ '' -eq 1 ]]
+ mkdir -p /root/clif/build
+ cd /root/clif/build
+ cmake -DPYTHON_EXECUTABLE=/usr/local/bin/python3 /root/clif
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1") 
-- Checking for module 'protobuf'
--   Found protobuf, version 3.13.0
-- Checking for module 'libglog'
--   Found libglog, version 0.4.0
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
CMake Error at clif/cmake/modules/CLIFUtils.cmake:37 (find_package):
  Could not find a configuration file for package "LLVM" that is compatible
  with requested version "11.1.0".

  The following configuration files were considered but not accepted:

    /usr/lib/llvm-11/lib/cmake/llvm/LLVMConfig.cmake, version: 11.0.0
    /usr/lib/llvm-11/cmake/LLVMConfig.cmake, version: 11.0.0
    /lib/llvm-11/cmake/LLVMConfig.cmake, version: 11.0.0

Call Stack (most recent call first):
  clif/CMakeLists.txt:22 (include)

-- Configuring incomplete, errors occurred!
See also "/root/clif/build/CMakeFiles/CMakeOutput.log".
See also "/root/clif/build/CMakeFiles/CMakeError.log".

real    2m44.183s
user    0m18.337s
sys 0m18.865s
pgrosu commented 1 year ago

This is very good!

For Singularity

You can take a look at the following two links:

https://github.com/google/deepvariant/blob/r1.5/docs/deeptrio-quick-start.md#notes-on-singularity

https://github.com/google/deepvariant/blob/r1.5/scripts/install_singularity.sh

For your Ubuntu instance

You are getting very close! To simplify the install in the run-prereq.sh file you can comment out (with the # symbol) the following sections:

1) For the "Install TensorFlow pip package" keep only the ones with CPU-only, and comment out the others.

2) For "Install CUDA", comment out everything.

3) For "Install TensorRT", comment out everything.

And then run it again. The rest of the errors in the run-prereq.sh are easy to fix, which we can do later individually by removing each one, and installing the minimum required version.
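To go through the remaining pip conflicts one at a time, it may help to reduce the log to the unique conflict lines first. A minimal sketch, assuming the output of ./run-prereq.sh was saved to a file; list_pip_conflicts is a hypothetical helper, and the sample lines are copied from the output earlier in this thread:

```shell
#!/bin/bash
# list_pip_conflicts is a hypothetical helper: it prints each unique
# "X requires Y, but you have Z which is incompatible." line from a log.
list_pip_conflicts() {
  grep -E ' requires .+, but you have .+ which is incompatible\.' "$1" | sort -u
}

# Demonstration on a few lines taken from the run-prereq.sh output above.
log="$(mktemp)"
cat > "$log" <<'EOF'
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
Installing standard CPU-only TensorFlow 2.11.0 wheel
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
werkzeug 2.3.4 requires MarkupSafe>=2.1.1, but you have markupsafe 2.0.1 which is incompatible.
EOF
list_pip_conflicts "$log"
rm -f "$log"
```

For a real run, something like ./run-prereq.sh 2>&1 | tee run-prereq.log followed by list_pip_conflicts run-prereq.log would give the list to work through.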

Before we fix clif, could you tell me what you get for the following:

lsb_release -sc

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key |  apt-key add - 

add-apt-repository "deb http://apt.llvm.org/$(lsb_release -sc)/ llvm-toolchain-$(lsb_release -sc)-11 main"

sudo apt-get update

sudo apt-get install -y llvm-11 llvm-11-dev clang-11 llvm-11-tools

You might have a mismatch of a previous version of clif or its installed configuration files. You can check that via the following commands:

llvm-config-11 --version

The configs might be an older version, which you can check via the following:

cat /usr/lib/llvm-11/lib/cmake/llvm/LLVMConfig.cmake | grep PACKAGE_VERSION

Below is what I have:

$ cat /usr/lib/llvm-11/lib/cmake/llvm/LLVMConfig.cmake | grep PACKAGE_VERSION
set(LLVM_PACKAGE_VERSION 11.1.0)
$ cat /usr/lib/llvm-11/cmake/LLVMConfig.cmake | grep PACKAGE_VERSION
set(LLVM_PACKAGE_VERSION 11.1.0)
$ cat /lib/llvm-11/cmake/LLVMConfig.cmake | grep PACKAGE_VERSION
cat: /lib/llvm-11/cmake/LLVMConfig.cmake: No such file or directory
$
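The same check can be scripted so that every config file is compared against the version the CLIF build asked for. This is a rough sketch; llvm_cmake_version is a hypothetical helper, and the paths and the 11.1.0 value are taken from the CMake error earlier in the thread:

```shell
#!/bin/bash
# llvm_cmake_version is a hypothetical helper: it prints the version recorded
# in an LLVMConfig.cmake file, e.g. "11.0.0" from the line
# "set(LLVM_PACKAGE_VERSION 11.0.0)".
llvm_cmake_version() {
  sed -n 's/^set(LLVM_PACKAGE_VERSION \([0-9.]*\)).*/\1/p' "$1"
}

wanted="11.1.0"   # the version requested in the CMake error message
for cfg in /usr/lib/llvm-11/lib/cmake/llvm/LLVMConfig.cmake \
           /usr/lib/llvm-11/cmake/LLVMConfig.cmake; do
  # Skip config files that are not installed on this machine.
  [ -f "$cfg" ] || continue
  found="$(llvm_cmake_version "$cfg")"
  if [ "$found" = "$wanted" ]; then
    echo "OK       $cfg ($found)"
  else
    echo "MISMATCH $cfg (found $found, wanted $wanted)"
  fi
done
```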

If you have a mismatch between the version and config, you can make a clean removal of llvm via the following:

sudo apt-get remove llvm-11*

The way clif is installed is via the following:

#!/bin/bash
source settings.sh
sudo tools/build_clif.sh

Let me know what you see.

Thanks, ~p

heznanda commented 1 year ago

@pgrosu Thank you for your guidance!! So this is what I did for the sections you mentioned:

# note_build_stage "Install TensorFlow pip package"

# if [[ "${DV_USE_PREINSTALLED_TF}" = "1" ]]; then
#   echo "Skipping TensorFlow installation at user request; will use pre-installed TensorFlow."
# else
#   # Also pip install the latest TensorFlow with cpu support. We don't build the
#   # full TF from source, but instead using prebuilt version. However, we still
#   # need the full source version to build DeepVariant.

#   # Gets the nightly TF build: https://pypi.python.org/pypi/tf-nightly which is
#   # necessary right now if we aren't pinning the TF source. We have observed
#   # runtime failures if there's too much skew between the released TF package and
#   # the source.
#   if [[ "${DV_TF_NIGHTLY_BUILD}" = "1" ]]; then
#     if [[ "${DV_GPU_BUILD}" = "1" ]]; then
#       echo "Installing GPU-enabled TensorFlow nightly wheel"
#       pip3 install "${PIP_ARGS[@]}" --upgrade tf_nightly_gpu
#     else
#       echo "Installing CPU-only TensorFlow nightly wheel"
#       pip3 install "${PIP_ARGS[@]}" --upgrade tf_nightly
#     fi
#   else
#     # Use the official TF release pip package.
#     if [[ "${DV_GPU_BUILD}" = "1" ]]; then
#       echo "Installing GPU-enabled TensorFlow ${DV_TENSORFLOW_STANDARD_GPU_WHL_VERSION} wheel"
#       pip3 install "${PIP_ARGS[@]}" --upgrade "tensorflow-gpu==${DV_TENSORFLOW_STANDARD_GPU_WHL_VERSION}"
#     elif [[ "${DV_USE_GCP_OPTIMIZED_TF_WHL}" = "1" ]]; then
#       echo "Installing Intel's CPU-only MKL TensorFlow ${DV_GCP_OPTIMIZED_TF_WHL_VERSION} wheel"
#       pip3 install "${PIP_ARGS[@]}" --upgrade "intel-tensorflow==${DV_GCP_OPTIMIZED_TF_WHL_VERSION}"
#     else
      echo "Installing standard CPU-only TensorFlow ${DV_TENSORFLOW_STANDARD_CPU_WHL_VERSION} wheel"
      pip3 install "${PIP_ARGS[@]}" --upgrade "tensorflow==${DV_TENSORFLOW_STANDARD_CPU_WHL_VERSION}"
#     fi
#   fi
# fi

# # A temporary fix.
# # Context: intel-tensorflow 2.7.0 will end up updating markupsafe to 2.1.1,
# # which caused the issue here: https://github.com/pallets/markupsafe/issues/286.
# # Specifically:
# # ImportError: cannot import name 'soft_unicode' from 'markupsafe'.
# # So, forcing a downgrade. This isn't the best solution, but we need it to get
# # our tests pass.
pip3 install "${PIP_ARGS[@]}" --upgrade 'markupsafe==2.0.1'

# ################################################################################
# # CUDA
# ################################################################################

# note_build_stage "Install CUDA"

# # See https://www.tensorflow.org/install/source#gpu for versions required.
# if [[ "${DV_GPU_BUILD}" = "1" ]]; then
#   if [[ "${DV_INSTALL_GPU_DRIVERS}" = "1" ]]; then
#     # This script is only maintained for Ubuntu 20.04.
#     UBUNTU_VERSION="2004"
#     # https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=deb_local
#     echo "Checking for CUDA..."
#     if ! dpkg-query -W cuda-11-3; then
#       echo "Installing CUDA..."
#       UBUNTU_VERSION="2004"
#       curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
#       sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
#       # From https://forums.developer.nvidia.com/t/notice-cuda-linux-repository-key-rotation/212772
#       sudo -H apt-key adv --fetch-keys "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/x86_64/3bf863cc.pub"
#       sudo add-apt-repository -y "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
#       sudo -H apt-get update "${APT_ARGS[@]}"
#       # From: https://superuser.com/a/1638789
#       sudo -H DEBIAN_FRONTEND=noninteractive apt-get \
#         -o Dpkg::Options::=--force-confold \
#         -o Dpkg::Options::=--force-confdef \
#         -y --allow-downgrades --allow-remove-essential --allow-change-held-packages \
#         full-upgrade
#       sudo -H apt-get install "${APT_ARGS[@]}" cuda-11-3
#     fi
#     echo "Checking for CUDNN..."
#     if [[ ! -e /usr/local/cuda-11/include/cudnn.h ]]; then
#       echo "Installing CUDNN..."
#       CUDNN_TAR_FILE="cudnn-11.3-linux-x64-v8.2.0.53.tgz"
#       wget -q https://developer.download.nvidia.com/compute/redist/cudnn/v8.2.0/${CUDNN_TAR_FILE}
#       tar -xzvf ${CUDNN_TAR_FILE}
#       sudo cp -P cuda/include/cudnn.h /usr/local/cuda-11/include
#       sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-11/lib64/
#       sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-11/lib64/
#       sudo chmod a+r /usr/local/cuda-11/lib64/libcudnn*
#       sudo ldconfig
#     fi
#     # Tensorflow says to do this.
#     sudo -H apt-get install "${APT_ARGS[@]}" libcupti-dev > /dev/null
#   fi

#   # If we are doing a gpu-build, nvidia-smi should be install. Run it so we
#   # can see what gpu is installed.
#   nvidia-smi || :
# fi

# ################################################################################
# # TensorRT
# ################################################################################

# note_build_stage "Install TensorRT"

# # Address the issue:
# # 'dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory'
# # It's unclear whether we need this or not. Setting up to get rid of the errors.
# if [[ "${DV_GPU_BUILD}" = "1" ]]; then
#   pip3 install "${PIP_ARGS[@]}" nvidia-tensorrt
#   echo "For debugging:"
#   pip3 show nvidia-tensorrt
#   TENSORRT_PATH=$(python3 -c 'import tensorrt; print(tensorrt.__path__[0])')
#   sudo ln -sf "${TENSORRT_PATH}/libnvinfer.so.8" "${TENSORRT_PATH}/libnvinfer.so.7"
#   sudo ln -sf "${TENSORRT_PATH}/libnvinfer_plugin.so.8" "${TENSORRT_PATH}/libnvinfer_plugin.so.7"
#   export LD_LIBRARY_PATH="${LD_LIBRARY_PATH-}:${TENSORRT_PATH}"
#   sudo ldconfig
#   # Just in case this still doesn't work, we link them.
#   # This is a workaround that we might want to get rid of, if we can make sure
#   # setting LD_LIBRARY_PATH and `sudo ldconfig`` works.
#   if [[ ! -e /usr/local/nvidia/lib ]]; then
#     sudo mkdir -p /usr/local/nvidia/lib
#     sudo ln -sf "${TENSORRT_PATH}//libnvinfer.so.7" /usr/local/nvidia/lib/libnvinfer.so.7
#     sudo ln -sf "${TENSORRT_PATH}//libnvinfer_plugin.so.7" /usr/local/nvidia/lib/libnvinfer_plugin.so.7
#   fi
# fi

The output for ./run-prereq.sh:

========== This script is only maintained for Ubuntu 20.04.
========== Load config settings.
========== [Mon 05 Jun 2023 01:42:32 AM UTC] Stage 'Misc setup' starting
========== [Mon 05 Jun 2023 01:42:33 AM UTC] Stage 'Update package list' starting
========== [Mon 05 Jun 2023 01:42:34 AM UTC] Stage 'run-prereq.sh: Install development packages' starting
Calling wait_for_dpkg_lock.
========== [Mon 05 Jun 2023 01:42:35 AM UTC] Stage 'Install python3 packaging infrastructure' starting
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2518k  100 2518k    0     0  5403k      0 --:--:-- --:--:-- --:--:-- 5403k
Collecting pip
  Using cached pip-23.1.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.1.2
    Uninstalling pip-23.1.2:
      Successfully uninstalled pip-23.1.2
  WARNING: The scripts pip, pip3 and pip3.8 are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.1.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Python 3.8.10
pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
========== [Mon 05 Jun 2023 01:42:38 AM UTC] Stage 'Install python3 packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorboard 2.11.2 requires protobuf<4,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
Installing standard CPU-only TensorFlow 2.11.0 wheel
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
werkzeug 2.3.4 requires MarkupSafe>=2.1.1, but you have markupsafe 2.0.1 which is incompatible.
========== [Mon 05 Jun 2023 01:42:50 AM UTC] Stage 'Install other packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-api-core 2.11.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
googleapis-common-protos 1.59.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
========== [Mon 05 Jun 2023 01:42:51 AM UTC] Stage 'run-prereq.sh complete' starting

For the other set of commands:

> lsb_release -sc
focal
> wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key |  apt-key add - 
--2023-06-05 01:38:40--  https://apt.llvm.org/llvm-snapshot.gpg.key
Resolving apt.llvm.org (apt.llvm.org)... 146.75.46.49
Connecting to apt.llvm.org (apt.llvm.org)|146.75.46.49|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3145 (3.1K) [application/octet-stream]
Saving to: ‘STDOUT’

-                   100%[===================>]   3.07K  --.-KB/s    in 0s      

2023-06-05 01:38:40 (48.1 MB/s) - written to stdout [3145/3145]

OK

> add-apt-repository "deb http://apt.llvm.org/$(lsb_release -sc)/ llvm-toolchain-$(lsb_release -sc)-11 main"
Hit:1 https://download.docker.com/linux/ubuntu focal InRelease
Hit:2 https://download.sublimetext.com apt/stable/ InRelease
Hit:3 http://ports.ubuntu.com/ubuntu-ports focal InRelease
Hit:4 http://ports.ubuntu.com/ubuntu-ports focal-updates InRelease
Hit:5 http://ports.ubuntu.com/ubuntu-ports focal-backports InRelease
Hit:7 http://ports.ubuntu.com/ubuntu-ports focal-security InRelease
Hit:6 https://apt.llvm.org/focal llvm-toolchain-focal-11 InRelease
Reading package lists... Done                
root@m1ubuntu:/media/HostShared/deepvariant-r1.5# 
root@m1ubuntu:/media/HostShared/deepvariant-r1.5# sudo apt-get update
Hit:2 https://download.docker.com/linux/ubuntu focal InRelease                 
Hit:3 https://download.sublimetext.com apt/stable/ InRelease                   
Hit:4 http://ports.ubuntu.com/ubuntu-ports focal InRelease                     
Hit:1 https://apt.llvm.org/focal llvm-toolchain-focal-11 InRelease 
Hit:5 http://ports.ubuntu.com/ubuntu-ports focal-updates InRelease
Hit:6 http://ports.ubuntu.com/ubuntu-ports focal-backports InRelease
Hit:7 http://ports.ubuntu.com/ubuntu-ports focal-security InRelease
Reading package lists... Done

> sudo apt-get install -y llvm-11 llvm-11-dev clang-11 llvm-11-tools
Reading package lists... Done
Building dependency tree       
Reading state information... Done
clang-11 is already the newest version (1:11.0.0-2~ubuntu20.04.1).
llvm-11 is already the newest version (1:11.0.0-2~ubuntu20.04.1).
llvm-11-dev is already the newest version (1:11.0.0-2~ubuntu20.04.1).
llvm-11-tools is already the newest version (1:11.0.0-2~ubuntu20.04.1).
llvm-11-tools set to manually installed.
The following packages were automatically installed and are no longer required:
  docker-ce-rootless-extras slirp4netns
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
> sudo apt autoremove
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages will be REMOVED:
  docker-ce-rootless-extras slirp4netns
0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
After this operation, 19.2 MB disk space will be freed.
Do you want to continue? [Y/n] Y
(Reading database ... 177786 files and directories currently installed.)
Removing docker-ce-rootless-extras (5:24.0.2-1~ubuntu.20.04~focal) ...
Removing slirp4netns (0.4.3-1) ...
Processing triggers for man-db (2.9.1-1) ...

Extra commands output:

> llvm-config-11 --version
11.0.0
> cat /usr/lib/llvm-11/lib/cmake/llvm/LLVMConfig.cmake | grep PACKAGE_VERSION
set(LLVM_PACKAGE_VERSION 11.0.0)
> cat /usr/lib/llvm-11/cmake/LLVMConfig.cmake | grep PACKAGE_VERSION
set(LLVM_PACKAGE_VERSION 11.0.0)
> cat /lib/llvm-11/cmake/LLVMConfig.cmake | grep PACKAGE_VERSION
set(LLVM_PACKAGE_VERSION 11.0.0)

Finally the output from sudo tools/build_clif.sh installation of CLIF (still the same error I guess):

-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
CMake Error at clif/cmake/modules/CLIFUtils.cmake:37 (find_package):
  Could not find a configuration file for package "LLVM" that is compatible
  with requested version "11.1.0".

  The following configuration files were considered but not accepted:

    /usr/lib/llvm-11/lib/cmake/llvm/LLVMConfig.cmake, version: 11.0.0
    /usr/lib/llvm-11/cmake/LLVMConfig.cmake, version: 11.0.0
    /lib/llvm-11/cmake/LLVMConfig.cmake, version: 11.0.0

Call Stack (most recent call first):
  clif/CMakeLists.txt:22 (include)

-- Configuring incomplete, errors occurred!
See also "/root/clif/build/CMakeFiles/CMakeOutput.log".
See also "/root/clif/build/CMakeFiles/CMakeError.log".

Do you have any suggestions?

pgrosu commented 1 year ago

So in the tools/build_clif.sh script, add the following on the line right above ./INSTALL.sh:

sed -i -e 's/LLVM 11.1.0/LLVM 11.0.0/g' clif/cmake/modules/CLIFUtils.cmake

It should look like this:

if [[ ! -z ${CLIF_PIN} ]]; then
  git checkout "${CLIF_PIN}"
fi
sed -i -e 's/LLVM 11.1.0/LLVM 11.0.0/g' clif/cmake/modules/CLIFUtils.cmake
./INSTALL.sh

Then try running tools/build_clif.sh again.
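If you would rather not hard-code the installed version, the substitution can take it as a parameter. This is only a sketch of the same sed idea; pin_clif_llvm_version is a hypothetical helper and not something the DeepVariant scripts provide:

```shell
#!/bin/bash
# pin_clif_llvm_version is a hypothetical helper. It rewrites "LLVM <from>"
# to "LLVM <to>" in a CMake file, without relying on `sed -i` (whose syntax
# differs between GNU and BSD sed).
# Arguments: FILE, version to replace, version to use instead.
pin_clif_llvm_version() {
  local file="$1"
  local from="$2"
  local to="$3"
  local from_re
  local tmp
  # Escape the dots so that "11.1.0" is matched literally.
  from_re="$(printf '%s' "$from" | sed 's/\./\\./g')"
  tmp="$(mktemp)" || return 1
  # Write back with cat so the original file keeps its permissions.
  sed -e "s/LLVM ${from_re}/LLVM ${to}/g" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  local status=$?
  rm -f "$tmp"
  return "$status"
}

# Example, to be run inside the CLIF checkout (left commented out here):
# pin_clif_llvm_version clif/cmake/modules/CLIFUtils.cmake 11.1.0 "$(llvm-config-11 --version)"
```

Passing the output of llvm-config-11 --version keeps the requested version in step with whatever apt installed.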

heznanda commented 1 year ago

@pgrosu The tools/build_clif.sh was successful! Thank you!

After that I tried running this:

> sudo docker run google/deepvariant:"${BIN_VERSION}"
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
exec /opt/deepvariant/bin/run_deepvariant: exec format error
> sudo docker run --platform linux/amd64 google/deepvariant:"${BIN_VERSION}"
exec /opt/deepvariant/bin/run_deepvariant: exec format error
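The exec format error most likely means that the amd64 binary in the image cannot be executed on the arm64 guest, and that no emulation is set up for it inside this Ubuntu VM. A quick sketch to compare the host platform with the image platform before running; docker_platform_for is a hypothetical helper name:

```shell
#!/bin/bash
# docker_platform_for is a hypothetical helper: it maps `uname -m` output to
# the platform string that Docker prints in the warning above.
docker_platform_for() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *)             echo "unknown" ;;
  esac
}

host="$(docker_platform_for "$(uname -m)")"
image="linux/amd64"   # the platform the warning reports for the image
if [ "$host" = "$image" ]; then
  echo "host and image platforms match ($host)"
else
  echo "host is $host but image is $image: emulation is needed"
fi
```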

Sorry, but what should I run before this? Should I run the ./build-prereq.sh and ./build_and_test.sh? Should I uncomment the ./run-prereq.sh sections as well?

Actually, I am currently running the ./build-prereq.sh:

========== This script is only maintained for Ubuntu 20.04.
========== Load config settings.
========== [Mon 05 Jun 2023 03:51:21 PM UTC] Stage 'Install the runtime packages' starting
========== This script is only maintained for Ubuntu 20.04.
========== Load config settings.
========== [Mon 05 Jun 2023 03:51:21 PM UTC] Stage 'Misc setup' starting
========== [Mon 05 Jun 2023 03:51:23 PM UTC] Stage 'Update package list' starting
========== [Mon 05 Jun 2023 03:51:24 PM UTC] Stage 'run-prereq.sh: Install development packages' starting
Calling wait_for_dpkg_lock.
========== [Mon 05 Jun 2023 03:51:24 PM UTC] Stage 'Install python3 packaging infrastructure' starting
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2518k  100 2518k    0     0  8653k      0 --:--:-- --:--:-- --:--:-- 8623k
Collecting pip
  Using cached pip-23.1.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.1.2
    Uninstalling pip-23.1.2:
      Successfully uninstalled pip-23.1.2
  WARNING: The scripts pip, pip3 and pip3.8 are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.1.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Python 3.8.10
pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
========== [Mon 05 Jun 2023 03:51:27 PM UTC] Stage 'Install python3 packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
tensorboard 2.11.2 requires protobuf<4,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
Installing standard CPU-only TensorFlow 2.11.0 wheel
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
werkzeug 2.3.4 requires MarkupSafe>=2.1.1, but you have markupsafe 2.0.1 which is incompatible.
========== [Mon 05 Jun 2023 03:51:39 PM UTC] Stage 'Install other packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-api-core 2.11.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
googleapis-common-protos 1.59.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
========== [Mon 05 Jun 2023 03:51:40 PM UTC] Stage 'run-prereq.sh complete' starting
========== [Mon 05 Jun 2023 03:51:40 PM UTC] Stage 'Update package list' starting
========== [Mon 05 Jun 2023 03:51:41 PM UTC] Stage 'build-prereq.sh: Install development packages' starting
Calling wait_for_dpkg_lock.
========== [Mon 05 Jun 2023 03:51:42 PM UTC] Stage 'Install bazel' starting
/root/bin/bazel: line 220: /root/.bazel/bin/bazel-real: cannot execute binary file: Exec format error
/root/bin/bazel: line 220: /root/.bazel/bin/bazel-real: Success
~/bazel /media/HostShared/deepvariant-r1.5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 46.5M  100 46.5M    0     0  23.2M      0  0:00:02  0:00:02 --:--:-- 27.1M
/media/HostShared/deepvariant-r1.5
========== [Mon 05 Jun 2023 03:51:44 PM UTC] Stage 'Install CLIF binary' starting
CLIF already installed.
========== [Mon 05 Jun 2023 03:51:44 PM UTC] Stage 'Download and configure TensorFlow sources' starting
========== [Mon 05 Jun 2023 03:51:44 PM UTC] Stage 'Cloning TensorFlow from github as ../tensorflow doesn't exist' starting
Cloning into 'tensorflow'...
remote: Enumerating objects: 1585302, done.
remote: Counting objects: 100% (346968/346968), done.
remote: Compressing objects: 100% (5367/5367), done.
remote: Total 1585302 (delta 342939), reused 342327 (delta 341589), pack-reused 1238334
Receiving objects: 100% (1585302/1585302), 920.91 MiB | 18.57 MiB/s, done.
Resolving deltas: 100% (1307043/1307043), done.
Updating files: 100% (29800/29800), done.
Updating files: 100% (12761/12761), done.
Note: switching to 'v2.11.0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at d5b57ca93e5 Merge pull request #58598 from tensorflow/vinila21-patch-1
WARNING: current bazel installation is not a release version.
Do you wish to build TensorFlow with ROCm support? [y/N]: No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: No CUDA support will be enabled for TensorFlow.

Do you wish to download a fresh release of clang? (Experimental) [y/N]: Clang will not be downloaded.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -Wno-sign-compare]: 

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=mkl_aarch64    # Build with oneDNN and Compute Library for the Arm Architecture (ACL).
    --config=monolithic     # Config for mostly static monolithic build.
    --config=numa           # Build with NUMA support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
    --config=v1             # Build with TensorFlow 1 API instead of TF 2 API.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=nogcp          # Disable GCP support.
    --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished
========== [Mon 05 Jun 2023 04:03:16 PM UTC] Stage 'Set pyparsing to 2.2.0 for CLIF.' starting
Found existing installation: pyparsing 3.0.9
Uninstalling pyparsing-3.0.9:
  Successfully uninstalled pyparsing-3.0.9
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Using pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
Collecting pyparsing==2.2.0
  Using cached pyparsing-2.2.0-py2.py3-none-any.whl (56 kB)
Installing collected packages: pyparsing
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
httplib2 0.22.0 requires pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2; python_version > "3.0", but you have pyparsing 2.2.0 which is incompatible.
Successfully installed pyparsing-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
========== [Mon 05 Jun 2023 04:03:17 PM UTC] Stage 'Set pyparsing to 2.2.0 for CLIF.' starting
Found existing installation: pyparsing 2.2.0
Uninstalling pyparsing-2.2.0:
  Successfully uninstalled pyparsing-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Using pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
Collecting pyparsing==2.2.0
  Using cached pyparsing-2.2.0-py2.py3-none-any.whl (56 kB)
Installing collected packages: pyparsing
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
httplib2 0.22.0 requires pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2; python_version > "3.0", but you have pyparsing 2.2.0 which is incompatible.
Successfully installed pyparsing-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
========== [Mon 05 Jun 2023 04:03:18 PM UTC] Stage 'build-prereq.sh complete' starting

Output of running ./build_and_test.sh:

+ source settings.sh
++ export DV_USE_PREINSTALLED_TF=0
++ DV_USE_PREINSTALLED_TF=0
++ export TF_NEED_GCP=1
++ TF_NEED_GCP=1
++ export CUDNN_INSTALL_PATH=/usr/lib/aarch64-linux-gnu
++ CUDNN_INSTALL_PATH=/usr/lib/aarch64-linux-gnu
++ DV_BAZEL_VERSION=5.3.0
++ export PATH=/root/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
++ PATH=/root/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
++ export DEEPVARIANT_BUCKET=gs://deepvariant
++ DEEPVARIANT_BUCKET=gs://deepvariant
++ export DV_PACKAGE_BUCKET_PATH=gs://deepvariant/packages
++ DV_PACKAGE_BUCKET_PATH=gs://deepvariant/packages
++ export DV_PACKAGE_CURL_PATH=https://storage.googleapis.com/deepvariant/packages
++ DV_PACKAGE_CURL_PATH=https://storage.googleapis.com/deepvariant/packages
++ export DV_TF_NIGHTLY_BUILD=0
++ DV_TF_NIGHTLY_BUILD=0
++ [[ 0 = \1 ]]
++ export DV_CPP_TENSORFLOW_TAG=v2.11.0
++ DV_CPP_TENSORFLOW_TAG=v2.11.0
++ export DV_GCP_OPTIMIZED_TF_WHL_VERSION=2.11.0
++ DV_GCP_OPTIMIZED_TF_WHL_VERSION=2.11.0
++ export DV_TENSORFLOW_STANDARD_GPU_WHL_VERSION=2.11.0
++ DV_TENSORFLOW_STANDARD_GPU_WHL_VERSION=2.11.0
++ export DV_TENSORFLOW_STANDARD_CPU_WHL_VERSION=2.11.0
++ DV_TENSORFLOW_STANDARD_CPU_WHL_VERSION=2.11.0
++ export DV_GPU_BUILD=0
++ DV_GPU_BUILD=0
++ export DV_USE_GCP_OPTIMIZED_TF_WHL=0
++ DV_USE_GCP_OPTIMIZED_TF_WHL=0
++ export GCP_OPTIMIZED_TF_WHL_FILENAME=tensorflow-2.11.0.deepvariant_gcp-cp27-none-linux_x86_64.whl
++ GCP_OPTIMIZED_TF_WHL_FILENAME=tensorflow-2.11.0.deepvariant_gcp-cp27-none-linux_x86_64.whl
++ export GCP_OPTIMIZED_TF_WHL_PATH=gs://deepvariant/packages/tensorflow
++ GCP_OPTIMIZED_TF_WHL_PATH=gs://deepvariant/packages/tensorflow
++ export GCP_OPTIMIZED_TF_WHL_CURL_PATH=https://storage.googleapis.com/deepvariant/packages/tensorflow
++ GCP_OPTIMIZED_TF_WHL_CURL_PATH=https://storage.googleapis.com/deepvariant/packages/tensorflow
++ export DV_TF_NUMPY_VERSION=1.19.2
++ DV_TF_NUMPY_VERSION=1.19.2
++ export DV_INSTALL_GPU_DRIVERS=0
++ DV_INSTALL_GPU_DRIVERS=0
++ export PYTHON_VERSION=3.8
++ PYTHON_VERSION=3.8
+++ which python3.8
++ export PYTHON_BIN_PATH=/usr/bin/python3.8
++ PYTHON_BIN_PATH=/usr/bin/python3.8
++ export PYTHON_LIB_PATH=/usr/local/lib/python3.8/dist-packages
++ PYTHON_LIB_PATH=/usr/local/lib/python3.8/dist-packages
++ export USE_DEFAULT_PYTHON_LIB_PATH=1
++ USE_DEFAULT_PYTHON_LIB_PATH=1
++ export 'DV_COPT_FLAGS=--copt=-march=corei7 --copt=-Wno-sign-compare --copt=-Wno-write-strings --experimental_build_setting_api'
++ DV_COPT_FLAGS='--copt=-march=corei7 --copt=-Wno-sign-compare --copt=-Wno-write-strings --experimental_build_setting_api'
+ bazel
/root/bin/bazel: line 220: /root/.bazel/bin/bazel-real: cannot execute binary file: Exec format error
/root/bin/bazel: line 220: /root/.bazel/bin/bazel-real: Success
+ PATH=/root/bin:/root/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
+ [[ 0 = \1 ]]
+ bazel test -c opt --copt=-march=corei7 --copt=-Wno-sign-compare --copt=-Wno-write-strings --experimental_build_setting_api deepvariant/...
/root/bin/bazel: line 220: /root/.bazel/bin/bazel-real: cannot execute binary file: Exec format error
/root/bin/bazel: line 220: /root/.bazel/bin/bazel-real: Success
pgrosu commented 1 year ago

That's good, but let's go a bit slower just to be sure each individual component is working properly. I'm not sure Bazel is working properly, so let's try the following steps:

1) Complete these last steps of clif:

  sudo mkdir -p /usr/clang/bin/
  sudo ln -sf /usr/local/bin/clif-matcher /usr/clang/bin/clif-matcher
  sudo mkdir -p /usr/local/clif/bin
  sudo ln -sf /usr/local/bin/pyclif* /usr/local/clif/bin/
  DIST_PACKAGES_DIR=$(python3 -c "import site; print(site.getsitepackages()[0])")
  sudo ln -sf "${DIST_PACKAGES_DIR}"/clif/python /usr/local/clif/

2) Let's troubleshoot bazel, as bazel is also a bit tricky to install. First do the following:

sudo mv /root/.bazel /root/.bazel-orig
sudo mv /root/bin/bazel /root/bin/bazel-orig

Could you try the following steps and let me know what you see -- it would be nice to run as sudo and not as root directly:

rm .bazelrc
curl -L -O https://github.com/bazelbuild/bazel/releases/download/5.3.0/bazel-5.3.0-installer-linux-x86_64.sh
chmod +x bazel-*.sh
./bazel-5.3.0-installer-linux-x86_64.sh --user > /dev/null

When you run it and launch it, it should look something like this:

$ ./bazel-5.3.0-installer-linux-x86_64.sh --user > /dev/null
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
$ bazel
                                                           [bazel release 5.3.0]
Usage: bazel <command> <options> ...

Available commands:
  analyze-profile     Analyzes build profile data.
  aquery              Analyzes the given targets and queries the action graph.
  build               Builds the specified targets.
  canonicalize-flags  Canonicalizes a list of bazel options.
  clean               Removes output files and optionally stops the server.
  coverage            Generates code coverage report for specified test targets.
  cquery              Loads, analyzes, and queries the specified targets w/ configurations.
  ...

Thanks, ~p

heznanda commented 1 year ago

@pgrosu Thank you for the reply! Using the commands:

> rm .bazelrc
rm: cannot remove '.bazelrc': No such file or directory
> curl -L -O https://github.com/bazelbuild/bazel/releases/download/5.3.0/bazel-5.3.0-installer-linux-x86_64.sh
> chmod +x bazel-*.sh
chmod: changing permissions of 'bazel-5.3.0-installer-linux-x86_64.sh': Operation not permitted
> ./bazel-5.3.0-installer-linux-x86_64.sh --user > /dev/null
/home/user/bin/bazel: line 220: /home/user/.bazel/bin/bazel-real: cannot execute binary file: Exec format error
/home/user/bin/bazel: line 220: /home/user/.bazel/bin/bazel-real: Success
> sudo su
> rm .bazelrc
rm: cannot remove '.bazelrc': No such file or directory
> chmod +x bazel-*.sh
> ./bazel-5.3.0-installer-linux-x86_64.sh --user > /dev/null
> ./root/.bazel/bin/bazel
/root/.bazel/bin/bazel: line 220: /root/.bazel/bin/bazel-real: cannot execute binary file: Exec format error
/root/.bazel/bin/bazel: line 220: /root/.bazel/bin/bazel-real: Success

So running ./bazel-5.3.0-installer-linux-x86_64.sh --user > /dev/null as root printed nothing, but bazel still does not seem to be installed properly.
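An "Exec format error" usually means the binary was built for a different CPU architecture than the one running it. As a rough sketch of how to check, the helper below reads the machine field from a binary's ELF header. The name elf_machine is hypothetical, and the sketch assumes a little-endian ELF file and only recognises two architectures.

```bash
#!/bin/bash
elf_machine() {
  # $1: path to a binary. Prints the architecture that its ELF header declares.
  local code
  # e_machine is a 2-byte field at offset 18 of the ELF header.
  code=$(od -An -tx1 -j18 -N2 "$1" | tr -d ' \n')
  case "$code" in
    3e00) echo "x86_64" ;;    # EM_X86_64 (62), little-endian
    b700) echo "aarch64" ;;   # EM_AARCH64 (183), little-endian
    *)    echo "unknown (${code})" ;;
  esac
}

# Compare the binary with the machine it has to run on, for example:
#   elf_machine /root/.bazel/bin/bazel-real
#   uname -m
```

If the two answers differ, the binary cannot run natively on this machine.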

pgrosu commented 1 year ago

I think it's because it's a Mac M1 which is aarch64 (arm64), so let's try the following one:

curl -L -O https://github.com/bazelbuild/bazel/releases/download/5.3.0/bazel-5.3.0-linux-arm64

Let me know if it runs for you when you execute it via the following:

chmod +x bazel-5.3.0-linux-arm64
./bazel-5.3.0-linux-arm64

Let me know if that fixes it.
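To avoid picking the wrong download by hand, the choice can be derived from uname -m. This is only a sketch: the function name bazel_asset_for_arch is made up for illustration, and the x86_64 asset name is an assumption based on the naming pattern of the arm64 one above.

```bash
#!/bin/bash
bazel_asset_for_arch() {
  # $1: machine name as printed by `uname -m`, $2: Bazel version
  local arch=$1 version=$2
  case "$arch" in
    x86_64|amd64)  echo "bazel-${version}-linux-x86_64" ;;
    aarch64|arm64) echo "bazel-${version}-linux-arm64" ;;
    *) echo "unsupported architecture: ${arch}" >&2; return 1 ;;
  esac
}

# Print the download URL for the current machine (nothing is downloaded here).
asset=$(bazel_asset_for_arch "$(uname -m)" 5.3.0) &&
  echo "https://github.com/bazelbuild/bazel/releases/download/5.3.0/${asset}"
```

On the UTM Ubuntu VM described here, uname -m should report aarch64, which selects the arm64 binary.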

Thanks, ~p

heznanda commented 1 year ago

@pgrosu Wow! Now I think that fixes it:

> ./bazel-5.3.0-linux-arm64 
WARNING: Invoking Bazel in batch mode since it is not invoked from within a workspace (below a directory having a WORKSPACE file).
Extracting Bazel installation...
                                                           [bazel release 5.3.0]
Usage: bazel <command> <options> ...

Available commands:
  analyze-profile     Analyzes build profile data.
  aquery              Analyzes the given targets and queries the action graph.
  build               Builds the specified targets.
  canonicalize-flags  Canonicalizes a list of bazel options.
  clean               Removes output files and optionally stops the server.
  coverage            Generates code coverage report for specified test targets.
  cquery              Loads, analyzes, and queries the specified targets w/ configurations.
  dump                Dumps the internal state of the bazel server process.
  fetch               Fetches external repositories that are prerequisites to the targets.
  help                Prints help for commands, or the index.
  info                Displays runtime info about the bazel server.
  license             Prints the license of this software.
  mobile-install      Installs targets to mobile devices.
  print_action        Prints the command line args for compiling a file.
  query               Executes a dependency graph query.
  run                 Runs the specified target.
  shutdown            Stops the bazel server.
  sync                Syncs all repositories specified in the workspace file
  test                Builds and runs the specified test targets.
  version             Prints version information for bazel.

Getting more help:
  bazel help <command>
                   Prints help and options for <command>.
  bazel help startup_options
                   Options for the JVM hosting bazel.
  bazel help target-syntax
                   Explains the syntax for specifying targets.
  bazel help info-keys
                   Displays a list of keys used by the info command.

But the bazel command still returns:

/root/.bazel/bin/bazel: line 220: /root/.bazel/bin/bazel-real: cannot execute binary file: Exec format error
/root/.bazel/bin/bazel: line 220: /root/.bazel/bin/bazel-real: Success
pgrosu commented 1 year ago

Good! Now try the following:

sudo mv /root/.bazel /root/.bazel-x86
sudo cp ./bazel-5.3.0-linux-arm64 /usr/bin
sudo ln -s /usr/bin/bazel-5.3.0-linux-arm64 /usr/bin/bazel

Then try launching it by just typing bazel and see if it works. All the scripts use bazel as their command, and the above just creates a symbolic link to the executable (so it takes no significant extra space).
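To confirm which file the bazel command really points to, the link can be resolved as in the sketch below. The helper name resolve_command is hypothetical, and the demonstration uses a scratch directory with a fake executable rather than touching /usr/bin.

```bash
#!/bin/bash
resolve_command() {
  # Print the real file behind a command name, following any symlinks.
  local path
  path=$(command -v "$1") || return 1
  readlink -f "$path"
}

# Demonstration in a scratch directory (the files here are stand-ins).
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho fake bazel\n' > "$demo_dir/bazel-5.3.0-linux-arm64"
chmod +x "$demo_dir/bazel-5.3.0-linux-arm64"
ln -s "$demo_dir/bazel-5.3.0-linux-arm64" "$demo_dir/bazel"
PATH="$demo_dir:$PATH" resolve_command bazel
```

On the real system, resolve_command bazel should print /usr/bin/bazel-5.3.0-linux-arm64 once the symbolic link above is in place.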

heznanda commented 1 year ago

@pgrosu Yes!! Thank you! For a second, I thought that bazel-5.3.0-linux-arm64 was a folder, but it is an actual bazel binary.

> bazel
WARNING: Invoking Bazel in batch mode since it is not invoked from within a workspace (below a directory having a WORKSPACE file).
                                                           [bazel release 5.3.0]
Usage: bazel <command> <options> ...

Available commands:
  analyze-profile     Analyzes build profile data.
  aquery              Analyzes the given targets and queries the action graph.
  build               Builds the specified targets.
  canonicalize-flags  Canonicalizes a list of bazel options.
  clean               Removes output files and optionally stops the server.
  coverage            Generates code coverage report for specified test targets.
  cquery              Loads, analyzes, and queries the specified targets w/ configurations.
  dump                Dumps the internal state of the bazel server process.
  fetch               Fetches external repositories that are prerequisites to the targets.
  help                Prints help for commands, or the index.
  info                Displays runtime info about the bazel server.
  license             Prints the license of this software.
  mobile-install      Installs targets to mobile devices.
  print_action        Prints the command line args for compiling a file.
  query               Executes a dependency graph query.
  run                 Runs the specified target.
  shutdown            Stops the bazel server.
  sync                Syncs all repositories specified in the workspace file
  test                Builds and runs the specified test targets.
  version             Prints version information for bazel.

Getting more help:
  bazel help <command>
                   Prints help and options for <command>.
  bazel help startup_options
                   Options for the JVM hosting bazel.
  bazel help target-syntax
                   Explains the syntax for specifying targets.
  bazel help info-keys
                   Displays a list of keys used by the info command.

What should we do next?

> ./build-prereq.sh
========== This script is only maintained for Ubuntu 20.04.
========== Load config settings.
========== [Mon 05 Jun 2023 10:22:12 PM EDT] Stage 'Install the runtime packages' starting
========== This script is only maintained for Ubuntu 20.04.
========== Load config settings.
========== [Mon 05 Jun 2023 10:22:12 PM EDT] Stage 'Misc setup' starting
========== [Mon 05 Jun 2023 10:22:13 PM EDT] Stage 'Update package list' starting
========== [Mon 05 Jun 2023 10:22:14 PM EDT] Stage 'run-prereq.sh: Install development packages' starting
Calling wait_for_dpkg_lock.
========== [Mon 05 Jun 2023 10:22:15 PM EDT] Stage 'Install python3 packaging infrastructure' starting
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2518k  100 2518k    0     0  5403k      0 --:--:-- --:--:-- --:--:-- 5392k
Collecting pip
  Using cached pip-23.1.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.1.2
    Uninstalling pip-23.1.2:
      Successfully uninstalled pip-23.1.2
  WARNING: The scripts pip, pip3 and pip3.8 are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.1.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Python 3.8.10
pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
========== [Mon 05 Jun 2023 10:22:17 PM EDT] Stage 'Install python3 packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
tensorboard 2.11.2 requires protobuf<4,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.23.2 which is incompatible.
Installing standard CPU-only TensorFlow 2.11.0 wheel
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
werkzeug 2.3.4 requires MarkupSafe>=2.1.1, but you have markupsafe 2.0.1 which is incompatible.
========== [Mon 05 Jun 2023 10:22:30 PM EDT] Stage 'Install other packages' starting
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-api-core 2.11.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
googleapis-common-protos 1.59.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.13.0 which is incompatible.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
========== [Mon 05 Jun 2023 10:22:31 PM EDT] Stage 'run-prereq.sh complete' starting
========== [Mon 05 Jun 2023 10:22:31 PM EDT] Stage 'Update package list' starting
========== [Mon 05 Jun 2023 10:22:32 PM EDT] Stage 'build-prereq.sh: Install development packages' starting
Calling wait_for_dpkg_lock.
========== [Mon 05 Jun 2023 10:22:32 PM EDT] Stage 'Install bazel' starting
WARNING: Value of --bazelrc is ignored, since --ignore_all_rc_files is on.
Bazel 5.3.0 already installed on the machine, not reinstalling
========== [Mon 05 Jun 2023 10:22:32 PM EDT] Stage 'Install CLIF binary' starting
CLIF already installed.
========== [Mon 05 Jun 2023 10:22:32 PM EDT] Stage 'Download and configure TensorFlow sources' starting
HEAD is now at d5b57ca93e5 Merge pull request #58598 from tensorflow/vinila21-patch-1
You have bazel 5.3.0 installed.
Do you wish to build TensorFlow with ROCm support? [y/N]: No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: No CUDA support will be enabled for TensorFlow.

Do you wish to download a fresh release of clang? (Experimental) [y/N]: Clang will not be downloaded.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -Wno-sign-compare]: 

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=mkl_aarch64    # Build with oneDNN and Compute Library for the Arm Architecture (ACL).
    --config=monolithic     # Config for mostly static monolithic build.
    --config=numa           # Build with NUMA support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
    --config=v1             # Build with TensorFlow 1 API instead of TF 2 API.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=nogcp          # Disable GCP support.
    --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished
========== [Mon 05 Jun 2023 10:22:33 PM EDT] Stage 'Set pyparsing to 2.2.0 for CLIF.' starting
Found existing installation: pyparsing 3.0.9
Uninstalling pyparsing-3.0.9:
  Successfully uninstalled pyparsing-3.0.9
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Using pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
Collecting pyparsing==2.2.0
  Using cached pyparsing-2.2.0-py2.py3-none-any.whl (56 kB)
Installing collected packages: pyparsing
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
httplib2 0.22.0 requires pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2; python_version > "3.0", but you have pyparsing 2.2.0 which is incompatible.
Successfully installed pyparsing-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
========== [Mon 05 Jun 2023 10:22:33 PM EDT] Stage 'Set pyparsing to 2.2.0 for CLIF.' starting
Found existing installation: pyparsing 2.2.0
Uninstalling pyparsing-2.2.0:
  Successfully uninstalled pyparsing-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Using pip 23.1.2 from /root/.local/lib/python3.8/site-packages/pip (python 3.8)
Collecting pyparsing==2.2.0
  Using cached pyparsing-2.2.0-py2.py3-none-any.whl (56 kB)
Installing collected packages: pyparsing
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
httplib2 0.22.0 requires pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2; python_version > "3.0", but you have pyparsing 2.2.0 which is incompatible.
Successfully installed pyparsing-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
========== [Mon 05 Jun 2023 10:22:34 PM EDT] Stage 'build-prereq.sh complete' starting

> docker run --platform linux/amd64 google/deepvariant:1.5.0
exec /opt/deepvariant/bin/run_deepvariant: exec format error
heznanda commented 1 year ago

I am tempted to fix all the errors by uninstalling the python packages and reinstalling based on the required package version.
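Before reinstalling anything, it may help to see the distinct conflicts, since the log repeats the same lines many times. A small sketch, assuming the build output was saved to a file (for example with ./build-prereq.sh 2>&1 | tee build-prereq.log, where the log file name is only an example) and using a hypothetical helper name:

```bash
#!/bin/bash
summarize_pip_conflicts() {
  # $1: path to a saved build log. Prints each distinct conflict line once.
  grep -E ' requires .+, but you have .+ which is incompatible' "$1" | sort -u
}

# Small abridged sample based on the log lines quoted in this thread.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
tensorflow-cpu-aws 2.11.0 requires numpy>=1.20, but you have numpy 1.19.2 which is incompatible.
pyclif 0.4 requires pyparsing==2.2.0, but you have pyparsing 3.0.9 which is incompatible.
EOF
summarize_pip_conflicts "$sample_log"
```

For the sample this leaves two distinct conflicts, one for pyparsing and one for numpy.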

pgrosu commented 1 year ago

Let me think a bit before we take the next step, as we are in a good place now. Also, since we're building DeepVariant from scratch, Docker will not be necessary once all is done, as we'll be using DeepVariant directly, and it will be much smaller and faster. Docker basically wraps its own version of DeepVariant, and it has many layers of translation, which would not be as nimble as this approach.

pgrosu commented 1 year ago

So from what I see you've completed the following steps:

1) Built and installed CLIF
2) Installed Bazel
3) Installed the TensorFlow Python module
4) Downloaded and configured TensorFlow from GitHub (we might need to check out the proper version)

The final step is to build the DeepVariant binaries. Just a note, you will require several GB of space for this to work. The DeepVariant zip files we eventually get will be quite small. (The zip files are basically Python scripts and bytecode with a compiled shared library.)
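
Before starting, it may be worth checking free disk space. The following is a minimal sketch; the helper name has_free_gb and the 20 GB threshold are assumptions for illustration, not figures from the DeepVariant documentation:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the filesystem holding DIR has at least
# MIN_GB gigabytes free. The 20 GB threshold used below is an assumed
# placeholder, not an official requirement.
has_free_gb() {
  local dir="$1" min_gb="$2" avail_kb
  # df -P prints POSIX-format output; on line 2, field 4 is the number of
  # available 1K blocks (because of -k).
  avail_kb="$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')"
  [ "$avail_kb" -ge $(( min_gb * 1024 * 1024 )) ]
}

if has_free_gb . 20; then
  echo "Enough free space for the build."
else
  echo "Less than 20 GB free; the build may run out of space." >&2
fi
```

Run it from the DeepVariant directory, and adjust the threshold to whatever your build actually turns out to need.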

Regarding what's missing or incompatible during the build, we will let the compiler tell us that, which will make it a bit easier to troubleshoot.

Just to be sure we minimize surprises, we will do a couple of things first. To ensure the TensorFlow code we compile against is the same version as the TensorFlow Python package (2.11.0), run the following in your cloned TensorFlow folder (the assumption is that your TensorFlow folder sits next to your DeepVariant one, not inside it):

# Assuming you are in the DeepVariant directory

cd ../tensorflow
git checkout origin/r2.11 -f

# Here configure like before with the defaults
./configure

cd ../deepvariant
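
To double-check that the pip package and the checked-out branch agree, something like the following sketch may help. The helper versions_match is hypothetical, and it assumes release branches are named r<major>.<minor>, as with r2.11 above:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if a pip version such as "2.11.0" belongs to
# a TensorFlow release branch such as "r2.11" (major.minor must agree).
versions_match() {
  local pip_version="$1" branch="$2"
  local major_minor="${pip_version%.*}"   # "2.11.0" -> "2.11"
  [ "r${major_minor}" = "$branch" ]
}

# Compare the installed package against the branch checked out above.
# The check is skipped if the tensorflow module cannot be imported.
if pip_version="$(python3 -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null)"; then
  versions_match "$pip_version" "r2.11" \
    || echo "TensorFlow ${pip_version} does not match branch r2.11" >&2
else
  echo "TensorFlow Python package not importable; skipping check." >&2
fi
```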

Once in the DeepVariant directory, perform each of these separately so we can isolate any issues:

#!/bin/bash
source settings.sh

wget https://raw.githubusercontent.com/google/deepvariant/r1.5/.bazelrc

bazel build -c opt ${DV_COPT_FLAGS} --build_python_zip :binaries

You may notice that we removed .bazelrc previously. That was to ensure that nothing got triggered by the .bazelrc config while we were troubleshooting the Bazel installation.

The last two things you have to run to complete the DeepVariant build are the following, though the above should give most of what you need:

#!/bin/bash
source settings.sh

bazel build -c opt ${DV_COPT_FLAGS} --build_python_zip //deepvariant/labeler:labeled_examples_to_vcf

#!/bin/bash
source settings.sh

bazel build -c opt ${DV_COPT_FLAGS} --build_python_zip :binaries-deeptrio
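
If you would rather drive the three builds from one script while still isolating failures, here is a rough sketch. The function build_each, the BAZEL variable, and the log file naming are my own assumptions for illustration; the bazel flags are the same ones used above:

```shell
#!/bin/bash
# BAZEL lets you pick a specific bazel binary; it defaults to "bazel" on PATH.
# (The variable name and the log naming scheme are hypothetical.)
BAZEL="${BAZEL:-bazel}"

# Build each target separately, one log file per target, stopping at the
# first failure so the failing target is obvious.
build_each() {
  local target log
  for target in "$@"; do
    # Turn the target label into a safe file name, e.g. ":binaries" -> "_binaries".
    log="build_$(printf '%s' "$target" | tr -c 'A-Za-z0-9_.-' '_').log"
    echo "Building ${target} (log: ${log})"
    # DV_COPT_FLAGS is left unquoted on purpose so it splits into separate flags.
    if ! "$BAZEL" build -c opt ${DV_COPT_FLAGS:-} --build_python_zip "$target" >"$log" 2>&1; then
      echo "FAILED: ${target} (see ${log})" >&2
      return 1
    fi
  done
}

# Example, from the DeepVariant directory:
#   source settings.sh
#   build_each :binaries //deepvariant/labeler:labeled_examples_to_vcf :binaries-deeptrio
```

When a target fails, the last lines of its log are usually the place to start.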

Your zip files will be under bazel-bin/deepvariant/. The zip files will require some patching, which I can show you once you have a successful build of the above. The output of a successful build should look something like this:

  ...
  bazel-bin/deepvariant/postprocess_variants
  bazel-bin/deepvariant/postprocess_variants.zip
  bazel-bin/deepvariant/runtime_by_region_vis
  bazel-bin/deepvariant/runtime_by_region_vis.zip
  bazel-bin/deepvariant/show_examples
  bazel-bin/deepvariant/show_examples.zip
  bazel-bin/deepvariant/vcf_stats_report
  bazel-bin/deepvariant/vcf_stats_report.zip
(09:13:54) INFO: Elapsed time: 227.413s, Critical Path: 215.22s
(09:13:54) INFO: 47 processes: 1 internal, 46 local.
(09:13:54) INFO: Build completed successfully, 47 total actions
$

The zip files will be under the bazel-bin/deepvariant/ folder, which will look something like this:

$ ls bazel-bin/deepvariant/ 
call_variants
call_variants_keras
call_variants_keras.temp
call_variants_keras.zip
call_variants_keras.zip-0.params
call_variants.temp
call_variants.zip
call_variants.zip-0.params
freeze_graph
freeze_graph.temp
...

Let me know how it goes.

Thanks, ~p

heznanda commented 1 year ago

@pgrosu I have TensorFlow in a different folder from the deepvariant-r1.5 folder. Just FYI, there are multiple installations of bazel: /home/user/.bazel/bin and /usr/bin/. The one that works is /usr/bin/bazel.
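
For anyone hitting the same situation, a small sketch for listing every bazel on PATH in lookup order (bash's built-in `type -a bazel` gives similar information). The helper name find_all_on_path is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print every executable called NAME found on PATH,
# in lookup order, so shadowed installations become visible.
find_all_on_path() {
  local name="$1" dir
  local IFS=':'            # split PATH on colons
  for dir in $PATH; do
    if [ -x "${dir}/${name}" ] && [ ! -d "${dir}/${name}" ]; then
      echo "${dir}/${name}"
    fi
  done
}

find_all_on_path bazel
```

The first path printed is the one a plain `bazel` command would run.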

The command /usr/bin/bazel build -c opt ${DV_COPT_FLAGS} --build_python_zip :binaries inside the deepvariant-r1.5 folder gives another error:

ERROR: An error occurred during the fetch of repository 'local_config_python':
   Traceback (most recent call last):
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/py/python_configure.bzl", line 271, column 40, in _python_autoconf_impl
        _create_local_python_repository(repository_ctx)
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/py/python_configure.bzl", line 214, column 22, in _create_local_python_repository
        _check_python_lib(repository_ctx, python_lib)
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/py/python_configure.bzl", line 138, column 25, in _check_python_lib
        auto_config_fail("Invalid python library path: %s" % python_lib)
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/remote_config/common.bzl", line 12, column 9, in auto_config_fail
        fail("%sConfiguration Error:%s %s\n" % (red, no_color, msg))
Error in fail: Configuration Error: Invalid python library path: "
(11:50:11) ERROR: /home/user/Documents/deepvariant-r1.5/WORKSPACE:108:14: fetching python_configure rule //external:local_config_python: Traceback (most recent call last):
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/py/python_configure.bzl", line 271, column 40, in _python_autoconf_impl
        _create_local_python_repository(repository_ctx)
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/py/python_configure.bzl", line 214, column 22, in _create_local_python_repository
        _check_python_lib(repository_ctx, python_lib)
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/py/python_configure.bzl", line 138, column 25, in _check_python_lib
        auto_config_fail("Invalid python library path: %s" % python_lib)
    File "/home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/third_party/remote_config/common.bzl", line 12, column 9, in auto_config_fail
        fail("%sConfiguration Error:%s %s\n" % (red, no_color, msg))
Error in fail: Configuration Error: Invalid python library path: "
(11:50:12) INFO: Repository go_sdk instantiated at:
  /home/user/Documents/deepvariant-r1.5/WORKSPACE:116:14: in <toplevel>
  /home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/org_tensorflow/tensorflow/workspace0.bzl:134:20: in workspace
  /home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/com_github_grpc_grpc/bazel/grpc_extra_deps.bzl:36:27: in grpc_extra_deps
  /home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/io_bazel_rules_go/go/private/sdk.bzl:431:28: in go_register_toolchains
  /home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/io_bazel_rules_go/go/private/sdk.bzl:130:21: in go_download_sdk
Repository rule _go_download_sdk defined at:
  /home/user/.cache/bazel/_bazel_user/7cc1383e03390275c978033f9adc7c2a/external/io_bazel_rules_go/go/private/sdk.bzl:117:35: in <toplevel>
(11:50:12) ERROR: Analysis of target '//:binaries' failed; build aborted: Configuration Error: Invalid python library path: "
(11:50:12) INFO: Elapsed time: 0.552s
(11:50:12) INFO: 0 processes.
(11:50:12) FAILED: Build did NOT complete successfully (0 packages loaded, 0 targets configured)

Do you have any suggestions? Thank you for your time! Looks like we are getting there for this one.

Just FYI, fortunately I was able to run DeepVariant on another Unix system using singularity pull. After updating numpy and adding the GPU flag --nv, it runs pretty fast and error-free.

pgrosu commented 1 year ago

@heznanda Glad to hear you got it working! Regarding the fix for this -- only if you are curious :) -- under the TensorFlow folder you should have a file called .tf_configure.bazelrc. That file should look like this (assuming your Python3 libs reside under /usr/lib/python3/dist-packages):

build --action_env PYTHON_BIN_PATH="/usr/local/bin/python3"
build --action_env PYTHON_LIB_PATH="/usr/lib/python3/dist-packages"
build --python_path="/usr/local/bin/python3"
build:opt --copt=-Wno-sign-compare
build:opt --host_copt=-Wno-sign-compare
test --flaky_test_attempts=3
test --test_size_filters=small,medium
test:v1 --test_tag_filters=-benchmark-test,-no_oss,-gpu,-oss_serial
test:v1 --build_tag_filters=-benchmark-test,-no_oss,-gpu
test:v2 --test_tag_filters=-benchmark-test,-no_oss,-gpu,-oss_serial,-v1only
test:v2 --build_tag_filters=-benchmark-test,-no_oss,-gpu,-v1only

It will have these permissions (with your username instead of mine):

-rw-rw-r-- 1 paul paul 581 Jun  7 13:49 .tf_configure.bazelrc

Just add the contents above to the .tf_configure.bazelrc file and try building it again, assuming that for you the Python 3 dist-packages reside there as well.
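
If your Python lives elsewhere, the three Python-related lines can be generated rather than typed by hand. This is only a sketch: the helper bazelrc_python_lines is hypothetical, and using sysconfig's "purelib" path as PYTHON_LIB_PATH is my assumption about a reasonable value, so verify it against where your packages are actually installed:

```shell
#!/bin/bash
# Hypothetical helper: print the three Python-related lines for
# .tf_configure.bazelrc, given an interpreter path and a library path.
bazelrc_python_lines() {
  local py_bin="$1" py_lib="$2"
  printf 'build --action_env PYTHON_BIN_PATH="%s"\n' "$py_bin"
  printf 'build --action_env PYTHON_LIB_PATH="%s"\n' "$py_lib"
  printf 'build --python_path="%s"\n' "$py_bin"
}

# Detect values from whichever python3 is first on PATH (an assumption;
# point these at the interpreter you actually build against).
if command -v python3 >/dev/null 2>&1; then
  py_bin="$(command -v python3)"
  py_lib="$(python3 -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')"
  bazelrc_python_lines "$py_bin" "$py_lib"
fi
```

Compare the printed lines with what is in your .tf_configure.bazelrc; an empty PYTHON_LIB_PATH there would be consistent with the "Invalid python library path" error above.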

Let me know how it goes.

Thanks, ~p

pichuan commented 1 year ago

Thanks @pgrosu for helping out!

One thing to note is that DeepVariant isn't tested on Mac, and it's not currently something that we officially support. But it's good to know that there seem to be workarounds.

I'll keep this open for a bit longer in case @heznanda wants to follow up.

heznanda commented 1 year ago

Thank you for saying these words, @pichuan.

I believe we are at the very last steps of getting DeepVariant running on Apple silicon, but since Singularity on a Linux system works for us, I think I will not pursue this installation any further. I am grateful to @pgrosu, who has been helping me with these two options. Perhaps this thread could help others who want to pursue the same path.

Regards,

Hez

pichuan commented 1 year ago

Thanks @heznanda for letting us know that you have a working solution. I'll close this issue. Other users can still find it later if they search for relevant keywords :)

pgrosu commented 1 year ago

Thank you for the kind words @heznanda and @pichuan. I am glad it was helpful, and might be to others as well.

The nice thing about having now gone through this exercise a few times is that it has given me clarity on the moving parts: porting DeepVariant to other platforms should be a fairly trivial task, with multiple avenues to success.

In any case, as time permits I'm always here if folks need more help.

Thanks, ~p