google / deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
BSD 3-Clause "New" or "Revised" License

bioconda installation v0.7.2 #137

Closed drtamermansour closed 5 years ago

drtamermansour commented 5 years ago

I am trying to use the bioconda DeepVariant package (https://anaconda.org/bioconda/deepvariant) on a cluster running CentOS 7. I am getting this error:

ERROR conda.core.link:_execute(507): An error occurred while installing package 'bioconda::deepvariant-0.7.2-py27h5d9141f_1'.
LinkError: post-link script failed for package bioconda::deepvariant-0.7.2-py27h5d9141f_1
running your command again with -v will provide additional information
location of failed script: /mnt/home/mansourt/miniconda3/envs/deepVar/bin/.deepvariant-post-link.sh
==> script messages <==

drtamermansour commented 5 years ago

Running with -v, everything looks normal except 2 error messages:

1st error message:

===> LINKING PACKAGE: conda-forge::linecache2-1.0.0-py_1 <===
  prefix=/mnt/home/mansourt/miniconda3/envs/deepVar
  source=/mnt/home/mansourt/miniconda3/pkgs/linecache2-1.0.0-py_1

pyc file failed to compile successfully
  python_exe_full_path: /mnt/home/mansourt/miniconda3/envs/deepVar/bin/python2.7
  py_full_path: /mnt/home/mansourt/miniconda3/envs/deepVar/lib/python2.7/site-packages/linecache2/tests/inspect_fodder2.py
  pyc_full_path: /mnt/home/mansourt/miniconda3/envs/deepVar/lib/python2.7/site-packages/linecache2/tests/inspect_fodder2.pyc
  compile rc: 1
  compile stdout: Compiling lib/python2.7/site-packages/linecache2/tests/inspect_fodder2.py ...
  File "lib/python2.7/site-packages/linecache2/tests/inspect_fodder2.py", line 102
    def keyworded(*arg1, arg2=1):
    ^
SyntaxError: invalid syntax

  compile stderr:

2nd error message:

$ bash -x /mnt/home/mansourt/miniconda3/envs/deepVar/bin/.deepvariant-post-link.sh
==> cwd: /mnt/home/mansourt/miniconda3/envs/deepVar/bin <==
==> exit code: 1 <==
==> stdout <==
Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for this output (/usr/local/lmod/lmod/init/bash)
Shell debugging restarted
==> stderr <==
+ '[' -z '' ']'
+ case "$-" in
+ lmod_vx=x
+ '[' -n x ']'
+ set +x
+ unset lmod_vx
+ set -eu -o pipefail
+ MODEL_VERSION=0.7.2
+ GSUTIL=/mnt/home/mansourt/miniconda3/envs/deepVar/bin/gsutil
+ for MODEL_TYPE in wgs wes
+ MODEL_NAME=DeepVariant-inception_v3-0.7.2+data-wgs_standard
+ GSREF=gs://deepvariant/models/DeepVariant/0.7.2/DeepVariant-inception_v3-0.7.2+data-wgs_standard
+ OUTDIR=/mnt/home/mansourt/miniconda3/envs/deepVar/share/deepvariant-0.7.2-1/models/DeepVariant/0.7.2/DeepVariant-inception_v3-0.7.2+data-wgs_standard
+ mkdir -p /mnt/home/mansourt/miniconda3/envs/deepVar/share/deepvariant-0.7.2-1/models/DeepVariant/0.7.2/DeepVariant-inception_v3-0.7.2+data-wgs_standard
+ /mnt/home/mansourt/miniconda3/envs/deepVar/bin/gsutil cp 'gs://deepvariant/models/DeepVariant/0.7.2/DeepVariant-inception_v3-0.7.2+data-wgs_standard/*' /mnt/home/mansourt/miniconda3/envs/deepVar/share/deepvariant-0.7.2-1/models/DeepVariant/0.7.2/DeepVariant-inception_v3-0.7.2+data-wgs_standard/
Traceback (most recent call last):
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gsutil", line 22, in <module>
    gsutil.RunMain()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gsutil.py", line 114, in RunMain
    sys.exit(gslib.main.main())
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/main.py", line 383, in main
    perf_trace_token=perf_trace_token)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/main.py", line 577, in _RunNamedCommandAndHandleExceptions
    collect_analytics=True)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/command_runner.py", line 317, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/commands/cp.py", line 1139, in RunCommand
    seek_ahead_iterator=seek_ahead_iterator)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/command.py", line 1368, in Apply
    arg_checker, should_return_results, fail_on_error)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/command.py", line 1414, in _SequentialApply
    args = args_iterator.next()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/name_expansion.py", line 622, in next
    name_expansion_result = self.current_expansion_iter.next()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/plurality_checkable_iterator.py", line 60, in _PopulateHead
    e = self.base_iterator.next()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/name_expansion.py", line 265, in __iter__
    for (names_container, blr) in post_step3_iter:
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/plurality_checkable_iterator.py", line 60, in _PopulateHead
    e = self.base_iterator.next()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/name_expansion.py", line 468, in __iter__
    for (names_container, blr) in self.tuple_iter:
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/plurality_checkable_iterator.py", line 60, in _PopulateHead
    e = self.base_iterator.next()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/name_expansion.py", line 432, in __iter__
    for blr in self.blr_iter:
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/plurality_checkable_iterator.py", line 60, in _PopulateHead
    e = self.base_iterator.next()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/wildcard_iterator.py", line 476, in IterAll
    expand_top_level_buckets=expand_top_level_buckets):
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/wildcard_iterator.py", line 215, in __iter__
    provider=self.wildcard_url.scheme, fields=listing_fields):
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/gcs_json_api.py", line 595, in ListObjects
    global_params=global_params)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/third_party/storage_apitools/storage_v1_client.py", line 1237, in List
    config, request, global_params=global_params)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 701, in _RunMethod
    http, http_request, **opts)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/apitools/apitools/base/py/http_wrapper.py", line 351, in MakeRequest
    max_retry_wait, total_wait_sec))
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/gslib/util.py", line 1719, in WarnAfterManyRetriesHandler
    http_wrapper.HandleExceptionsAndRebuildHttpConnections(retry_args)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/apitools/apitools/base/py/http_wrapper.py", line 341, in MakeRequest
    check_response_func=check_response_func)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/apitools/apitools/base/py/http_wrapper.py", line 391, in _MakeRequestNoRetry
    redirections=redirections, connection_type=connection_type)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/oauth2client/oauth2client/client.py", line 616, in new_request
    self._refresh(request_orig)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/oauth2client/oauth2client/client.py", line 885, in _refresh
    self._do_refresh_request(http_request)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/oauth2client/oauth2client/client.py", line 939, in _do_refresh_request
    self.store.locked_put(self)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/oauth2client/oauth2client/contrib/multistore_file.py", line 271, in locked_put
    self._multistore._update_credential(self._key, credentials)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/oauth2client/oauth2client/contrib/multistore_file.py", line 475, in _update_credential
    self._write()
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/oauth2client/oauth2client/contrib/multistore_file.py", line 441, in _write
    self._locked_json_write(raw_data)
  File "/mnt/home/mansourt/miniconda3/envs/deepVar/share/google-cloud-sdk-166.0.0-0/platform/gsutil/third_party/oauth2client/oauth2client/contrib/multistore_file.py", line 367, in _locked_json_write
    self._file.file_handle().truncate()
IOError: [Errno 9] Bad file descriptor

chapmanb commented 5 years ago

Tamer; Thanks for the report and apologies about the install issue. This looks like you might be trying to install inside a Miniconda environment running Python 3. Is that a possibility? Right now, gsutil and DeepVariant are only compatible with Python 2.7, so using Miniconda 2 (https://repo.anaconda.com/miniconda/Miniconda2-latest-Linux-x86_64.sh) or a python=2 environment inside your existing conda install will hopefully resolve the problem.
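A minimal sketch of how to check which Python major version the active environment would run. The helper name `parse_py_major` is made up for illustration and is not part of conda or DeepVariant; the sketch only assumes that `python --version` prints a line starting with `Python X.Y`.

```shell
# parse_py_major is a hypothetical helper: it extracts the major version
# number from the text printed by `python --version`.
parse_py_major() {
  printf '%s\n' "$1" | sed -n 's/^Python \([0-9][0-9]*\)\..*/\1/p'
}

# Python 2 prints its version on stderr and Python 3 on stdout, so capture both.
major="$(parse_py_major "$(python --version 2>&1)")"
if [ "$major" = "2" ]; then
  echo "python on PATH is Python 2"
else
  echo "python on PATH is not Python 2 (got: ${major:-none})"
fi
```

Run it after activating the environment, so that `python` resolves to the interpreter inside that environment rather than to the base install.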

drtamermansour commented 5 years ago

Thanks for the response but actually no. I am using py2.7

python --version
Python 2.7.15

gunjanbaid commented 5 years ago

Hi Tamer, I tried to replicate this issue, and here is what I found. I created an instance from this CentOS7 VM. I chose the default location for all installations. When asked if I wanted to update my PATH during installations, I chose to do so. I was able to install DeepVariant through Bioconda using the below steps.

I ran into a particular error with gsutil. After running source ~/.bashrc, I saw an error when I ran gsutil. gsutil is used by the DeepVariant installation, so that failed as well. To address this, I referenced this post and ran export BOTO_CONFIG=/dev/null before installing DeepVariant again.

Running these commands in order allows me to successfully install on the VM.

# install gsutil
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
# verify that gsutil is working
gsutil

# install wget and bzip2, which are both needed to download miniconda
sudo yum install bzip2 wget
wget https://repo.anaconda.com/miniconda/Miniconda2-latest-Linux-x86_64.sh
bash Miniconda2-latest-Linux-x86_64.sh 
source ~/.bashrc

# gsutil is failing now
gsutil
export BOTO_CONFIG=/dev/null
# gsutil should be working again
gsutil

# create new conda env, add channels, install deepvariant
conda create -n dv python=2.7
conda activate dv
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda install -n dv deepvariant -v
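The `BOTO_CONFIG=/dev/null` workaround in the commands above can also be applied to a single command instead of being exported for the whole session. This is only a sketch: `with_empty_boto` is a hypothetical wrapper name, and it assumes the conda post-link script inherits the environment of the `conda` process, as it appears to do with the exported variable.

```shell
# with_empty_boto is a hypothetical wrapper: it runs one command with
# BOTO_CONFIG pointing at /dev/null and leaves the calling shell untouched.
with_empty_boto() {
  env BOTO_CONFIG=/dev/null "$@"
}

# Harmless demonstration that the variable reaches the child process:
with_empty_boto sh -c 'echo "child sees BOTO_CONFIG=$BOTO_CONFIG"'

# Intended use, left commented out because it needs gsutil and conda:
#   with_empty_boto gsutil version
#   with_empty_boto conda install -n dv deepvariant -v
```

Scoping the variable this way avoids hiding a real boto configuration from other gsutil invocations in the same shell.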

In the output from running conda install -n dv deepvariant -v, I see the first error you posted even with a successful installation. I was not able to replicate the second error. Some sanity checks for you:

CC @melkerdawy

drtamermansour commented 5 years ago

Thank you for the response. I reordered the conda channels to match yours and installed gsutil, but through conda, so I did not need to edit the PATH. This is the set of commands I used:

conda create -n deepvariant python=2.7
source activate deepvariant
conda install -c conda-forge google-cloud-sdk
conda install -v -y deepvariant &> deepvariant_insatll.log

I got a successful installation in spite of the first error message, just like you. However, running the code produces another error:

python $HOME/miniconda3/envs/deepVar/share/deepvariant-0.7.2-1/binaries/DeepVariant/0.7.2/DeepVariant-0.7.2+cl-225213413/make_examples.zip \
 --mode training  --reads "${BAM}"  --ref "${REF}"  --examples "$training.tfrecord.gz" \
 --truth_variants "${TRUTH_VCF}"  --confident_regions "${TRUTH_BED}" \
 --exclude_regions "chr20:14000000-15000000"  --sample_name "train" 
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_4i44qy/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 38, in <module>
    import tensorflow as tf
  File "/mnt/home/mansourt/miniconda3/envs/deepvariant/lib/python2.7/site-packages/tensorflow/__init__.py", line 30, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/mnt/home/mansourt/miniconda3/envs/deepvariant/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/mnt/home/mansourt/miniconda3/envs/deepvariant/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/mnt/home/mansourt/miniconda3/envs/deepvariant/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/mnt/home/mansourt/miniconda3/envs/deepvariant/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/mnt/home/mansourt/miniconda3/envs/deepvariant/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: /opt/software/GCCcore/6.4.0/lib64/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /mnt/home/mansourt/miniconda3/envs/deepvariant/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)
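The error above says that the libstdc++.so.6 under /opt/software/GCCcore/6.4.0/lib64 lacks the `CXXABI_1.3.11` version tag that the TensorFlow extension requires. A minimal sketch for checking which tags a given library file carries follows. `lib_has_version` is a made-up helper name, and the scan is a plain text search over the file rather than a real ELF parse, so treat the result as a hint only.

```shell
# lib_has_version is a hypothetical helper for illustration.
# Usage: lib_has_version /path/to/libstdc++.so.6 CXXABI_1.3.11
lib_has_version() {
  # -a reads the binary as text and -o prints only the matched version tags;
  # the second grep then requires an exact whole-line match, so that
  # CXXABI_1.3.1 does not match CXXABI_1.3.11.
  LC_ALL=C grep -a -o '[A-Z]*_[0-9][0-9.]*' "$1" | grep -x -F "$2" >/dev/null
}

lib=/opt/software/GCCcore/6.4.0/lib64/libstdc++.so.6   # path from the error above
if [ -r "$lib" ]; then
  if lib_has_version "$lib" CXXABI_1.3.11; then
    echo "$lib provides CXXABI_1.3.11"
  else
    echo "$lib does not provide CXXABI_1.3.11"
  fi
fi
```

Where binutils is installed, `strings "$lib" | grep CXXABI` lists the same tags by eye.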

I loaded a newer GNU toolchain module and re-ran, but I got another error:

module load GNU/7.3.0-2.30

python $HOME/miniconda3/envs/deepVar/share/deepvariant-0.7.2-1/binaries/DeepVariant/0.7.2/DeepVariant-0.7.2+cl-225213413/make_examples.zip \
 --mode training  --reads "${BAM}"  --ref "${REF}"  --examples "$training.tfrecord.gz" \
 --truth_variants "${TRUTH_VCF}"  --confident_regions "${TRUTH_BED}" \
 --exclude_regions "chr20:14000000-15000000"  --sample_name "train" 
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_FlJ2h7/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 41, in <module>
    from deepvariant import pileup_image
  File "/tmp/Bazel.runfiles_FlJ2h7/runfiles/com_google_deepvariant/deepvariant/pileup_image.py", line 42, in <module>
    from third_party.nucleus.util import ranges
  File "/tmp/Bazel.runfiles_FlJ2h7/runfiles/com_google_deepvariant/third_party/nucleus/util/ranges.py", line 42, in <module>
    from third_party.nucleus.io import bed
  File "/tmp/Bazel.runfiles_FlJ2h7/runfiles/com_google_deepvariant/third_party/nucleus/io/bed.py", line 79, in <module>
    from third_party.nucleus.io.python import bed_reader
ImportError: /lib64/libm.so.6: version `GLIBC_2.23' not found (required by /tmp/Bazel.runfiles_FlJ2h7/runfiles/com_google_deepvariant/third_party/nucleus/io/python/../../../../_solib_k8/libexternal_Shtslib_Slibhtslib.so)

I tried to install a local GLIBC. I tried v2.25 and v2.28 using conda, but the installation failed. Any suggestions to move forward?
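To see how far the system glibc is from the `GLIBC_2.23` that the error message asks for, a small comparison like the following can help. It is a sketch: `version_ge` is a hypothetical helper, it relies on `sort -V` (available in GNU coreutils), and `getconf GNU_LIBC_VERSION` only answers on glibc-based systems.

```shell
# version_ge A B: succeed if version A is greater than or equal to version B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=2.23   # taken from the GLIBC_2.23 error message above
# getconf prints something like "glibc 2.17"; keep only the number.
have="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}' || true)"
if [ -n "$have" ]; then
  if version_ge "$have" "$required"; then
    echo "system glibc $have satisfies $required"
  else
    echo "system glibc $have is older than $required"
  fi
fi
```

A stock CentOS 7 system ships glibc 2.17, so the second branch is the expected outcome there.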

pgrosu commented 5 years ago

@drtamermansour Why don't you download glibc 2.23 from here:

https://ftp.gnu.org/gnu/glibc/

Then either inline LD_LIBRARY_PATH or export it with the location of glibc 2.23 being one of the first libraries locations it searches for. Then try rerunning the program.

Hope it helps, ~p

drtamermansour commented 5 years ago

@pgrosu I could not compile the library on my server

I followed the suggestion here. I added CFLAGS="-O2" to address an optimization request error, but the make command still fails to compile:

mkdir glibc && cd glibc
wget https://ftp.gnu.org/gnu/glibc/glibc-2.23.tar.gz
tar xvzf glibc-2.23.tar.gz
mkdir glibc-build && cd glibc-build
mkdir ../install
../glibc-2.23/configure CFLAGS="-O2" --prefix $HOME/glibc/install
make -j `nproc`

pgrosu commented 5 years ago

Right, so now update your LD_LIBRARY_PATH with this version closer to the beginning of it. Make sure you echo it first to see what it's set to via echo $LD_LIBRARY_PATH as you might want to include those things as well. Study the following two links for more information:

https://www.tecmint.com/understanding-shared-libraries-in-linux/
https://docs.oracle.com/cd/E19455-01/816-0559/chapter2-48927/index.html

LD_LIBRARY_PATH is a string of colon-separated paths that a program will search (from left-to-right) through for the libraries it needs. You don't have to export it if you want to test things like this:

LD_LIBRARY_PATH=....(your paths)... python $HOME/miniconda3/envs/deepVar...
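A minimal sketch of building such a value safely. `prepend_dir` is a hypothetical helper and `$HOME/local/lib` is a placeholder directory, not a path from this thread. The helper avoids leaving a stray `:` when the existing list is empty, because the dynamic loader treats an empty entry as the current directory.

```shell
# prepend_dir DIR LIST: print DIR followed by the existing colon-separated LIST.
prepend_dir() {
  if [ -n "$2" ]; then
    printf '%s:%s\n' "$1" "$2"
  else
    printf '%s\n' "$1"
  fi
}

# Placeholder for the directory whose libraries should be found first.
local_libs="$HOME/local/lib"
new_path="$(prepend_dir "$local_libs" "${LD_LIBRARY_PATH:-}")"
echo "Would run with LD_LIBRARY_PATH=$new_path"

# To apply it to a single command only (nothing is exported):
#   LD_LIBRARY_PATH="$new_path" python ...
```

Because the search runs left to right, the prepended directory wins over any directory already in the list that holds a library with the same name.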

drtamermansour commented 5 years ago

@pgrosu I am sorry if I was not clear enough. I meant that I could not build glibc locally on my system. I have contacted the system administrators to see what they can do.

pgrosu commented 5 years ago

@drtamermansour You don't have to install glibc, you just need to compile it in a local directory. Basically you just need to run ./configure and make without running make install. Then just update LD_LIBRARY_PATH to include the local directory of the compiled glibc .so file.

gunjanbaid commented 5 years ago

@drtamermansour Were you able to resolve the problem? I'll close this issue for now, but feel free to reopen if you have any other questions!

frapaport commented 5 years ago

Hi everyone,

I am trying to install the DeepVariant bioconda package on Red Hat Enterprise Linux Server 7.2. I am really not familiar with conda, but this looks like the most straightforward way to run DeepVariant on a machine for which I do not (and will never) have sudo privileges.

The above discussion helped me get past a lot of kinks, but I am still struggling.

I am having the following two problems:

Do you have any idea how to solve this? Any help would be greatly appreciated.

Thanks !

pichuan commented 5 years ago

Hi @frapaport, I would like to understand better how we can help users who don't have root permission. One question for you: given that you don't have root permission on your machine, I assume that using our pre-built binaries is also not possible, is that right? (They require running run-prereq.sh, which currently uses sudo to install a bunch of stuff.)

Other than bioconda, are there other common ways to run or install software that you think work well? For example, several of our users mentioned Singularity. Have you used that, and would you consider it?

(Currently I'm personally not very familiar with either bioconda or singularity. I'm trying to get a better understanding of what will be more generally useful for users who don't have root permission.) If you have any suggestions, please let me know!

I'll come back to your questions later as well. Might take me a while to try this again.

frapaport commented 5 years ago

Hi @pichuan ,

I am actually not familiar with Singularity. I checked the docs, and my understanding is that the installation requires root privileges but, according to a quick forum search, there are some ways to get around it.

Most of the software I have installed is either available through R (distributed in bioconductor) or distributed as a JAR archive. I have had to run a few makefiles but I think deepvariant is the most complex installation I took care of myself (without the help of a sysadmin) in a long time.

Thanks for your help !

pichuan commented 5 years ago

@frapaport an update on Singularity: I've tested our latest setup (which will come out in the next release) by converting it into a Singularity image. It seems to work fine for me. So, if you are able to install Singularity, that will be an easier way forward once our next release is out. I'll still come back and revisit the usability of our bioconda installation, but it might take a while.

kokyriakidis commented 5 years ago

Please make sure you have a GPU-enabled Singularity image! :)

frapaport commented 5 years ago

Thanks @pichuan , I will wait for the Singularity image then. If Singularity works then I will most likely not need the bioconda install.

hmyh1202 commented 5 years ago

The main software is too slow to download.

pichuan commented 5 years ago

Hi @frapaport, I added some notes about Singularity in https://github.com/google/deepvariant/issues/132#issuecomment-482430728. I'm still figuring out the best way to distribute images. I do have the image files that I built. If it's useful to share those files, let me know. @kokyriakidis I have an example for a GPU run in that comment as well.

And, addressing the original topic about bioconda, I'll get in touch with @chapmanb to see how to update the version to 0.8.0 and I'll also see if I can try it out more myself as well.

pichuan commented 5 years ago

This one seems a bit outdated. It seems like 0.8.0 is out and some issues were resolved already. I'll close this again...