bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Samtools error when installing bcbio genome data in bcbio: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory #3557

Open amizeranschi opened 3 years ago

amizeranschi commented 3 years ago

Version info

To Reproduce

Exact bcbio command you have used:

bcbio_nextgen.py upgrade -u skip --genomes sacCer3 --aligners bwa --aligners bowtie2 --aligners hisat2 --aligners star

This results in the following:

Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa.gz  100%[=====================================================================================================================>]   3,61M  3,76MB/s    in 1,0s    

2021-11-14 20:56:19 (3,76 MB/s) - written to stdout [3786555]

Sorted contigs saved to "seq/sacCer3.fa" ... 
samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "/home/user/bcbio/anaconda/bin/bcbio_nextgen.py", line 228, in <module>
    install.upgrade_bcbio(kwargs["args"])
  File "/home/user/bcbio/anaconda/lib/python3.7/site-packages/bcbio/install.py", line 107, in upgrade_bcbio
    upgrade_bcbio_data(args, REMOTES)
  File "/home/user/bcbio/anaconda/lib/python3.7/site-packages/bcbio/install.py", line 359, in upgrade_bcbio_data
    args.cores, ["ggd", "s3", "raw"])
  File "/home/user/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 354, in install_data_local
    _prep_genomes(env, genomes, genome_indexes, ready_approaches, data_filedir)
  File "/home/user/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 480, in _prep_genomes
    retrieve_fn(env, manager, gid, idx)
  File "/home/user/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 875, in _install_with_ggd
    ggd.install_recipe(os.getcwd(), env.system_install, recipe_file, gid)
  File "/home/user/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 30, in install_recipe
    recipe["recipe"]["full"]["recipe_type"], system_install)
  File "/home/user/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 62, in _run_recipe
    subprocess.check_output(["bash", run_file])
  File "/home/user/bcbio/anaconda/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/home/user/bcbio/anaconda/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['bash', '/home/user/bcbio/genomes/Scerevisiae/sacCer3/txtmp/ggd-run.sh']' returned non-zero exit status 127.

This suggests an older version of samtools is being used at some point. There are multiple versions of samtools installed in bcbio:

find bcbio -name samtools
bcbio/anaconda/pkgs/manta-1.6.0-h9ee0642_1/share/manta-1.6.0-1/libexec/samtools
bcbio/anaconda/pkgs/pysam-0.15.3-py27hbcae180_3/lib/python2.7/site-packages/pysam/include/samtools
bcbio/anaconda/pkgs/samtools-1.10-h2e538c0_3/bin/samtools
bcbio/anaconda/pkgs/pysam-0.15.4-py36h873a209_1/lib/python3.6/site-packages/pysam/include/samtools
bcbio/anaconda/pkgs/samtools-1.7-1/bin/samtools
bcbio/anaconda/pkgs/pysam-0.17.0-py37h45aed0b_0/lib/python3.7/site-packages/pysam/include/samtools
bcbio/anaconda/pkgs/multiqc-1.11-pyhdfd78af_0/site-packages/multiqc/modules/samtools
bcbio/anaconda/pkgs/samtools-0.1.19-h270b39a_9/bin/samtools
bcbio/anaconda/pkgs/samtools-1.13-h8c37831_0/bin/samtools
bcbio/anaconda/pkgs/samtools-1.12-h9aed4be_1/bin/samtools
bcbio/anaconda/pkgs/strelka-2.9.10-h9ee0642_1/share/strelka-2.9.10-1/libexec/samtools
bcbio/anaconda/pkgs/bioconductor-rsamtools-1.34.0-r351hf484d3e_0/lib/R/library/Rsamtools/include/samtools
bcbio/anaconda/lib/python3.7/site-packages/pysam/include/samtools
bcbio/anaconda/lib/python3.7/site-packages/multiqc/modules/samtools
bcbio/anaconda/bin/samtools
bcbio/anaconda/envs/htslib1.10/bin/samtools
bcbio/anaconda/envs/python2/share/manta-1.6.0-1/libexec/samtools
bcbio/anaconda/envs/python2/share/strelka-2.9.10-1/libexec/samtools
bcbio/anaconda/envs/python2/lib/python2.7/site-packages/pysam/include/samtools
bcbio/anaconda/envs/python2/bin/samtools
bcbio/anaconda/envs/bwakit/bin/samtools
bcbio/anaconda/envs/python3.6/lib/python3.6/site-packages/pysam/include/samtools
bcbio/anaconda/envs/python3.6/bin/samtools
bcbio/anaconda/envs/htslib1.12_py3.9/bin/samtools
bcbio/anaconda/envs/r35/lib/R/library/Rsamtools/include/samtools
bcbio/anaconda/envs/r35/bin/samtools
bcbio/anaconda/envs/samtools0/bin/samtools
bcbio/tools/bin/samtools

My PATH is configured like this:

export PATH=/home/user/bcbio/anaconda/bin:/home/user/bcbio/tools/bin:$PATH

And the main version of Samtools on my system, given the above, seems to be v1.13:

$ which samtools
/home/user/bcbio/anaconda/bin/samtools
$ samtools --version
samtools 1.13
Using htslib 1.13
Copyright (C) 2021 Genome Research Ltd.

There also seems to be an older version of Samtools getting symlinked in my bcbio tools directory. My guess is that this is probably being picked up somehow when installing the genome data.

$ /home/user/bcbio/tools/bin/samtools
/home/user/bcbio/tools/bin/samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
$ ls -l /home/user/bcbio/tools/bin/samtools
lrwxrwxrwx 1 user user 42 nov 14 20:04 /home/user/bcbio/tools/bin/samtools -> ../../anaconda/envs/python3.6/bin/samtools

Removing this symlink from the tools directory got things working for me and I managed to install the genome data using the command mentioned above.
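The manual check above (find every samtools, see where each symlink points, run it) can be sketched in Python. This is only an illustration, not part of bcbio: the function names `find_on_path` and `probe` are hypothetical, and the probe assumes that a broken binary prints the same "error while loading shared libraries" message seen in the traceback.

```python
import os
import subprocess

# Message printed by the dynamic loader when a shared library is missing.
LOADER_ERROR = "error while loading shared libraries"


def find_on_path(name, path=None):
    """Return every executable called `name` on PATH, in lookup order."""
    path = os.environ.get("PATH", "") if path is None else path
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # ignore empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


def probe(executable):
    """Run the executable without arguments and report loader failures."""
    result = subprocess.run(
        [executable],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return {
        "path": executable,
        "resolved": os.path.realpath(executable),  # follows symlinks
        "returncode": result.returncode,
        "loader_error": LOADER_ERROR in result.stderr,
    }
```

Calling `[probe(p) for p in find_on_path("samtools")]` lists each samtools in PATH order, the file its symlink resolves to, and whether it fails to load. With the PATH shown above, a `loader_error` of `True` would be expected for the entry under tools/bin.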

naumenko-sa commented 3 years ago

Hi @amizeranschi !

Thanks for the detailed report, and sorry about the issues! Unfortunately, this last update (https://github.com/chapmanb/cloudbiolinux/pull/387) can't be applied by bcbio_nextgen.py upgrade -u skip. You'd be better off re-installing the code and tools from scratch and linking the data: https://bcbio-nextgen.readthedocs.io/en/latest/contents/intro.html#install-bcbio-python-package-and-tools

There might be samtools issues as well, since we now have multiple htslib environments. Please update here if you still see the error in the new installation.

Sergey

amizeranschi commented 3 years ago

Hi Sergey,

Thanks for the reply. Everything I reported here was based on a fresh install of bcbio, completely from scratch, on a new machine. This was done on the 14th of November and I followed the documentation you linked above.

The update you mentioned seems to be from the 28th of October, so I'm guessing that my install should have used the updated code. Here's what I did:

cd ${HOME}
wget https://raw.githubusercontent.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py
python3 bcbio_nextgen_install.py ${HOME}/bcbio --tooldir=${HOME}/bcbio/tools --nodata
## add the locations to the bcbio-nextgen dependency executables to the system's $PATH variable
echo "export PATH=${HOME}/bcbio/anaconda/bin:${HOME}/bcbio/tools/bin:\$PATH" >> ~/.bashrc
source ~/.bashrc
## download genome data for sacCer3
bcbio_nextgen.py upgrade -u skip --genomes sacCer3 --aligners bwa --aligners bowtie2 --aligners hisat2 --aligners star

I can retry the installation within the following days, but is there any reason to believe there should be a different outcome, regarding the Samtools errors I mentioned above?

amizeranschi commented 3 years ago

Update:

As advised, I tried reinstalling Bcbio from scratch, but it didn't make any difference. I am still in the same situation as outlined above.

When I try installing the data, Bcbio tries to use the old samtools version that gets symlinked in the tools/bin directory and runs into the error mentioned in the title.

amizeranschi commented 3 years ago

Update 2:

As mentioned in https://github.com/bcbio/bcbio-nextgen/issues/3561, I've upgraded bcbio to the latest development version. Unlike there, however, this didn't fix the issue mentioned here.

I can hack around it by renaming or removing the samtools symlink from the tools directory, but it might still be good to solve this before the next stable release.

naumenko-sa commented 2 years ago

Hi @amizeranschi !

Thanks for the follow up and for the patience! Yes, indeed this old samtools should not be called. Please unlink it:

cd /home/user/bcbio/tools/bin/
unlink samtools

I've pushed a fix so that it is no longer linked in the first place: https://github.com/chapmanb/cloudbiolinux/pull/393

Let me know if that helps with the installation.

Sergey

amizeranschi commented 2 years ago

Hi @naumenko-sa

Strangely enough, I still seem to be getting this issue. Once again, on a completely fresh bcbio install, from scratch. Upgrading bcbio and tools to development didn't help.

This time, the culprit samtools executable is in bcbio_dir/anaconda/bin:

[a.mizeranschi@haswell-wn30 ~]$ which samtools
~/bcbio-nextgen/anaconda/bin/samtools
[a.mizeranschi@haswell-wn30 ~]$ samtools
samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
[a.mizeranschi@haswell-wn30 ~]$ mamba list | grep samtools
bioconductor-rsamtools    2.10.0            r41h619a076_1    bioconda
perl-bio-samtools         1.43            pl5321h7132678_3    bioconda
samtools                  1.7                           1    bioconda

Any help here would be much appreciated.
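One way to confirm which library a binary is missing, without running it, is to inspect the output of `ldd`, which marks unresolved libraries with "=> not found". The sketch below is illustrative only; `missing_libraries` and `check_binary` are hypothetical names, and it assumes a Linux system with `ldd` available.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # A line looks like: "libcrypto.so.1.0.0 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return missing_libraries(result.stdout)
```

For the samtools 1.7 binary above, `check_binary` would be expected to report libcrypto.so.1.0.0, matching the loader error.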

amizeranschi commented 2 years ago

I tried pinning the samtools version to 1.12, but ran into errors:

[a.mizeranschi@haswell-wn30 ~]$ mamba install -c bioconda samtools=1.12

[...]

Encountered problems while solving:
  - package cyvcf2-0.30.11-py37h31aceb7_2 requires htslib >=1.13,<1.14.0a0, but none of the providers can be installed

However, this seems to work if I request samtools=1.13:

  Package     Version  Build       Channel                    Size
────────────────────────────────────────────────────────────────────
  Upgrade:
────────────────────────────────────────────────────────────────────

  - samtools      1.7  1           installed
  + samtools     1.13  h8c37831_0  bioconda/linux-64        397 KB

  Downgrade:
────────────────────────────────────────────────────────────────────

  - ldc        1.28.1  hcf88599_0  installed
  + ldc        1.20.0  h9a1ace1_1  conda-forge/linux-64      37 MB
  - ncurses       6.3  h9c3ff4c_0  installed
  + ncurses       6.2  h58526e2_4  conda-forge/linux-64     Cached
  - sqlite     3.37.1  h4ff8645_0  installed
  + sqlite     3.37.0  h9cd32fc_0  conda-forge/linux-64     Cached

  Summary:

  Upgrade: 1 packages
  Downgrade: 3 packages

  Total download: 37 MB

────────────────────────────────────────────────────────────────────

Confirm changes: [Y/n]
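
The solver conflict above comes down to a version range check: cyvcf2 requires htslib >=1.13,<1.14.0a0. The sketch below illustrates that check in plain Python. It is a simplification and not how conda or mamba evaluate specs: pre-release suffixes such as "a0" are dropped, and it assumes (as the samtools --version output earlier in the thread suggests) that a samtools release is paired with the htslib of the same number.

```python
import re

# Map each comparison operator to a test on the comparison result (-1, 0, 1).
OPERATORS = {
    ">=": lambda c: c >= 0,
    "<=": lambda c: c <= 0,
    "==": lambda c: c == 0,
    ">": lambda c: c > 0,
    "<": lambda c: c < 0,
}


def parse_version(text):
    """Turn '1.14.0a0' into (1, 14, 0); pre-release suffixes are dropped."""
    numbers = []
    for piece in text.strip().split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def _compare(left, right):
    """Compare two version tuples after padding the shorter one with zeros."""
    width = max(len(left), len(right))
    left = left + (0,) * (width - len(left))
    right = right + (0,) * (width - len(right))
    return (left > right) - (left < right)


def satisfies(version, spec):
    """Check a version against a spec such as '>=1.13,<1.14.0a0'."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        match = re.match(r"\s*(>=|<=|==|>|<)\s*(\S+)", clause)
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        operator, bound = match.groups()
        if not OPERATORS[operator](_compare(candidate, parse_version(bound))):
            return False
    return True
```

Under these assumptions, `satisfies("1.12", ">=1.13,<1.14.0a0")` is false while `satisfies("1.13", ">=1.13,<1.14.0a0")` is true, which is consistent with samtools=1.12 being rejected and samtools=1.13 being accepted.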

mjsteinbaugh commented 2 years ago

I'm seeing this samtools issue as well with the current stable release. Working on debugging.

mjsteinbaugh commented 2 years ago

Yeah it's installing samtools 1.7 on my machine as well. Seeing if pinning to 1.13 works, as you mentioned.

mjsteinbaugh commented 2 years ago

As mentioned in https://github.com/bcbio/bcbio-nextgen/issues/3632, we likely need to double check the pinning of bcftools as well.

mjsteinbaugh commented 2 years ago

This hotfix seems to be working for me so far:

/opt/koopa/opt/bcbio-nextgen/install/anaconda/bin/mamba install \
    --name 'base' \
    'bcftools==1.15' \
    'samtools==1.15'
Transaction

  Prefix: /opt/koopa/app/bcbio-nextgen/1.2.9/install/anaconda

  Updating specs:

   - bcftools==1.15
   - samtools==1.15
   - ca-certificates
   - certifi
   - openssl

  Package          Version  Build           Channel                    Size
─────────────────────────────────────────────────────────────────────────────
  Change:
─────────────────────────────────────────────────────────────────────────────

  - fastp           0.23.2  h79da9fb_0      installed
  + fastp           0.23.2  hb7a2d85_2      bioconda/linux-64          3 MB
  - libtiff          4.3.0  hf544144_1      installed
  + libtiff          4.3.0  h542a066_3      conda-forge/linux-64     Cached
  - r-base           4.1.1  hb67fd72_0      installed
  + r-base           4.1.1  hb93adac_1      conda-forge/linux-64      25 MB
  - staden_io_lib  1.14.14  h7c09d56_1      installed
  + staden_io_lib  1.14.14  h0d9da7e_3      bioconda/linux-64        792 KB

  Upgrade:
─────────────────────────────────────────────────────────────────────────────

  - bcftools          1.13  h3a49de5_0      installed
  + bcftools          1.15  h0ea216a_2      bioconda/linux-64        852 KB
  - cyvcf2         0.30.11  py37h31aceb7_2  installed
  + cyvcf2         0.30.15  py37h6841c58_0  bioconda/linux-64        929 KB
  - gsl                2.6  he838d99_2      installed
  + gsl                2.7  he838d99_0      conda-forge/linux-64     Cached
  - htslib            1.13  h9093b5e_0      installed
  + htslib            1.15  h9753748_0      bioconda/linux-64        Cached
  - lerc             2.2.1  h9c3ff4c_0      installed
  + lerc               3.0  h9c3ff4c_0      conda-forge/linux-64     Cached
  - libdeflate         1.7  h7f98852_5      installed
  + libdeflate        1.10  h7f98852_0      conda-forge/linux-64     Cached
  - pysam           0.17.0  py37h45aed0b_0  installed
  + pysam           0.18.0  py37h8fe4cdf_2  bioconda/linux-64          3 MB
  - samtools           1.7  1               installed
  + samtools          1.15  h1170115_1      bioconda/linux-64        392 KB
  - tk              8.6.10  hbc83047_0      installed
  + tk              8.6.12  h27826a3_0      conda-forge/linux-64     Cached
  - zstd             1.5.0  ha4553b6_1      installed
  + zstd             1.5.2  ha95c52a_0      conda-forge/linux-64     Cached

  Summary:

  Change: 4 packages
  Upgrade: 10 packages

  Total download: 33 MB

mjsteinbaugh commented 2 years ago

Still not working, now seeing this error on an RNA-seq run: samtools sort: failed to read header. Trying to pin against 1.13 instead to see if that works.

See potentially related 1.13 pinning: https://github.com/chapmanb/cloudbiolinux/blob/c7fbc5c15629810a55c2c0df2e9a482410c80205/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L114

mjsteinbaugh commented 2 years ago

Yeah still hitting this error: samtools sort: failed to read header from "/dev/stdin"

Update: this only happens when running STAR

amizeranschi commented 2 years ago

@mjsteinbaugh

FYI, I couldn't reproduce the samtools sort error you mentioned above. I'm testing using the RNA-seq scenario described in my older issue report, here: https://github.com/bcbio/bcbio-nextgen/issues/3565#issuecomment-997212051

My Samtools version is also 1.15.

P.S. Are you any closer to making a new release for r-acidgenomes in bioconda? It would be great to have bcbiornaseq working within bcbio once again.

mjsteinbaugh commented 2 years ago

@amizeranschi working on updating r-bcbiornaseq (https://github.com/bioconda/bioconda-recipes/pull/33652) and r-acidgenomes (https://github.com/bioconda/bioconda-recipes/pull/33651). Track those open pull requests for updates.

I'm going to run the bcbio unit tests and see what's up with the current install.

amizeranschi commented 2 years ago

Alright, thanks for the update, I've tracked those pull requests.

If you do try to run the code I've linked, you will first need sacCer3 genome data available in bcbio. This is pretty small and you can get it via:

## download genome data for sacCer3
bcbio_nextgen.py upgrade -u skip --genomes sacCer3 --aligners bwa --aligners bowtie2 --aligners hisat2 --aligners star

You'd also want to comment out lines 217-220, which enable bcbiornaseq.

naumenko-sa commented 2 years ago

Hi @mjsteinbaugh and @amizeranschi !

Sorry, I dropped the ball.

Interestingly, my 1.2.9 installation from around January received samtools 1.13 in the base env, as needed.

But the fresh one did indeed receive samtools 1.7 in the base env.

It has to be 1.13, not 1.15. I fixed it with mamba install samtools=1.13

It is odd that conda solved it like that, given that bcftools was pinned to 1.13. I pinned samtools as well: https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L113

I would appreciate it if anybody could test this.

Sergey

amizeranschi commented 2 years ago

I've tried another fresh install and now it ends up with bcbio-nextgen 1.2.8 at the end. Here's what I tried:

cd
wget https://raw.githubusercontent.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py
python3 bcbio_nextgen_install.py /home/test/bcbio-test --tooldir=/home/test/bcbio-test/tools --nodata --mamba
export PATH=/home/test/bcbio-test/anaconda/bin:/home/test/bcbio-test/tools/bin:$PATH
bcbio_nextgen.py --version

This reports 1.2.8. Having a look at the output, I see the following:

  Downgrade:
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  - bcbio-nextgen                                                1.2.9  pyh5e36f6f_2            installed                      
  + bcbio-nextgen                                                1.2.8  pyh5e36f6f_0            bioconda/noarch            2 MB

A collaborator, @marianastase0912, had a similar thing happen to her fresh bcbio install.

Upgrading to the latest development version doesn't fix this, either, as the same downgrade happens at some point near the end of the install.

naumenko-sa commented 2 years ago

Thanks for testing @amizeranschi

Yes, it replicates for me as well in the fresh installation.

The workaround for now: mamba install openssl=1.1.1l bcbio-nextgen=1.2.9

I am fixing the bcbio recipe to solve this issue: https://github.com/bioconda/bioconda-recipes/blob/master/recipes/bcbio-nextgen/meta.yaml#L43

I pinned openssl because the new openssl 3.0 shifts a lot of packages. Now that openssl 1.1.1n is released, I think conda downgrades bcbio instead of keeping openssl 1.1.1l.

Sergey

naumenko-sa commented 2 years ago

Fixing it here: https://github.com/bioconda/bioconda-recipes/pull/34746

mjsteinbaugh commented 2 years ago

Awesome thanks @naumenko-sa , happy to help test this out on an AWS instance.

alasfar-lina commented 1 year ago

Yeah still hitting this error: samtools sort: failed to read header from "/dev/stdin"

Update: this only happens when running STAR

@mjsteinbaugh I know that I am late here, but I have had the same problem with STAR. After a lot of "suffering", I discovered that this is a RAM problem. STAR uses a lot of RAM, and if the resources are not managed properly, you will get this exact error.
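
A quick pre-flight check of available memory can help rule this out before launching STAR. The sketch below parses the MemAvailable field of /proc/meminfo on Linux. The function names and the `required_gib` threshold are hypothetical; the right value depends on the genome index and is not stated in this thread.

```python
def mem_available_gib(meminfo_text):
    """Return MemAvailable from /proc/meminfo text in GiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # /proc/meminfo reports the value in kB
            return kib / (1024 ** 2)
    return None


def enough_memory(meminfo_text, required_gib):
    """True if the available memory meets the caller-supplied requirement."""
    available = mem_available_gib(meminfo_text)
    return available is not None and available >= required_gib
```

On a Linux machine this could be used as `enough_memory(open("/proc/meminfo").read(), required_gib)`, with `required_gib` chosen for the genome being aligned.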