sunbeam-labs / sunbeam

A robust, extensible metagenomics pipeline
http://sunbeam.readthedocs.io

rbt: error while loading shared libraries: libcblas.so.3: cannot open shared object file #200

Closed: guanxiangliang closed this issue 5 years ago

guanxiangliang commented 5 years ago

I had this problem today with a brand new installation of Sunbeam without installing anything else:

[Tue Apr 16 10:37:53 2019] rbt: error while loading shared libraries: libcblas.so.3: cannot open shared object file: No such file or directory
[Tue Apr 16 10:37:53 2019] Error in rule remove_low_complexity:

@louiejtaylor and I tried conda install --force-reinstall rust-bio-tools -c bioconda, but that did not fix it.

However, conda install --force-reinstall libcblas -c conda-forge did fix it.

louiejtaylor commented 5 years ago

@ArwaAbbas had the same problem; both of these are on PMACS. I can't reproduce it myself.

ressy commented 5 years ago

Possibly you were missing libcblas and installing it directly fixed the problem, but it should have been brought in automatically to begin with, especially on a new install, where I would think everything should start up to date. Do you have anything installed directly into your root conda environment (just run conda list without being in an activated environment)?

Checking the latest package metadata now, rust-bio-tools depends on gsl and gsl depends on libcblas, but the rust-bio-tools recipe has changed a bit in recent weeks, so maybe there was a lingering mismatch like we saw for openssl.

EDIT: gsl only depends on libcblas in the host list, not run; maybe this is part of the trouble.

@guanxiangliang and/or @ArwaAbbas, are you still having trouble? I'd be interested to catch this in the act, if so.

guanxiangliang commented 5 years ago

No problem for me so far. The only thing I did with conda before installing Sunbeam was conda install git. Here is the conda list:

[Screenshot of conda list output, 2019-04-17]

louiejtaylor commented 5 years ago

PMACS doesn't have git installed... I know Arwa also conda install'd both git and wget, if that is relevant!

louiejtaylor commented 5 years ago

@ressy reproduced! On a "miniconda-naive" machine (microb191 after blowing away my sunbeam/miniconda3 dirs):

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# restart terminal
git clone https://github.com/sunbeam-labs/sunbeam
cd sunbeam; ./install.sh
conda activate sunbeam
sunbeam init try_to_break_sunbeam --data_acc SRR1913936
sunbeam run -p --configfile try_to_break_sunbeam/sunbeam_config.yml all_decontam

The conda install git was not important, as the error persists without it. I am currently testing to see whether the problem is still present with the Sunbeam-installed Miniconda (I don't really see why it wouldn't).

SRR1913936 is a tiny sample with 11 paired reads that I like to use for testing. Also, for these tests, I've just been removing the sunbeam and miniconda dirs, but the conclusions of this might be confounded if there are other places where packages are cached.

Edit: Same problem without a pre-installed miniconda3. Reproducible with:

git clone https://github.com/sunbeam-labs/sunbeam
cd sunbeam; ./install.sh
# restart terminal
conda activate sunbeam
sunbeam init try_to_break_sunbeam --data_acc SRR1913936
sunbeam run -p --configfile try_to_break_sunbeam/sunbeam_config.yml all_decontam

Edit 2:

Weirdly, this is not fixed by adding libcblas to environment.yml.

Edit 3:

I should read more closely. The error for each of these attempts is actually not the same as the error Guanxiang and Arwa were getting, although it is also at the komplexity step. Mine is:

rbt: error while loading shared libraries: libopenblas.so.0: cannot open shared object file: No such file or directory

Installing with libcblas and libopenblas in environment.yml fixes this issue (also works fine with libopenblas only). Adding stuff to the environment is an easy fix, but it seems like there's an upstream problem with rust-bio-tools that is causing this. Thoughts @ressy/@eclarke?

ressy commented 5 years ago

@louiejtaylor yep I can confirm the missing libopenblas library too, with a fresh install of everything in an isolated environment. I'll do some more digging. Thanks for spotting that.

ressy commented 5 years ago

I think this is another case of inconsistent dependencies.

In my test environment (where the whole conda setup is inside the working directory) I can see that libcblas and libopenblas are not handled via the environment when rust-bio-tools is installed. On this one particular server we happen to have a libcblas file from the OS but not libopenblas so we're just seeing the one problem here, but it looks like both could be an issue, I think.

$ ldd $(which rbt) | grep -v $(pwd)
    linux-vdso.so.1 =>  (0x00007ffcfe6d3000)
    libcblas.so.3 => /usr/lib/libcblas.so.3 (0x00007f9096837000)
    libopenblas.so.0 => not found
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f909652e000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9096164000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f9097634000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9095f60000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f9095d58000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9095b3b000)
    libatlas.so.3 => /usr/lib/libatlas.so.3 (0x00007f909559d000)

It's flipped on a PMACS HPC compute node which I think explains the difference between your and Guanxiang's experience:

    ...
    libcblas.so.3 => not found
    libopenblas.so.0 => /lib64/libopenblas.so.0 (0x00002ae9c6e0d000)
    ...

Possibly this relates to the "host" versus "run" dependency lists in the recipe YAML files for both rust-bio-tools and gsl, since they don't include any blas stuff for "run". I'm still a bit fuzzy on how the build/host/run split (edit: see also Anaconda's info) is supposed to work, but if those are libraries needed to run the compiled executable, they should be included on rust-bio-tools directly under the run list, right? More digging needed, I suppose!
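
For anyone repeating the ldd check above, here is a minimal sketch of a filter that lists only the libraries the loader could not resolve. The function name missing_libs is hypothetical (it is not part of Sunbeam or rust-bio-tools), and it assumes the usual glibc ldd output format where unresolved entries read "name => not found".

```shell
# Hypothetical helper (not part of Sunbeam): read `ldd` output on stdin and
# print the name of each shared library that the loader could not resolve.
# Assumes unresolved entries look like "libfoo.so.N => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use inside an activated environment:
#   ldd "$(command -v rbt)" | missing_libs
```

An empty result means every library was resolved somewhere, though not necessarily from inside the conda environment, so the grep -v "$(pwd)" trick above is still worth running.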

ressy commented 5 years ago

I tried creating a really minimal example with just a rust-bio-tools install in an isolated environment, and whether it works depends on what channels are enabled and exactly which conda command is used.

With conda-forge and bioconda both enabled it actually does fine, pulling in the libraries via conda-forge. With just bioconda it lets you install a broken rbt binary since it leaves out the libraries it would otherwise get from conda-forge. With all four channels we have explicitly listed at install time enabled it also works, oddly enough. And then if I use the environment file approach the Sunbeam installer uses with conda env update, which I would think should be the same as listing all channels in a conda install or conda create, it breaks.

So I have two separate points to figure out for rust-bio-tools:

louiejtaylor commented 5 years ago

Thank you for tracking this down! Does reordering the conda channels in environment.yml as in #181 have any effect?

ressy commented 5 years ago

Oh yeah, good point, since we're on the topic of channels. I don't actually have a ~/.condarc in this case, though (with $HOME as a temporary directory), so it's starting from complete defaults. Sunbeam provides its own list in the .yml file, and for other commands I'm entering -c channel1 -c channel2 etc. to test... but no, it doesn't seem to make a difference when I swap them around here.

Here's an interesting comparison with two different environment.yml files but the same channels and order defined.

Running conda env update --name rbt-test --file environment.yml with this file works:

channels:
  - bioconda
  - conda-forge
dependencies:
  - rust-bio-tools

And this one breaks:

channels:
  - bioconda
  - conda-forge
dependencies:
  - biopython
  - rust-bio-tools

So what the heck happens to my dependency resolution when biopython is added in? I should have more time to figure that out this afternoon I think.

(Edit: Also, I tested this by removing biopython from the Sunbeam environment.yml but leaving everything else, just to confirm that it's the presence/absence of the biopython entry. Same deal.)

ressy commented 5 years ago

Oh wait you said environment.yml, not my conda config, sorry. I misread that earlier. ...and you're spot on, that makes it work!

I used the channel order given in the bioconda channel setup instructions and equivalently what you described in #181, where these commands:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

... make a YAML in the .condarc like:

channels:
  - conda-forge
  - bioconda
  - defaults

... which we can use to update the environment.yml as @louiejtaylor suggested above.

So this is really just a rehash of the earlier channel order problem. I think we need to reorder the channels in our own YAML in the repository to fix this, not just ~/.condarc.

I'm trying this out with our full set of channels and packages now to confirm for Sunbeam. Any thoughts? I'm still confused as to why channel priority can break dependency resolution so badly, even when packages are unambiguously in one channel and not others. The conda documentation on this seems to match my own understanding, where the priorities just come into play when a package could be found in multiple channels.

ressy commented 5 years ago

Well, I think we fixed one problem and uncovered another. The checksum on sunbeam_output/mapping/human/coverage.csv in the mapping test changed, I'm thinking because something in the floating point columns shifted a little with the latest package set. In retrospect, using a checksum on those files was overly restrictive, since we can't easily guarantee they'll be identical byte-by-byte. The bam file outputs still look like I expect, so I'm going to replace the checksum test with a simpler test on the sample column. So far so good.

https://circleci.com/gh/sunbeam-labs/sunbeam/149
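
A test on the sample column instead of a checksum could look roughly like the sketch below. This is not the actual Sunbeam test: the function name check_sample_column is hypothetical, and the sketch assumes the CSV is comma-separated, has one header row, and carries the sample name in its first column.

```shell
# Hypothetical check (not the real Sunbeam test): succeed only if every data
# row of a CSV file has the expected sample name in its first column.
# Assumptions: comma-separated, one header row, sample name in column 1.
check_sample_column() {
    csv_file=$1
    expected=$2
    # Skip the header, take column 1, and collapse to the distinct values.
    samples=$(tail -n +2 "$csv_file" | cut -d, -f1 | sort -u)
    # Passes only when exactly one distinct value is present and it matches.
    [ "$samples" = "$expected" ]
}

# Example use (the sample name "stub" is a placeholder):
#   check_sample_column sunbeam_output/mapping/human/coverage.csv stub
```

Unlike a checksum, this stays stable when floating point columns shift slightly between package versions, at the cost of not checking those values at all.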

louiejtaylor commented 5 years ago

Thanks for tracking this down, Jesse!

ressy commented 5 years ago

Thank you both!

On a side note, I think I finally understand the full implications of the channel priority issue. In this case you get gsl=2.4 installed either way, but depending on channel priority it comes from either defaults or conda-forge, and only one of those two gsl builds brings along the blas libraries that rust-bio-tools actually needs. The gsl package name and version are the same either way, so only channel priority determines which indirect dependencies are pulled along.

So I do think we're now doing the right thing by following the bioconda instructions for channel priority in our YAML. Just to muddy the waters a little bit more, an Anaconda employee suggests defaults above everything else in ~/.condarc so you don't accidentally, say, install conda itself from conda-forge instead of defaults. But that seems like a separate issue from what we're dealing with at environment-setup-time here.
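
To see which channel a given package actually came from in an environment, something like the following sketch may help. The function name pkg_channel is hypothetical. It assumes the four-column layout (name, version, build, channel) that conda list prints with --show-channel-urls, and the exact text in the channel column can vary between conda versions.

```shell
# Hypothetical helper: read `conda list --show-channel-urls` output on stdin
# and print the channel column for one exact package name.
# Assumes whitespace-separated columns: name, version, build, channel.
pkg_channel() {
    awk -v pkg="$1" '$1 == pkg { print $4 }'
}

# Example use (environment names are the ones from this thread):
#   conda list -n rbt-test-broken --show-channel-urls | pkg_channel gsl
#   conda list -n rbt-test-working --show-channel-urls | pkg_channel gsl
```

If the two environments report different channels for the same gsl version, that matches the explanation above.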

One last side note: conda-depgraph is helping me keep my sanity with these kinds of things. I put it in its own environment and can then interrogate packages in any other environment to get an ASCII art dependency graph. It's very slick.

$ conda depgraph -n rbt-test-broken inout gsl
 ┌──────────────┐
 │rust-bio-tools│
 └──────┬───────┘
        │
        v
      ┌───┐
      │gsl│
      └─┬─┘
        │
        v
   ┌─────────┐
   │libgcc-ng│
   └─────────┘
$ conda depgraph -n rbt-test-working inout gsl
     ┌──────────────┐
     │rust-bio-tools│
     └──────┬───────┘
            │
            v
        ┌───────┐
        │  gsl  │
        └─┬─┬─┬─┘
          │ │ │
          │ └─┼────┐
          │   │    └──┐
          v   │       │
   ┌────────┐ │       │
   │libcblas│ │       │
   └────┬───┘ │       │
        │  ┌──┘       │
        │  │          │
        v  v          │
     ┌───────┐        │
     │libblas│        │
     └───┬───┘        │
         │            │
         v            │
    ┌────────┐        │
    │openblas│        │
    └──┬──┬──┘        │
       │  │           │
       │  └────────┐  │
       │           │  │
       v           v  v
 ┌───────────┐ ┌─────────┐
 │libgfortran│ │libgcc-ng│
 └───────────┘ └─────────┘