guanxiangliang closed 5 years ago
@ArwaAbbas had the same problem--both of these are on PMACs. I can't reproduce it myself.
Possibly you were missing libcblas and then installing it directly fixed the problem, but it should have been brought in automatically to begin with, especially on a new install where I would think everything should start up to date. Do you have anything installed directly into your root conda environment (just do conda list without being in an activated environment)? Checking the latest package metadata now, rust-bio-tools depends on gsl and gsl depends on libcblas, but the rust-bio-tools recipe has changed a bit in recent weeks, so maybe there was a lingering mismatch like we saw for openssl. EDIT: gsl only depends on libcblas for the host list, not run; maybe this is part of the trouble.
@guanxiangliang and/or @ArwaAbbas, are you still having trouble? I'd be interested to catch this in the act, if so.
No problem for me so far.
The only thing I did using conda before Sunbeam is conda install git. Here is the conda list:
PMACs doesn't have git installed... I know Arwa also conda install'd both git and wget, if it is relevant!
@ressy reproduced! On a "miniconda-naive" machine (microb191 after blowing away my sunbeam/miniconda3 dirs):
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# restart terminal
git clone https://github.com/sunbeam-labs/sunbeam
cd sunbeam; ./install.sh
conda activate sunbeam
sunbeam init try_to_break_sunbeam --data_acc SRR1913936
sunbeam run -p --configfile try_to_break_sunbeam/sunbeam_config.yml all_decontam
The conda install git was not important, as the error persists without it. I am currently testing to see whether the problem is still present with the Sunbeam-installed Miniconda (I don't really see why it wouldn't).
SRR1913936 is a tiny sample with 11 paired reads that I like to use for testing. Also, for these tests, I've just been removing the sunbeam and miniconda dirs, but the conclusions of this might be confounded if there are other places where packages are cached.
Edit: Same problem without a pre-installed miniconda3. Reproducible with:
git clone https://github.com/sunbeam-labs/sunbeam
cd sunbeam; ./install.sh
# restart terminal
conda activate sunbeam
sunbeam init try_to_break_sunbeam --data_acc SRR1913936
sunbeam run -p --configfile try_to_break_sunbeam/sunbeam_config.yml all_decontam
Edit 2:
Weirdly, this is not fixed by adding libcblas to environment.yml.
Edit 3:
I should read more closely. The error for each of these attempts is actually not the same as the error Guanxiang and Arwa were getting, although it is also at the komplexity step. Mine is:
rbt: error while loading shared libraries: libopenblas.so.0: cannot open shared object file: No such file or directory
Installing with libcblas and libopenblas in environment.yml fixes this issue (also works fine with libopenblas only). Adding stuff to the environment is an easy fix, but it seems like there's an upstream problem with rust-bio-tools that is causing this. Thoughts @ressy/@eclarke?
@louiejtaylor yep I can confirm the missing libopenblas library too, with a fresh install of everything in an isolated environment. I'll do some more digging. Thanks for spotting that.
I think this is another case of inconsistent dependencies.
In my test environment (where the whole conda setup is inside the working directory) I can see that libcblas and libopenblas are not handled via the environment when rust-bio-tools is installed. On this one particular server we happen to have a libcblas file from the OS but not libopenblas so we're just seeing the one problem here, but it looks like both could be an issue, I think.
$ ldd $(which rbt) | grep -v $(pwd)
linux-vdso.so.1 => (0x00007ffcfe6d3000)
libcblas.so.3 => /usr/lib/libcblas.so.3 (0x00007f9096837000)
libopenblas.so.0 => not found
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f909652e000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9096164000)
/lib64/ld-linux-x86-64.so.2 (0x00007f9097634000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9095f60000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f9095d58000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9095b3b000)
libatlas.so.3 => /usr/lib/libatlas.so.3 (0x00007f909559d000)
It's flipped on a PMACS HPC compute node which I think explains the difference between your and Guanxiang's experience:
...
libcblas.so.3 => not found
libopenblas.so.0 => /lib64/libopenblas.so.0 (0x00002ae9c6e0d000)
...
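A quick way to spot this on any machine is to filter the ldd output for unresolved entries. The following is a minimal sketch, not something from the Sunbeam repository: the function name missing_libs is made up for illustration, and it assumes the usual ldd output format shown above.

```shell
# Print the names of shared libraries that the dynamic loader cannot resolve.
# Reads `ldd` output on stdin, so it also works on saved output.
# (missing_libs is a hypothetical helper name, not part of any tool.)
missing_libs() {
    # Unresolved entries look like: "<tab>libfoo.so.N => not found"
    awk '/=> not found/ { print $1 }'
}

# Usage, assuming rbt is on PATH in the activated environment:
#   ldd "$(command -v rbt)" | missing_libs
```

Run against the first server's output above, the result would include libopenblas.so.0; on the PMACS node it would include libcblas.so.3.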
Possibly this relates to the "host" versus "run" dependency lists in the recipe YAML files for both rust-bio-tools and for gsl, since they don't include any blas stuff for "run". I'm still a bit fuzzy on how the build/host/run split (edit: see also Anaconda's info) is supposed to work, but if those are libraries needed to run the compiled executable, they should be included on rust-bio-tools directly under the run list, right? More digging needed, I suppose!
I tried creating a really minimal example with just a rust-bio-tools install in an isolated environment, and whether it works depends on what channels are enabled and exactly which conda command is used.
With conda-forge and bioconda both enabled it actually does fine, pulling in the libraries via conda-forge. With just bioconda it lets you install a broken rbt binary since it leaves out the libraries it would otherwise get from conda-forge. With all four channels we have explicitly listed at install time enabled it also works, oddly enough. And then if I use the environment file approach the Sunbeam installer uses with conda env update, which I would think should be the same as listing all channels in a conda install or conda create, it breaks.
So I have two separate points to figure out for rust-bio-tools:
Why does the conda env update approach not install the right dependencies when similar install/create commands do?
Thank you for tracking this down! Does reordering the conda channels in environment.yml as in #181 have any effect?
Oh yeah, good point, since we're on the topic of channels. I don't actually have a ~/.condarc in this case, though (with $HOME as a temporary directory), so it's starting from complete defaults. Sunbeam provides its own list in the .yml file, and for other commands I'm entering -c channel1 -c channel2 etc. to test... but no, it doesn't seem to make a difference when I swap them around here.
Here's an interesting comparison with two different environment.yml files but the same channels and order defined.
Running conda env update --name rbt-test --file environment.yml with this file works:
channels:
- bioconda
- conda-forge
dependencies:
- rust-bio-tools
And this one breaks:
channels:
- bioconda
- conda-forge
dependencies:
- biopython
- rust-bio-tools
So what the heck happens to my dependency resolution when biopython is added in? I should have more time to figure that out this afternoon I think.
(Edit: Also, I tested this by removing biopython from the Sunbeam environment.yml but leaving everything else, just to confirm that it's the presence/absence of the biopython entry. Same deal.)
Oh wait, you said environment.yml, not my conda config, sorry. I misread that earlier. ...and you're spot on, that makes it work!
I used the channel order given in the bioconda channel setup instructions and equivalently what you described in #181, where these commands:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
... make a YAML in the .condarc like:
channels:
- conda-forge
- bioconda
- defaults
... which we can use to update the environment.yml as @louiejtaylor suggested above.
So this is really just a rehash of the earlier channel order problem. I think we need to reorder the channels in our own YAML in the repository to fix this, not just ~/.condarc.
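As a sketch of what that reordering looks like for the minimal two-package case from earlier in the thread, the snippet below writes out an environment file with the channels in the bioconda-recommended order (highest priority first). The helper name write_env_yml is hypothetical, and the package list is just the small test case, not Sunbeam's full environment.yml.

```shell
# Write a minimal environment.yml with channels listed highest priority first,
# following the order from the bioconda setup instructions.
# (write_env_yml is a hypothetical helper; the package list is illustrative.)
write_env_yml() {
    cat > "$1" <<'EOF'
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - biopython
  - rust-bio-tools
EOF
}

# Usage:
#   write_env_yml environment.yml
#   conda env update --name rbt-test --file environment.yml
```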
I'm trying this out with our full set of channels and packages now to confirm for Sunbeam. Any thoughts? I'm still confused as to why channel priority can break dependency resolution so badly, even when packages are unambiguously in one channel and not others. The conda documentation on this seems to match my own understanding, where the priorities just come into play when a package could be found in multiple channels.
Well, I think we fixed one problem and uncovered another. The checksum on sunbeam_output/mapping/human/coverage.csv in the mapping test changed, I'm thinking because something in the floating point columns shifted a little with the latest package set. In retrospect using a checksum on those files was overly restrictive since we can't easily guarantee they'll be identical byte-by-byte. The bam file outputs still look like I expect, so I'm going to replace the checksum test with a simpler test on the sample column. So far so good.
Thanks for tracking this down, Jesse!
Thank you both!
On a side note, I think I finally understand the full implications of the channel priority issue. In this case you get gsl=2.4 installed either way, but depending on channel priority it comes from either defaults or conda-forge, and only one of those two gsl builds brings along the blas libraries that rust-bio-tools actually needs. The gsl package name and version are the same either way, so only channel priority determines what indirect dependencies are pulled along.
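To see which channel a given build actually came from, the Channel column of conda list is enough. Below is a small sketch, assuming the standard four-column conda list layout (name, version, build, channel). The helper name pkg_source is hypothetical, and treating a blank channel column as defaults is an assumption about how the installed conda version prints it.

```shell
# Print "name version channel" for one package, reading `conda list` output on stdin.
# A blank Channel column is reported as "defaults" (an assumption; some conda
# versions print a channel name such as pkgs/main instead).
# (pkg_source is a hypothetical helper name.)
pkg_source() {
    awk -v pkg="$1" '$1 == pkg { print $1, $2, ($4 == "" ? "defaults" : $4) }'
}

# Usage, with the two test environments used in this thread:
#   conda list -n rbt-test-working | pkg_source gsl
#   conda list -n rbt-test-broken  | pkg_source gsl
```

If the explanation above holds, the two environments should report the same gsl version but different channels.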
So I do think we're now doing the right thing by following the bioconda instructions for channel priority in our YAML. Just to muddy the waters a little bit more, an Anaconda employee suggests defaults above everything else in ~/.condarc so you don't accidentally, say, install conda itself from conda-forge instead of defaults. But that seems like a separate issue from what we're dealing with at environment-setup-time here.
One last side note: conda-depgraph is helping me keep my sanity with these kinds of things. I put it in its own environment and then can interrogate packages in any other environment to get an ASCII art dependency graph. It's very slick.
$ conda depgraph -n rbt-test-broken inout gsl
┌──────────────┐
│rust-bio-tools│
└──────┬───────┘
       │
       v
     ┌───┐
     │gsl│
     └─┬─┘
       │
       v
  ┌─────────┐
  │libgcc-ng│
  └─────────┘
$ conda depgraph -n rbt-test-working inout gsl
    ┌──────────────┐
    │rust-bio-tools│
    └──────┬───────┘
           │
           v
       ┌───────┐
       │  gsl  │
       └─┬─┬─┬─┘
         │ │ │
         │ └─┼────┐
         │   │    └──┐
         v   │       │
  ┌────────┐ │       │
  │libcblas│ │       │
  └────┬───┘ │       │
       │  ┌──┘       │
       │  │          │
       v  v          │
    ┌───────┐        │
    │libblas│        │
    └───┬───┘        │
        │            │
        v            │
   ┌────────┐        │
   │openblas│        │
   └──┬──┬──┘        │
      │  │           │
      │  └────────┐  │
      │           │  │
      v           v  v
┌───────────┐ ┌─────────┐
│libgfortran│ │libgcc-ng│
└───────────┘ └─────────┘
I had this problem today with a brand new installation of Sunbeam without installing anything else:
[Tue Apr 16 10:37:53 2019] rbt: error while loading shared libraries: libcblas.so.3: cannot open shared object file: No such file or directory
[Tue Apr 16 10:37:53 2019] Error in rule remove_low_complexity:
@louiejtaylor and I tried
conda install --force-reinstall rust-bio-tools -c bioconda
which did not fix it. However,
conda install --force-reinstall libcblas -c conda-forge
did fix it.