fedarko / strainFlye

Pipeline for analyzing (rare) mutations in metagenome-assembled genomes
BSD 3-Clause "New" or "Revised" License

Bizarre mamba installation error for a python 3.7.16 environment #70

Open fedarko opened 1 year ago

fedarko commented 1 year ago

Description of the problem

Try running the following:

mamba create -n very-sus "python = 3.7"

On my machine this creates a small environment with, as of writing, Python 3.7.16. So far so good.

Now let's try to install strainFlye into this environment:

conda activate very-sus
mamba install -c bioconda strainflye

This gives me a bizarre error:

...
Looking for: ['strainflye']

Encountered problems while solving.
Problem: nothing provides numpy 1.10* needed by scikit-bio-0.2.3-np110py27_0

This is confusing, because the very-sus environment doesn't contain numpy or scikit-bio. (Running conda list | grep numpy doesn't output anything; same for conda list | grep scikit or conda list | grep skbio.) So why does mamba assume we have to use such an ancient version of scikit-bio (and thus an ancient version of numpy)?
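One clue is in the build string of the package mamba picked. Old conda packages conventionally encode the numpy and Python versions they were built against in that string, so np110py27_0 reads as "numpy 1.10, Python 2.7". Here is a rough sketch of decoding that convention; decode_build is a made-up helper name, and the regexes are my assumption about the naming scheme, not anything conda exposes:

```python
import re


def decode_build(build):
    """Guess the numpy / Python versions encoded in a conda build string.

    Assumes the old 'np<major><minor>py<major><minor>' naming convention.
    """
    found = {}
    np_match = re.search(r"np(\d)(\d+)", build)
    py_match = re.search(r"py(\d)(\d+)", build)
    if np_match:
        found["numpy"] = f"{np_match.group(1)}.{np_match.group(2)}"
    if py_match:
        found["python"] = f"{py_match.group(1)}.{py_match.group(2)}"
    return found


# Build strings taken from the error messages in this issue.
print(decode_build("np110py27_0"))   # scikit-bio 0.2.3 build
print(decode_build("pyhca03a8a_0"))  # strainflye 0.2.0 (no version encoded)
```

If that reading is right, the solver ended up complaining about a Python 2.7 build of scikit-bio, which cannot be what a Python 3.7 environment actually needs.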

We can directly install a relatively recent version of scikit-bio without problems:

mamba install -c conda-forge "scikit-bio>=0.5.6"

On my machine this installs numpy 1.21.6 and scikit-bio 0.5.7. Again, that seems normal.

So you'd think that installing strainFlye should work now, right? But running mamba install -c bioconda strainflye at this point gives the following insane error message:

...
Looking for: ['strainflye']

Encountered problems while solving.
Problem: package strainflye-0.2.0-pyhca03a8a_0 requires python >=3.6,<3.8, but none of the providers can be installed

Now that's interesting, because we are literally using Python 3.7. To prove this, conda list | grep python gives:

brotli-python             1.0.9            py37hd23a5d3_7    conda-forge
ipython                   7.33.0           py37h89c1867_0    conda-forge
msgpack-python            1.0.3            py37h7cecad7_1    conda-forge
python                    3.7.16               h7a1cb2a_0  
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.7                     2_cp37m    conda-forge

Weird, right?
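For the record, here is a small sanity check that the installed Python really does fall inside the range strainFlye asks for. This is only a sketch: installed_versions and in_range are hypothetical helpers, and the tuple comparison is a simplification of conda's real version ordering that only handles plain dotted numbers.

```python
# Rows copied from the `conda list | grep python` output above.
CONDA_LIST = """\
brotli-python             1.0.9            py37hd23a5d3_7    conda-forge
python                    3.7.16               h7a1cb2a_0
python_abi                3.7                     2_cp37m    conda-forge
"""


def installed_versions(listing):
    """Map package name -> version from `conda list`-style rows."""
    versions = {}
    for row in listing.splitlines():
        fields = row.split()
        # Skip comment rows and anything without a name and a version.
        if len(fields) >= 2 and not row.startswith("#"):
            versions[fields[0]] = fields[1]
    return versions


def as_tuple(version):
    """Turn '3.7.16' into (3, 7, 16) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """True if lower <= version < upper (simplified comparison)."""
    return as_tuple(lower) <= as_tuple(version) < as_tuple(upper)


python_version = installed_versions(CONDA_LIST)["python"]
# strainflye 0.2.0 requires python >=3.6,<3.8
print(python_version, in_range(python_version, "3.6", "3.8"))
```

So the constraint named in the error message is satisfied, and the real conflict must be somewhere else in the dependency tree.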

I think this is a symptom of a problem with mamba / conda -- this sort of obviously incorrect error message has been documented in a few other places (example 1, example 2, example 3).

Workaround solution

It seems like creating a conda environment with strainFlye in it (rather than installing strainFlye into an existing conda environment) works fine, though:

mamba create -n less-sus -c bioconda strainflye

This creates a working environment with strainFlye (and Python 3.7.6, and numpy 1.21.5, and scikit-bio 0.5.6) installed. Soooooo if this problem comes up again for users, they should be able to work around it by just creating a new environment. Eeesh.

fedarko commented 1 year ago

Another workaround solution

Interestingly, I was able to "kind of" install strainFlye into the very-sus environment from above by running:

mamba install -c bioconda strainflye "python=3.7"

This downgraded Python from 3.7.16 to 3.7.13. But running strainflye from within this environment after this installation shows that the installation is broken:

Traceback (most recent call last):
  File "/home/marcus/anaconda3/envs/very-sus/bin/strainflye", line 7, in <module>
    from strainflye._cli import strainflye
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/strainflye/_cli.py", line 5, in <module>
    from strainflye import (
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/strainflye/align_utils.py", line 9, in <module>
    from . import graph_utils, fasta_utils, misc_utils, cli_utils, bam_utils
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/strainflye/fasta_utils.py", line 4, in <module>
    import skbio
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/skbio/__init__.py", line 11, in <module>
    import skbio.io  # noqa
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/skbio/io/__init__.py", line 255, in <module>
    import_module('skbio.io.format.binary_dm')
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/skbio/io/format/binary_dm.py", line 77, in <module>
    import h5py
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/h5py/__init__.py", line 33, in <module>
    from . import version
  File "/home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/h5py/version.py", line 15, in <module>
    from . import h5 as _h5
  File "h5py/h5.pyx", line 1, in init h5py.h5
ImportError: /home/marcus/anaconda3/envs/very-sus/lib/python3.7/site-packages/h5py/defs.cpython-37m-x86_64-linux-gnu.so: undefined symbol: H5Pget_fapl_direct

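The import that actually fails at the bottom of the traceback is h5py, which gets pulled in by skbio/io/format/binary_dm.py. This implies that something is wrong with scikit-bio 0.5.7, the version currently installed in the very-sus environment (or with the h5py build it drags in). We can force scikit-bio 0.5.6 by running: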

mamba install -c conda-forge "scikit-bio=0.5.6"

And now strainFlye seems to work! Sheeeeesh.

So it seems like forcing the use of scikit-bio 0.5.6 should work. I think. Not sure why (maybe something in the scikit-bio 0.5.7 Cython changes broke something?), but it's nice to have another way to resolve these bizarre problems.
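If this comes up again, a quick way to see which layer of the import chain from the traceback is broken is to try each module separately from inside the environment. A minimal sketch (try_import is just an illustrative helper; the module names are the ones that appear in the traceback above):

```python
import importlib


def try_import(name):
    """Import a module by name and report whether it worked."""
    try:
        importlib.import_module(name)
        return "ok"
    except ImportError as exc:  # also covers ModuleNotFoundError
        return f"failed: {exc}"


# Innermost dependency first, so the first failure points at the culprit.
for module in ("h5py", "skbio", "strainflye"):
    print(module, "->", try_import(module))
```

If h5py already fails on its own, the problem is below scikit-bio (e.g. the h5py / HDF5 builds), and pinning scikit-bio 0.5.6 presumably just avoids triggering that import.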