hoffmangroup / genomedata

The Genomedata format for storing large-scale functional genomics data.
https://genomedata.hoffmanlab.org/
GNU General Public License v2.0

Anaconda distribution broken #46

Closed EricR86 closed 5 years ago

EricR86 commented 5 years ago

Original report (archived issue) by Kate Cook (Bitbucket: katecook).


Hi,

The version of genomedata on bioconda doesn't run as-is on my system. It's a problem with the version of path.py, although I don't know how the requirement of forked-path plays into this.

It looks like this was fixed a while ago in genomedata/__init__.py, but the version of genomedata on bioconda is 1.4.1. Any plans to update that soon? I could install from source, but I am lazy :)

Kate

#!bash

(python2) [hpc4366@caclogin02 ~]$ conda install genomedata
Fetching package metadata ...............
Solving package specifications: .

Package plan for installation in environment /global/home/hpc4366/.conda/envs/python2:

The following NEW packages will be INSTALLED:

    genomedata: 1.4.1-py27h470a237_2 bioconda

Proceed ([y]/n)? y

(python2) [hpc4366@caclogin02 ~]$ genomedata-load --version
Traceback (most recent call last):
  File "/global/home/hpc4366/.conda/envs/python2/bin/genomedata-load", line 7, in <module>
    from genomedata.load_genomedata import main
  File "/global/home/hpc4366/.conda/envs/python2/lib/python2.7/site-packages/genomedata/__init__.py", line 30, in <module>
    from path import path
ImportError: cannot import name path
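
For anyone hitting the same ImportError: the traceback shows that the installed path.py does not export the lowercase name `path` that genomedata 1.4.1 imports. The following is a minimal sketch of a tolerant import, assuming the class is available under the capitalized name `Path` in newer path.py releases. The helper `import_first` is hypothetical and is not necessarily how Genomedata itself resolved this.

```python
import importlib


def import_first(module_name, candidates):
    """Return the first attribute in `candidates` that `module_name` exports."""
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        "cannot import any of %r from %s" % (list(candidates), module_name)
    )


# Usage with path.py installed (commented out so the sketch runs without it).
# Assumption: newer releases export `Path`, older ones export `path`.
# Path = import_first("path", ["Path", "path"])
```

Trying the newer name first means the same code works whichever release of path.py ends up in the environment.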
EricR86 commented 5 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Yep! The current Bioconda PR.

This is currently at the top of my to-do list. Genomedata has also had some recent small improvements, so the Bioconda PR is intended to bring in a newer Genomedata version and an updated conda recipe.

It's worth noting that in an empty conda environment, a 'pip install genomedata' should also work just fine (in either Python 2 or 3). Currently only the conda recipe seems to be broken.

EricR86 commented 5 years ago

Original comment by Kate Cook (Bitbucket: katecook).


Great, thanks. P.S. I saw elsewhere that you were working on SLURM support for Segway, which would also be very helpful for me. Happy to test things out.

EricR86 commented 5 years ago

Original comment by Kate Cook (Bitbucket: katecook).


pip install genomedata doesn't work either:

#!shell
(python3) [hpc4366@caclogin02 dev]$ pip install genomedata
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/avx2, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Collecting genomedata
  Using cached https://files.pythonhosted.org/packages/ae/a8/095d15019a28d370b0f81a5adb289828dee17b5ac5e069518024526996f6/genomedata-1.4.4.tar.gz
Requirement already satisfied: numpy in /global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages (from genomedata) (1.15.0)
Requirement already satisfied: tables!=3.4.1,>=3.0 in /global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages (from genomedata) (3.4.4)
Requirement already satisfied: six in /global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages (from genomedata) (1.11.0)
Requirement already satisfied: textinput>=0.2.0 in /global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages (from genomedata) (0.2.0)
Requirement already satisfied: path.py>=11 in /global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages (from genomedata) (11.5.0)
Requirement already satisfied: numexpr>=2.5.2 in /global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages (from tables!=3.4.1,>=3.0->genomedata) (2.6.8)
Requirement already satisfied: importlib-metadata>=0.5 in /global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages (from path.py>=11->genomedata) (0.6)
Building wheels for collected packages: genomedata
  Running setup.py bdist_wheel for genomedata ... done
  Stored in directory: /global/home/hpc4366/.cache/pip/wheels/ca/cf/32/223bb7e516403ed63d3f492870accb7dc54cb601e3f745d3b6
Successfully built genomedata
Installing collected packages: genomedata
Successfully installed genomedata-1.4.4
(python3) [hpc4366@caclogin02 dev]$ genomedata-load --version
Traceback (most recent call last):
  File "/global/home/hpc4366/.conda/envs/python3/bin/genomedata-load", line 7, in <module>
    from genomedata.load_genomedata import main
  File "/global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages/genomedata/__init__.py", line 28, in <module>
    import tables
  File "/global/home/hpc4366/.conda/envs/python3/lib/python3.6/site-packages/tables/__init__.py", line 93, in <module>
    from .utilsextension import (
ImportError: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-5.4.0/lib64/libstdc++.so.6)

(I switched to python 3 here but python 2 fails as well with an even more cryptic error)

EricR86 commented 5 years ago

Original comment by Michael Hoffman (Bitbucket: hoffman, GitHub: michaelmhoffman).


That looks like a problem with your configuration, unfortunately. The requirement for GLIBC_2.18 is coming from your system configuration. I would ask the Compute Canada helpdesk; they are fast.
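
Before contacting support, it can help to record which C library version Python itself reports. This is a minimal sketch, assuming a Linux system with glibc. The helper names `version_tuple` and `glibc_at_least` are hypothetical, and `platform.libc_ver()` reports the libc that the Python executable is linked against, which may differ from what a particular shared library resolves at load time.

```python
import platform
import re


def version_tuple(text):
    """Turn a version string such as '2.18' into a comparable tuple (2, 18)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def glibc_at_least(required, found=None):
    """Return True if the glibc version `found` is at least `required`.

    When `found` is None, ask the running interpreter via platform.libc_ver().
    """
    if found is None:
        lib, found = platform.libc_ver()
        if lib != "glibc" or not found:
            return False  # not glibc, or the version could not be detected
    return version_tuple(found) >= version_tuple(required)


if __name__ == "__main__":
    print("libc reported by Python:", platform.libc_ver())
    print("satisfies GLIBC_2.18:", glibc_at_least("2.18"))
```

If the second line prints False, the libstdc++ named in the traceback needs a newer glibc than the one Python sees on the login node.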

EricR86 commented 5 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Just a heads up, since this does seem like a strange environment issue. There are two things to note here:

  1. It looks like it's related specifically to PyTables, since it crashes after an "import tables", specifically while importing one of its compiled C extensions.
  2. I cannot reproduce this on Niagara (Compute Canada). The following works for me:
    $ module load anaconda3
    # Add bioconda channels if you have not already
    $ conda create -n python3 python=3 hdf5
    $ source activate python3
    $ pip install genomedata
    $ genomedata-load --version
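
To check the first point without Genomedata in the picture, each dependency can be imported on its own. This is a minimal sketch. The helper `check_import` is hypothetical, and the module list is just the packages that appear in the traceback and pip output above.

```python
import importlib


def check_import(module_name):
    """Try to import `module_name`; return (succeeded, error_message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as error:
        # Covers both a missing module and a shared library that fails to load.
        return False, str(error)
    return True, ""


if __name__ == "__main__":
    for name in ("numpy", "tables", "genomedata"):
        ok, message = check_import(name)
        print(name, "OK" if ok else "FAILED: " + message)
```

If `tables` fails while `numpy` succeeds, the problem is in the PyTables extension and the libraries it links against, not in Genomedata.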
EricR86 commented 5 years ago

Original comment by Kate Cook (Bitbucket: katecook).


  1. Do you have hdf5 installed somewhere else? That doesn't work for me without an explicit "module load hdf5". Anaconda won't modify the include path environment variables so gcc doesn't know to look for the anaconda-installed hdf5.

  2. I still get the GLIBC_2.18 error, so I'll ask cac support.

EricR86 commented 5 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


The line: $ conda create -n python3 python=3 hdf5 should install HDF5 in the conda environment named "python3". I did not install HDF5 in any other way or load it through a module.

I'd imagine there's likely some conflict between your loaded modules and the conda environment itself. I had no other modules explicitly loaded except "anaconda3".

EricR86 commented 5 years ago

Original comment by Kate Cook (Bitbucket: katecook).


Anaconda doesn't set LD_LIBRARY_PATH or C_INCLUDE_PATH though. (I looked at this issue earlier: https://hoffmangroup.github.io/genomedata-bitbucket-backup/#!/hoffmanlab/genomedata/issues/15/install-fails-with-anaconda-python (#15) )
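
A quick way to see this on a given machine is to list which entries of those two variables point inside the active conda environment. This is a minimal sketch, assuming the environment was activated so that `CONDA_PREFIX` is set. The helper `entries_under_prefix` is hypothetical.

```python
import os


def entries_under_prefix(environ, prefix,
                         variables=("LD_LIBRARY_PATH", "C_INCLUDE_PATH")):
    """Map each variable to the entries of its path list that live under `prefix`."""
    prefix = os.path.normpath(prefix)
    result = {}
    for variable in variables:
        entries = [e for e in environ.get(variable, "").split(os.pathsep) if e]
        result[variable] = [
            e for e in entries
            if os.path.normpath(e) == prefix
            or os.path.normpath(e).startswith(prefix + os.sep)
        ]
    return result


if __name__ == "__main__":
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix is None:
        print("CONDA_PREFIX is not set; no conda environment appears to be active")
    else:
        for variable, hits in entries_under_prefix(os.environ, conda_prefix).items():
            print(variable, "->", hits or "no entries inside the conda environment")
```

Entries that come from loaded modules rather than from the conda environment are candidates for the kind of conflict discussed above.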

EricR86 commented 5 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Genomedata 1.4.4 is now available through the Bioconda channel.

Most of the issues fixed were not in Genomedata itself but were conda environment and build issues upstream:

All of the issues above have been fixed or worked around for the conda build. However, we cannot support Python 3.5, since conda-forge will no longer include our upstream fixes. Also, as mentioned above, the current Bioconda build system does not build for Python 3.7. So, unfortunately, Python 3.6 is the only Python 3 version available for installation. If at all possible, do not use Genomedata in a conda environment with Python 3.5: it has some really subtle and nasty environment issues and will break.
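
As a guard against creating an environment with an unsupported interpreter, the version constraint can be checked up front. This is a minimal sketch. The name `conda_build_expected` is hypothetical, and including Python 2.7 in the set is an assumption based on the earlier py27 Bioconda build, not something stated for 1.4.4.

```python
import sys

# Python 3.6 is the only Python 3 version named as working in this thread.
# Including 2.7 is an assumption based on the earlier py27 Bioconda build.
SUPPORTED = {(2, 7), (3, 6)}


def conda_build_expected(version_info=None):
    """Return True if the (major, minor) version is in SUPPORTED."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED


if __name__ == "__main__":
    print("Bioconda build expected for this Python:", conda_build_expected())
```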

cc @katecook