gmarcais / Jellyfish

A fast multi-threaded k-mer counter

Why is the installed Python library different depending on build method? Re: bioconda #161

Open cerebis opened 4 years ago

cerebis commented 4 years ago

I'll start this off by saying I want to get the Jellyfish Python bindings into the Jellyfish bioconda package. If I can't get this to work, I'll either have to move my project to another kmer counter or possibly reinvent the wheel and create an external project to make JF bindings (probably using Pybind). If I begin there, I will have to assess whether JF is the best place to start. I really can't afford the time though!

I find the simplest method of creating Python bindings (./configure --enable-python-binding) to be problematic. In short, I get errors on import using this approach, whether I attempt import jellyfish or import dna_jellyfish.

The direct build approach as detailed in issue #134 works fine.

The import error is below. I haven't dived into SWIG to understand its process of loading dynamic libs. Seems like this might be as simple as a name or a search path issue?

>>> import dna_jellyfish
Traceback (most recent call last):
  File "swig/python/dna_jellyfish.py", line 20, in swig_import_helper
  File "/home/cerebis/miniconda3/envs/jf/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named '_dna_jellyfish'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "swig/python/dna_jellyfish.py", line 23, in <module>
  File "swig/python/dna_jellyfish.py", line 22, in swig_import_helper
  File "/home/cerebis/miniconda3/envs/jf/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_dna_jellyfish'

My question though is, why do the two methods generate different installation structures? Is one out of date with respect to the other?

The ./configure --enable-python-binding method produces the following (import fails):

├── dna_jellyfish
│   ├── _dna_jellyfish.a
│   ├── _dna_jellyfish.so -> _dna_jellyfish.so.0.0.0
│   ├── _dna_jellyfish.so.0 -> _dna_jellyfish.so.0.0.0
│   ├── _dna_jellyfish.so.0.0.0
│   └── __init__.pyc
├── jellyfish.py
└── __pycache__
    └── jellyfish.cpython-38.pyc

While the direct build in the SWIG folder produces (import succeeds):

├── dna_jellyfish-0.0.1-py2.7.egg-info
├── dna_jellyfish.py
├── dna_jellyfish.pyc
└── _dna_jellyfish.so
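One way to test the search-path hypothesis against the two layouts above is to ask Python which files it would accept for `import _dna_jellyfish`. The sketch below is a generic diagnostic, not part of Jellyfish; the helper name `find_extension_files` is made up for illustration. It only looks at the top level of each `sys.path` entry, which is where a plain `import _dna_jellyfish` searches, so a shared object sitting inside a `dna_jellyfish/` subdirectory would not be reported.

```python
import importlib.machinery
import os
import sys


def find_extension_files(module_name, search_path=None):
    """Return files on the search path that could satisfy `import module_name`.

    Hypothetical helper: it checks only the top level of each search path
    entry, using the extension suffixes this interpreter accepts.
    """
    search_path = sys.path if search_path is None else search_path
    hits = []
    for entry in search_path:
        directory = entry or os.getcwd()  # an empty entry means the current directory
        for suffix in importlib.machinery.EXTENSION_SUFFIXES:
            candidate = os.path.join(directory, module_name + suffix)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    print("accepted suffixes:", importlib.machinery.EXTENSION_SUFFIXES)
    print("candidates:", find_extension_files("_dna_jellyfish"))
```

An empty candidate list under the same interpreter that raises the error would support the idea that the extension module is simply not where the import system looks.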

brobr commented 4 years ago

I do not know anything about Jellyfish, but looking at the outputs above, one looks like Python 3.8 (jellyfish.cpython-38.pyc) while the other is Python 2.7. So if you are working from within Python 2 (where the import works), the Python 3 import will fail.

The --enable-python-binding option will probably use the Python version found by configure, while the direct method uses 'python setup.py', which would automatically call Python 2. (I had to figure out the reverse: installing into Python 3, where --enable-python-binding would pick the Python 2.7 tree on my box.)
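The version mismatch described here can be read straight off the file names in the two listings. The following is a minimal sketch, assuming the usual naming conventions (`cpython-38` in bytecode cache names, `py2.7` in egg-info names); the function name `python_versions_in` and the two sample lists are illustrative, with the file names copied from the listings earlier in this thread.

```python
import re
import sys

# Naming conventions assumed here: "cpython-38" (bytecode cache tag)
# and "py2.7" (egg-info directory suffix).
_PATTERNS = (
    re.compile(r"cpython-(\d)(\d+)"),
    re.compile(r"py(\d)\.(\d+)"),
)


def python_versions_in(names):
    """Return the set of (major, minor) Python versions hinted at by file names."""
    found = set()
    for name in names:
        for pattern in _PATTERNS:
            for major, minor in pattern.findall(name):
                found.add((int(major), int(minor)))
    return found


# File names taken from the two installation layouts posted above.
configure_layout = ["_dna_jellyfish.so.0.0.0", "jellyfish.py", "jellyfish.cpython-38.pyc"]
setup_py_layout = ["dna_jellyfish-0.0.1-py2.7.egg-info", "dna_jellyfish.py", "_dna_jellyfish.so"]

if __name__ == "__main__":
    running = sys.version_info[:2]
    for label, names in (("configure", configure_layout), ("setup.py", setup_py_layout)):
        print(label, sorted(python_versions_in(names)), "running interpreter:", running)
```

If the version reported for a layout differs from the running interpreter, the module was built for a different Python than the one attempting the import.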

cerebis commented 4 years ago

Thanks. I should have seen that.


feghalya commented 2 years ago

I have the same problem installing the Python bindings with the '--enable-python-binding' option. I have checked and can confirm that Jellyfish is installing the Python bindings with the correct Python version.

Similar to @cerebis, the only way I can get the dna_jellyfish python module to load correctly is by going to swig/python and installing manually.

The last jellyfish version which compiles and installs python bindings correctly is v2.2.6.

IBEXCluster commented 1 year ago

Dear @cerebis

Here are my steps for building the Python 3.x binding for Jellyfish 2.3.0 (the latest release as of today):

Create a Python 3.x environment

$ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ sh ./Miniconda3-latest-Linux-x86_64.sh -bufp /ibex/scratch/projects/c2072/work/jellyfish/python3.10
$ export PATH=/ibex/scratch/projects/c2072/work/jellyfish/python3.10/bin:$PATH
$ python --version 
Python 3.10.9 

Download the Jellyfish source code

$ wget https://github.com/gmarcais/Jellyfish/releases/download/v2.3.0/jellyfish-2.3.0.tar.gz

Prerequisites for compiling from source: autoconf, automake, libtool, gettext, pkg-config and yaggo. Most systems will not have yaggo; if yours does not, download it as a binary from here:

$ wget https://github.com/gmarcais/yaggo/releases/download/v1.5.10/yaggo
$ chmod +x yaggo
$ export PATH=/ibex/scratch/projects/c2072/work/jellyfish:$PATH

Compile Jellyfish

$ tar -xzvf jellyfish-2.3.0.tar.gz
$ cd jellyfish-2.3.0/
$ autoreconf -i
$ ./configure --prefix=/ibex/scratch/projects/c2072/work/jellyfish/install --enable-python-binding
$ make 
$ make install 

For Python binding, do the following:

$ cd swig/python/

Set PKG_CONFIG_PATH to the directory that contains the jellyfish-2.0.pc file:

$ export PKG_CONFIG_PATH=/ibex/scratch/projects/c2072/work/jellyfish/install/lib/pkgconfig 
$ python ./setup.py build 
$ python ./setup.py install --prefix=/ibex/scratch/projects/c2072/work/jellyfish/install 

Set PYTHONPATH for the Jellyfish Python binding

$ export PYTHONPATH=/ibex/scratch/projects/c2072/work/jellyfish/install/lib/python3.10/site-packages:/ibex/scratch/projects/c2072/work/jellyfish/install/lib/python3.10/site-packages/dna_jellyfish:$PYTHONPATH
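Because the site-packages directory name depends on the interpreter version, it may help to compute the expected directory and confirm it is actually listed in PYTHONPATH. This is a sketch under the assumption that the install uses the conventional `<prefix>/lib/pythonX.Y/site-packages` layout seen in the paths above; both function names are hypothetical, and distributions that use a different layout will need a different path.

```python
import os
import sys


def site_packages_under(prefix, version=None):
    """Build <prefix>/lib/pythonX.Y/site-packages (conventional layout, assumed)."""
    major, minor = version or sys.version_info[:2]
    return os.path.join(prefix, "lib", f"python{major}.{minor}", "site-packages")


def on_python_path(directory, env=None):
    """Return True if `directory` is one of the PYTHONPATH entries."""
    env = os.environ if env is None else env
    entries = [e for e in env.get("PYTHONPATH", "").split(os.pathsep) if e]
    return os.path.normpath(directory) in {os.path.normpath(e) for e in entries}


if __name__ == "__main__":
    # Replace with the --prefix passed to configure and setup.py.
    prefix = "/ibex/scratch/projects/c2072/work/jellyfish/install"
    expected = site_packages_under(prefix)
    print(expected, "on PYTHONPATH:", on_python_path(expected))
```

Running it with the same interpreter used for `setup.py install` shows whether the exported path matches the directory that interpreter would have installed into.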

Test the Jellyfish Python binding

$ python create_matrix.py --help 
usage: create_matrix.py [-h] -a ACCNAME -j JFDUMP -c CONFIG [-k KMERSIZE] [-mc MINCOUNT] [-o OUTPUT]

Parse jellyfish dump file to check presence/absence of k-mers in diversity panel.
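A more direct check than running a downstream script is to attempt the import itself and report the failure reason. The sketch below uses only the standard library; `try_import` is an illustrative helper, and `dna_jellyfish` is the module name used earlier in this thread. It makes no calls into the binding, so it only confirms that the module and its compiled extension load.

```python
import importlib


def try_import(name="dna_jellyfish"):
    """Try to import `name`; return (module, None) or (None, error message)."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:  # also covers ModuleNotFoundError
        return None, str(exc)
    return module, None


if __name__ == "__main__":
    module, error = try_import("dna_jellyfish")
    if module is None:
        print("import failed:", error)
    else:
        print("loaded from:", getattr(module, "__file__", "<unknown>"))
```

The printed location also shows which of several installed copies was picked up, which is useful when both build methods have been tried under the same prefix.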