Martinsos / edlib

Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
http://martinsos.github.io/edlib
MIT License

edlib import issues for conda python 3.8 and above? #151

Closed ayaanhossain closed 3 years ago

ayaanhossain commented 4 years ago

Hi Martin,

Apparently edlib installation on Python 3.8 cannot be imported correctly. Here's some information.

Installed edlib

ayaan 🠚 pip install edlib
Processing /home/ayaan/.cache/pip/wheels/52/bd/a7/db5cc4d316d0cde20fe580aeebbf4405f3428e576ea9ad3013/edlib-1.3.8.post1-cp38-cp38-linux_x86_64.whl
Installing collected packages: edlib
Successfully installed edlib-1.3.8.post1

Tried importing edlib

ayaan 🠚 python
Python 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import edlib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/ayaan/miniconda2/envs/py3.8/lib/python3.8/site-packages/edlib.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNSs4_Rep20_S_empty_rep_storageE
>>> 

What do you think is going on here? How can I fix this? My installation of edlib works correctly for Python 2.7 and 3.7. Please let me know if you need more information.
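
For anyone who lands here with a similar traceback, a small check like the one below may help tell two failure modes apart. It is only a sketch: `diagnose_import` is a hypothetical helper name, not something edlib provides. A package that is simply absent raises ModuleNotFoundError, while an extension that is installed but cannot be loaded (as with the undefined symbol above) raises a plain ImportError.

```python
import importlib


def diagnose_import(module_name):
    """Try to import a module and classify the outcome.

    Hypothetical helper for illustration; returns a (status, detail) tuple.
    """
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # The package (or something it imports) is not installed at all.
        return ("missing", str(exc))
    except ImportError as exc:
        # The package was found but failed to load, for example a compiled
        # extension with an unresolved symbol.
        return ("broken", str(exc))
    return ("ok", "")


if __name__ == "__main__":
    print(diagnose_import("edlib"))
```

A "broken" result points at the build or link step rather than at a missing install, which is the situation described in this report.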

Martinsos commented 4 years ago

Hmm, I just tried it on my machine, with Python 3.8.2, and it all works fine. But to be very exact, I did

sudo pip install edlib
python   # Which entered REPL
import edlib

Info in REPL:

Python 3.8.2 (default, Feb 26 2020, 22:21:03) 
[GCC 9.2.1 20200130] on linux

So this works fine, I could do an import and use edlib.

However, I see that you are using Python 3.8.2 packaged by conda-forge, so I am guessing your problem has something to do with that. I myself don't use conda so I don't know much about it all, but I found this link, maybe this is relevant: https://github.com/lucasb-eyer/pydensecrf/issues/61 ? It suggests that problem might be in mixing conda python and non-conda python stuff.

If you can try it with system python first and you get it working, that would confirm that problem is indeed conda related.

ayaanhossain commented 4 years ago

You're right. This problem could be conda specific.

I manually compiled and installed Python 3.8.2 on my system (my system default for python3 is Python 3.5.x), and managed to install edlib correctly from pip.

ayaan 🠚 sudo python3.8 -m pip install --no-cache-dir edlib
Collecting edlib
  Downloading https://files.pythonhosted.org/packages/b5/7c/b8947395b2259d9c5c89b660226791c9424bb909677fab46daea9129d8af/edlib-1.3.8.post1.tar.gz (93kB)
     |████████████████████████████████| 102kB 935kB/s 
Installing collected packages: edlib
  Running setup.py install for edlib ... done
Successfully installed edlib-1.3.8.post1
WARNING: You are using pip version 19.2.3, however version 20.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Import also works correctly on Python 3.8.2.

ayaan 🠚 python3.8
Python 3.8.2 (default, May  6 2020, 05:56:53) 
[GCC 4.9.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import edlib
>>> quit()

I will shortly check this whole bit with conda-forge and update accordingly.

ayaanhossain commented 4 years ago

Okay, so reinstalling edlib via pip with my conda-forge Python 3.8.2 doesn't seem to work. I tried installing python-edlib via bioconda, but unfortunately the bioconda recipe for python-edlib is restricted to Python 3.7 and below. Here's the log.

(py3.8) [~]
ayaan 🠚 conda install -c bioconda python-edlib
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: / 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                            - 

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - python-edlib -> python[version='2.7.*|>=3.5,<3.6.0a0|>=2.7,<2.8.0a0|>=3.6,<3.7.0a0|3.6.*|3.5.*|>=3.7,<3.8.0a0']

Your python: python=3.8

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

Martinsos commented 4 years ago

Ok, great to hear that! I wish I could help you with conda, but as I said I don't use it myself.

There is however this on bioconda: https://anaconda.org/bioconda/python-edlib -> I have not created this myself though, but it might be ok?

Martinsos commented 4 years ago

Aha, ok this is good to know, that it does not work in such combination. I unfortunately don't have the time to research this and fix it, as it would take me quite some time to get into the whole conda ecosystem. If you want, we can leave it as a bug, and maybe you or somebody else can figure it out and fix it.

ayaanhossain commented 4 years ago

Yeah, apparently the conda recipe for python-edlib is not compatible with Python 3.8. A month or so back I actually tried installing edlib via the bioconda channel, but it showed this version error. I tried pip install edlib as well back then and had this import issue. What I didn't realize was that this could be a conda-forge issue, since all of my attempts were with conda Python 2.7.x and 3.x. I simply thought edlib had no Python 3.8 support, but that is not true, since my system Python 3.8 is able to work with edlib-1.3.8. Whoever is in charge of the bioconda recipe could perhaps upgrade it? I too am unfamiliar with bioconda recipes, but if I get some time, I will research this as well.

Bottom line: edlib-1.3.8 works in Python 3.8, and I can use it for my work and list it as a dependency. I will now close this issue. Many thanks for looking into this. edlib is beautiful.

ayaanhossain commented 4 years ago

Just a followup on this thread. I found one can install edlib from PyPI and use it successfully inside a conda environment containing Python > 3.7 after installing the cxx-compiler package from conda prior to installing edlib.

Step 1. Create and/or activate the target environment.

$ conda create -n myenv python=3.8
$ conda activate myenv

Step 2. Install cxx-compiler from conda-forge channel.

$ conda install -c conda-forge cxx-compiler

Step 3. Install edlib from PyPI.

$ pip install --upgrade edlib --no-cache-dir

Step 4. Verify edlib installation.

$ python
Python 3.8.5 | packaged by conda-forge | (default, Jul 31 2020, 02:39:48) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import edlib
>>> edlib.align('Armour', 'Amour', task='path')
{'editDistance': 1, 'alphabetLength': 5, 'locations': [(0, 4)], 'cigar': '1=1I4='}
>>> quit()

Python 3.8 and above as packaged by conda-forge does not compile edlib from PyPI successfully, although installation shows no errors (only import does as discussed above). This problem does not exist in conda environments with Python 3.7 or below.
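
To see which C++ compiler a pip build inside the environment is likely to pick up, something like the following could be run before and after Step 2. This is a rough sketch under assumptions: `is_inside` and `describe_compiler` are made-up helper names, and it assumes that the conda compiler packages export a `CXX` variable on activation and that a build otherwise falls back to whatever `g++` is on `PATH`.

```python
import os
import shutil
import sys


def is_inside(path, prefix):
    """Return True if `path` lies under the directory `prefix`."""
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    try:
        return os.path.commonpath([path, prefix]) == prefix
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def describe_compiler(env=None, prefix=None):
    """Report the C++ compiler a build would probably use (illustrative only)."""
    env = os.environ if env is None else env
    prefix = sys.prefix if prefix is None else prefix
    # Prefer $CXX if it is set (assumed to be exported by conda's compiler
    # packages); otherwise fall back to a plain "g++" lookup.
    requested = env.get("CXX") or "g++"
    # $CXX may carry extra flags, so only the first token is looked up.
    resolved = shutil.which(requested.split()[0], path=env.get("PATH"))
    return {
        "requested": requested,
        "resolved": resolved,
        "from_environment": bool(resolved) and is_inside(resolved, prefix),
    }


if __name__ == "__main__":
    print(describe_compiler())
```

If `from_environment` is False, the compiler comes from the host system rather than from the conda environment, which is the kind of mixing suspected earlier in this thread.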

Martinsos commented 4 years ago

@ayaanhossain thank you for investigating this further! Interesting, I wonder what change is causing it to not work with python 3.8 but work with python 3.7.

As you mentioned before, it seems problem might be with the bioconda recipe, and you even seem to have found a possible solution, so I created an issue at bioconda asking for help with possibly fixing the recipe: https://github.com/bioconda/bioconda-recipes/issues/24071 .

Martinsos commented 4 years ago

Allegedly this has been fixed on the bioconda side, as mentioned here https://github.com/bioconda/bioconda-recipes/issues/24071 !

ayaanhossain commented 4 years ago

Awesome! I can confirm that python-edlib can be installed via conda install -c bioconda python-edlib on python=3.8, and it works now. But installation from PyPI still requires cxx-compiler from conda-forge to get it working. I still think the safest way to get the latest version of edlib in conda is to install cxx-compiler and then run pip install --upgrade edlib --no-cache-dir.

The thing is, whenever a python project needs to install a dependency, if the dependency is available in PyPI, then it should automatically be taken care of by specifying it inside setup.py (which installs the dependencies from PyPI when the said project is installed from PyPI) or requirements.txt (which requires the user to install dependencies from PyPI via pip install -r requirements.txt prior to installing the project either from GitHub or PyPI). In case of edlib, one has to ask the user to go via the conda install -c bioconda python-edlib or the cxx-compiler+pip install ... route if the environment is conda (which it is for most bioinformaticians I guess). This is fine, but it would be better if this whole process was as easy as specifying numpy, or scikit-learn as a requirement both of which can be installed using pip pretty easily regardless of the environment. Unfortunately, conda has indeed become quite cumbersome (recall python=3.7 installs edlib from PyPI absolutely fine, but python=3.8 doesn't -- boggles the mind!).

Martinsos commented 4 years ago

@ayaanhossain , I am not using Python much lately and I have never used Bioconda, so I have to admit that while I am getting the basic grasp of what is happening here, some details I am certainly not understanding :). But it is great that you are pointing to the problems, please help me figure them out a little bit more and let's do something about it!

I understand that installation process for Edlib is still more complicated than one would like it to be, on bioconda, correct? What about if we install it directly via pip -> then it is fine, right? At least it seems to be fine for me, I just tried it and had to do no extra steps, just install edlib.

If that is so, I guess you are suggesting that we find a way to make bioconda install simpler, right? Any suggestions there on how to do it? Would you be ok with opening an issue on bioconda repo, where they are maintaining Edlib recipe? Maybe I am rushing here, I should probably understand better first what the exact problem is at the moment.

Thanks!

Martinsos commented 4 years ago

@ayaanhossain maybe I am confusing conda and bioconda? Also, when you say installation from PyPI -> does that still have something to do with conda, or not? Why is conda-forge mentioned, hm? I am afraid you will have to explain this to me like to a little child. I know nothing about conda/bioconda, all I know is that there is PyPI, which has a tool called pip that can be used to install and manage packages.

ayaanhossain commented 4 years ago

Hi @Martinsos! Okay, here's what I understand about the present situation, and how things look to me.

You have a project edlib, written in C/C++, which does sequence alignment. The project is awesome, and is perhaps the best way to do fast sequence alignment operation to my knowledge. I and many other bioinformatics scholars / scientists like to use edlib for sequence alignment in our data processing pipelines or bioinformatics programs, but we're generally developing our software in python. Key point here is that edlib is a dependency for our projects, and users need to be able to install edlib easily in order to use our projects.

Let's assume I'm developing project-x which I'm designing for use in python=3.6 and above, and it uses edlib for all its alignment tasks. To develop and test project-x, I need multiple versions of python installed on my system. To do this, I have installed conda / miniconda on my development machine. conda allows me to create multiple environments, each with a different python version inside them, so I can (de-)activate different environments on-the-fly, to develop and test my code for different python versions.

There are alternatives like python virtual environments (venv) which perhaps offer similar functionality, but conda is preferred because we can create blank environments, and install lots of different software in them, not just python stuff. For example, I have a conda environment called NGS in which I have installed fastqc, bwa and samtools (from bioconda) all of which are written in languages other than python, and I use these software to do some basic sequencing data analysis. I also have numpy, matplotlib and jupyter, which are python packages (from PyPI), and python=3.7 (from conda-forge), installed inside this NGS environment, to do custom analysis.

START OF DETOUR

Key-concepts.

conda related:

You can verify these concepts in more detail here.

If you want to try using conda, you can use the cheat sheet.

Combined example: You wrote edlib in C/C++. Someone else developed a recipe called python-edlib that can be used by a user to install edlib inside an environment such that the user is able to call edlib-functions from inside the python installed also within the same environment. The recipe is hosted at the bioconda channel.

Miniconda and Anaconda related.

These are distributions which you can install. They install the conda system and install some packages for the user. Anaconda comes with conda system and a ton of (unnecessary) packages, and a version of python. Miniconda just installs conda system mostly. If you have either Anaconda or Miniconda installed, you can start using conda to create environments and use it for developing, testing and using software.

PyPI related.

conda-pip relationship: Typically, for simple cases one should be able to pip install SomePackageName (observe no --user or other flags used) inside a conda-environment and use it without errors. For example, you can reliably do this for numpy and scipy among others, as opposed to say, fetching these packages from a channel like conda-forge.

More on their relationship: (1) https://www.anaconda.com/blog/understanding-conda-and-pip (2) https://www.anaconda.com/blog/using-pip-in-a-conda-environment (3) https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/

END OF DETOUR

Now, coming to the situation with edlib. You have a setup.py that can be used to install edlib extensions for usage from within python. You host the edlib source and this setup.py at pypi.org. All a user needs to do then is pip install edlib and it should install edlib from PyPI and make it available to the user from within python. The problem, currently, is that this pip install edlib installs edlib successfully inside a conda environment with python=3.8, but when import-ed, throws an undefined symbol error. This error does not happen with an equivalent environment with python=3.7.

Specifically, the error is undefined symbol: _ZNSs4_Rep20_S_empty_rep_storageE. From Googling around, I came across these SO posts.

(1) https://stackoverflow.com/questions/11643666/python-importerror-undefined-symbol-for-custom-c-module (2) https://stackoverflow.com/questions/47624829/custom-python-extension-import-error-undefined-symbol (3) https://stackoverflow.com/questions/51273418/import-error-undefined-symbol-znk9fastnoise8getnoiseeff-when-calling-c-ext (4) https://stackoverflow.com/questions/42364197/import-error-undefined-symbol-c-module-in-python-ztinst8ios-base7failureb5cx (5) https://stackoverflow.com/questions/56841420/python-c-extension-on-import-i-get-an-undefined-symbol-error (6) https://stackoverflow.com/questions/60619799/undefined-symbol-error-with-embedded-python-interpreter

To me (I know basic C from my undergrad, but I've never programmed in C++), this error looks like in python=3.8 based conda environments, there is a linking issue that edlib is facing. Specifically, the error looks like emerging from libstdc++'s basic_string.h.

Read more here: (1) https://reverseengineering.stackexchange.com/questions/10722/what-is-s-empty-rep-storage-used-for-in-this-code (2) https://stackoverflow.com/questions/57691078/shared-library-crash-when-project-compiled-with-o1-optimization-flag (3) http://gcc.1065356.n8.nabble.com/How-does-ZNSs4-Rep20-S-empty-rep-storageE-not-become-a-unique-global-symbol-td915504.html
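
The mangled name itself can be decoded by hand. The sketch below handles only the small subset of the Itanium C++ name mangling needed for this symbol (nested names made of length-prefixed identifiers plus the `St` and `Ss` abbreviations); `demangle_nested` is a made-up helper and not a replacement for a real demangler such as c++filt.

```python
def demangle_nested(symbol):
    """Decode a simple Itanium-ABI nested name such as _ZN...E.

    Supports only length-prefixed identifiers and the standard
    abbreviations "St" (std) and "Ss" (std::string). Illustrative only.
    """
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("not a simple nested name: %r" % symbol)
    body = symbol[3:-1]
    parts = []
    i = 0
    while i < len(body):
        if body.startswith("Ss", i):
            # Shorthand for std::basic_string<char, ...>, i.e. std::string.
            parts.append("std::string")
            i += 2
        elif body.startswith("St", i):
            parts.append("std")
            i += 2
        elif body[i].isdigit():
            # <length><identifier>
            j = i
            while j < len(body) and body[j].isdigit():
                j += 1
            length = int(body[i:j])
            parts.append(body[j:j + length])
            i = j + length
        else:
            raise ValueError("unsupported construct at offset %d" % i)
    return "::".join(parts)


print(demangle_nested("_ZNSs4_Rep20_S_empty_rep_storageE"))
# prints: std::string::_Rep::_S_empty_rep_storage
```

So the missing symbol is a static member of libstdc++'s `std::string` implementation, which is consistent with the guess that the extension ends up not being linked against (or not finding) a suitable libstdc++ at import time.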

When I realized this, I installed cxx-compiler inside the conda-environment, and then used pip install --upgrade edlib --no-cache-dir, which installed edlib and the import no longer threw the error.

I found this guide on building C++ extensions for python using pybind11, that you can look into as well, which ultimately may be a better way of building C++ extensions for python: https://www.benjack.io/2018/02/02/python-cpp-revisited.html

So, what are my final thoughts?

I prefer edlib from PyPI. I want to make it a dependency for project-x, which I plan to be a PyPI package, and usable within both conda and non-conda environments.

To that end, ideally I would like to specify edlib inside the install_requires argument inside my setup function inside my setup.py, so that a user simply has to run pip install project-x to get project-x and all its dependencies installed (pip would automatically install project-x along with all python dependency packages from names listed in install_requires argument including edlib, numpy etc. also from PyPI -- so that the user does not have to run pip install numpy, followed by pip install edlib etc. prior to installing project-x via pip install project-x). This ultimately means that there is low overhead for users installing and using project-x.
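
As a rough illustration of that setup, the keyword arguments for such a setup.py might look like the sketch below. Everything except the `edlib` and `numpy` requirement names is a placeholder: project-x is hypothetical, and the version and package names are made up.

```python
# Sketch of the arguments a setup.py for the hypothetical "project-x"
# might pass to setuptools.setup(). All values are placeholders except
# the dependency names.
setup_kwargs = dict(
    name="project-x",
    version="0.1.0",            # placeholder version
    packages=["project_x"],     # placeholder package name
    python_requires=">=3.6",    # project-x targets python=3.6 and above
    install_requires=[
        "edlib",                # built from source by pip if no wheel exists
        "numpy",
    ],
)

# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**setup_kwargs)
```

With this in place, pip install project-x would resolve and install edlib and numpy from PyPI without any extra steps from the user.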

With present situation, I can specify edlib in install_requires inside my setup.py, and this would work fine for non-conda-environments and with conda-environments with up to python=3.7, but inside conda-environments with python=3.8 and possibly above (recall, python=3.9 will be rolled out in near future), pip install project-x would install fine, but crash during import or use (via edlib's import error), which is of course, not desirable.

So, I have two options.

OPTION 1: I would have to ask conda-users to conda install -c bioconda python-edlib, and non-conda users to pip install edlib prior to installing project-x via pip install project-x, thus essentially removing edlib from install_requires altogether.

OPTION 2: Alternatively, I can retain edlib in install_requires inside my setup.py, and ask conda-users to conda install cxx-compiler prior to installing project-x via pip install project-x. Non-conda users won't have to do anything, and everything would be done automatically by setup.py.

Of course, OPTION 2 looks best for me, as of now.

Now coming to your specific questions:

Q. I understand that installation process for Edlib is still more complicated than one would like it to be, on bioconda, correct? What about if we install it directly via pip -> then it is fine, right? At least it seems to be fine for me, I just tried it and had to do no extra steps, just install edlib.

A. The installation process for bioconda is not complicated. You can install the bioconda version easily via conda install -c bioconda python-edlib. If you install from pip and you're not on conda, you're fine. So what are the problems? Well, previously there were two problems: (1) the bioconda version did not support envs with python=3.8, and (2) on conda environments with python=3.8 the pip-installed version would not import properly. The first problem was fixed recently, but the second problem still remains. I personally prefer installing edlib from PyPI using pip irrespective of whether I'm on conda or not, because that's the version for python-users that you maintain, and it is guaranteed to be the latest one, with all bugs fixed. Plus, the packages I write are pure python packages, and as such if edlib can be installed in all situations via pip, then it simplifies life for me and others who are using edlib as a dependency for their projects. Otherwise, special situations related to edlib become special situations for our projects as well.

Q. If that is so, I guess you are suggesting that we find a way to make bioconda install simpler, right? Any suggestions there on how to do it? Would you be ok with opening an issue on bioconda repo, where they are maintaining Edlib recipe? Maybe I am rushing here, I should probably understand better first what the exact problem is at the moment.

A. bioconda is kind of irrelevant here. The recipe is from a third party person, and they may not always be in a position to maintain it. My main point is that if pip install edlib works for python=3.7 based conda-environments, and on py=3.8 envs after cxx-compiler installation, then something is either weird with edlib from PyPI, that is resulting in weird compilation for py=3.8 on conda or the default compiler used to compile edlib in conda + py=3.8 prior to installing cxx-compiler is problematic. As discussed above, ultimately might be a linking issue from the default compiler. To reproduce, you will need to install Miniconda, create an environment with python=3.8, activate it and try pip install edlib. It'll install correctly, but you'll have issues importing it. You'll be able to fix it after you install cxx-compiler and then re-install edlib inside that environment.

Q. maybe I am confusing conda and bioconda? Also, when you say installation from PyPI -> does that still have to do something with conda, or not, why is conda-forge mentioned hm?

A. I've briefly explained some of the concepts above, and provided links from the makers of conda, which I hope you find useful. I specifically mean, installing edlib from PyPI inside a conda environment with python=3.8 -- this is the only issue at the moment. The python build is from conda-forge channel. But, again, python=3.7 from conda-forge does not show this issue.

I'm happy to answer any questions you might have. Sorry, this was a bit of a long post, but I hope you find this useful in understanding the situation at hand.

Martinsos commented 4 years ago

@ayaanhossain thank you, this is amazing!!! It took me a while to find enough time to read it :D, sorry about that - it really explained all the concepts, this is awesome. You should make a blog post out of this.

Ok, so besides me now understanding much better the relation between bioconda, conda, python, pip and so on, the final thought is, as you mentioned multiple times: pip install edlib works in a conda environment where python is < 3.8, but not if python >= 3.8. In that case installing cxx-compiler beforehand is needed. We want to figure out why this is so and modify setup.py so that it works in all situations, ideally. Additional point though: Edlib works for python 3.8 when used with it directly, it does not however work in a conda environment that uses python 3.8.

As you said, there seems to be a problem with some CPP libraries missing in this specific conda environment. While I use CPP for edlib, I haven't used it as my main language for some time, so my building/compiling knowledge is not where I would like it to be, and I don't have any immediate insights. Main ideas that come to me are, regarding how to proceed with this:

I also have to further study the specific error that you described, it might give us some more ideas. I am reopening this issue, because I think this is something we should fix!

Very wild guess: Maybe it would be enough to add libraries=["stdc++"] to the Extension( part in setup.py (based on this https://stackoverflow.com/questions/11939934/c-apache-module-fails-on-znss4-rep20-s-empty-rep-storagee) -> but I only half-understand what they are talking about there, so it is more of a direction than a likely final solution. I don't really understand why this has to be defined (because a different compiler is used in that specific conda environment? gcc or ld directly instead of g++?).
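
For concreteness, the idea could be sketched as below. This is not edlib's actual setup.py: the source files, include directory and compile flags are taken from the compile commands printed elsewhere in this thread, `make_extension` is a made-up helper, and whether `libraries=["stdc++"]` really resolves the import error is untested here.

```python
# Keyword arguments for a setuptools Extension that asks the linker to
# link libstdc++ explicitly. Values other than `libraries` are inferred
# from build output, not copied from edlib's real setup.py.
extension_kwargs = dict(
    name="edlib",
    sources=["edlib.bycython.cpp", "edlib/src/edlib.cpp"],
    include_dirs=["edlib/include"],
    language="c++",
    extra_compile_args=["-O3", "-std=c++11"],
    libraries=["stdc++"],   # adds -lstdc++ to the link command
)


def make_extension():
    """Build the Extension object (imports setuptools lazily)."""
    from setuptools import Extension
    return Extension(**extension_kwargs)
```

The only effect of the `libraries` entry is an extra `-lstdc++` on the link line, which matters mainly when the link step is driven by a C compiler driver instead of a C++ one.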

Or, maybe it is not built using setup.py when using pip install, from source, and instead it is using dynamically built binary? In that case that is the problem, and I should maybe aim to build it statically? Ok I really have to better figure out what I am uploading to PyPI and what is pip install edlib actually doing, does it build something or just downloads prebuilt binary or what.

Martinsos commented 4 years ago

@ayaanhossain , I just installed miniconda (on Archlinux) and I created new environment, which is python 3.8.5 environment if I am correct (that was chosen as default, I did not specify python version).

I entered the env, ran pip install edlib, opened the python interpreter, imported edlib and ran edlib.align(), and it worked!

I then removed edlib with pip uninstall edlib and ran conda install -C bioconda python-edlib, and I got the following error:

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - python-edlib
  - bioconda

Current channels:

  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

Then I added conda-forge and bioconda to channels, and while conda install -c bioconda python-edlib still did not work, conda install -c bioconda/label/cf201901 python-edlib did. And again edlib worked fine, I managed to import it into python interpreter and even run edlib.align().

Version of conda that I have is 4.8.3.

Martinsos commented 4 years ago

Regarding my questions about pip build from source vs using dynamic binary: I found here https://packaging.python.org/tutorials/installing-packages/#source-distributions-vs-wheels that pip will use wheel (which is their name for precompiled binary + some stuff) if there is one available, otherwise it will compile from source. Also, it will reuse local wheel if there is one already on the local machine from the previous local build (which happened from source).

Looking at my package on PyPI, I see that I never built and uploaded a wheel, which means if somebody is installing edlib via pip, that is certainly happening from source.
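
The two pip logs earlier in this thread show both cases: the first install used a `.whl` file from the local pip cache directory, while the `--no-cache-dir` install downloaded a `.tar.gz` from files.pythonhosted.org. A small sketch for telling the two apart by filename follows; `classify_distribution` is a hypothetical helper and it only covers the common naming patterns.

```python
def classify_distribution(filename):
    """Classify a distribution file as a wheel or an sdist by its name.

    Illustrative only: handles name-version-python-abi-platform.whl
    and name-version.tar.gz / .zip.
    """
    if filename.endswith(".whl"):
        parts = filename[:-len(".whl")].split("-")
        if len(parts) < 5:
            raise ValueError("unexpected wheel name: %r" % filename)
        python_tag, abi_tag, platform_tag = parts[-3:]
        return {
            "kind": "wheel",
            "name": parts[0],
            "version": parts[1],
            "python": python_tag,
            "abi": abi_tag,
            "platform": platform_tag,
        }
    for suffix in (".tar.gz", ".zip"):
        if filename.endswith(suffix):
            # Everything after the last "-" is treated as the version.
            name, _, version = filename[:-len(suffix)].rpartition("-")
            return {"kind": "sdist", "name": name, "version": version}
    raise ValueError("unknown distribution type: %r" % filename)


print(classify_distribution("edlib-1.3.8.post1-cp38-cp38-linux_x86_64.whl")["kind"])
print(classify_distribution("edlib-1.3.8.post1.tar.gz")["kind"])
```

A wheel sitting in the local cache was compiled on that machine during an earlier install, so a stale cached wheel can keep a bad build alive until the cache is cleared or bypassed.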

The command for removing the local wheel is pip cache remove edlib. I then did the following:

pip uninstall edlib
pip cache remove edlib
pip install --verbose edlib

And this installs from source + you can see the actual compilation commands.
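
When several Pythons are installed side by side (system, conda, manually compiled), it can be safer to invoke pip through the interpreter that is actually being tested. The sketch below builds the same three commands that way; `rebuild_commands` and `run_rebuild` are made-up helpers, `--yes` is added so the uninstall does not prompt, and the `pip cache` subcommand is assumed to be available (it only exists in reasonably recent pip versions).

```python
import subprocess
import sys


def rebuild_commands(package):
    """Return the argv lists for a clean from-source reinstall of `package`."""
    pip = [sys.executable, "-m", "pip"]   # pip of the running interpreter
    return [
        pip + ["uninstall", "--yes", package],
        pip + ["cache", "remove", package],
        pip + ["install", "--verbose", package],
    ]


def run_rebuild(package):
    """Run the commands in order, continuing past failures; return exit codes."""
    return [subprocess.call(cmd) for cmd in rebuild_commands(package)]


if __name__ == "__main__":
    for cmd in rebuild_commands("edlib"):
        print(" ".join(cmd))
```

Running it as a script only prints the commands; calling `run_rebuild("edlib")` would execute them against the current environment.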

I did this sequence of commands in the conda env I created, and I could see the compilation commands and all, and edlib again works.

@ayaanhossain , what might help us solve the whole thing might be if you could do the same thing and capture the compile commands used to build edlib on your side, my bet is that something is different (different compiler?), causing that problem with linking that you mentioned. Although, it should be all the same if we are using conda right? Or is conda using some stuff from the host system?
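
A quick way to compare toolchains between two machines without a full build is to print the compiler settings the interpreter carries. This is a sketch under an assumption: as far as I understand, setup.py-style builds on Linux start from the `CC`, `CXX` and `LDSHARED` values recorded when Python itself was built, and environment variables of the same names take precedence. `build_toolchain` is a made-up helper name.

```python
import os
import sysconfig


def build_toolchain():
    """Collect the recorded and overriding compiler settings (illustrative)."""
    report = {}
    for var in ("CC", "CXX", "LDSHARED"):
        report[var] = {
            # Value baked into this Python build (may be None on some platforms).
            "recorded": sysconfig.get_config_var(var),
            # Value from the current shell, if any.
            "override": os.environ.get(var),
        }
    return report


if __name__ == "__main__":
    for var, values in build_toolchain().items():
        print(var, values)
```

Differences in this output between a working and a failing environment would narrow down whether the compiler or the linker driver is the part that changed.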

Here are compile commands that were printed in my case:

gcc -pthread -B /home/martin/.conda/envs/myenv/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Iedlib/include -I/home/martin/.conda/envs/myenv/include/python3.8 -c edlib.bycython.cpp -o build/temp.linux-x86_64-3.8/edlib.bycython.o -O3 -std=c++11
cc1plus: warning: command-line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
gcc -pthread -B /home/martin/.conda/envs/myenv/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Iedlib/include -I/home/martin/.conda/envs/myenv/include/python3.8 -c edlib/src/edlib.cpp -o build/temp.linux-x86_64-3.8/edlib/src/edlib.o -O3 -std=c++11
cc1plus: warning: command-line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
creating build/lib.linux-x86_64-3.8
g++ -pthread -shared -B /home/martin/.conda/envs/myenv/compiler_compat -L/home/martin/.conda/envs/myenv/lib -Wl,-rpath=/home/martin/.conda/envs/myenv/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.8/edlib.bycython.o build/temp.linux-x86_64-3.8/edlib/src/edlib.o -o build/lib.linux-x86_64-3.8/edlib.cpython-38-x86_64-linux-gnu.so

Output of entering python interpreter in my env:

Python 3.8.5 (default, Sep  4 2020, 07:30:14) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

Output of my conda info:

     active environment : myenv
    active env location : /home/martin/.conda/envs/myenv
            shell level : 1
       user config file : /home/martin/.condarc
 populated config files : /home/martin/.condarc
          conda version : 4.8.3
    conda-build version : not installed
         python version : 3.8.3.final.0
       virtual packages : __glibc=2.32
       base environment : /opt/miniconda3  (read only)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
                          https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://conda.anaconda.org/bioconda/linux-64
                          https://conda.anaconda.org/bioconda/noarch
          package cache : /opt/miniconda3/pkgs
                          /home/martin/.conda/pkgs
       envs directories : /home/martin/.conda/envs
                          /opt/miniconda3/envs
               platform : linux-64
             user-agent : conda/4.8.3 requests/2.23.0 CPython/3.8.3 Linux/5.8.7-arch1-1 arch/rolling glibc/2.32
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False

cjw85 commented 3 years ago

@Martinsos

Getting a bit off topic from the original thread, but since you mentioned python wheels in your last response... have you thought about uploading python wheels? Given that you have already a successfully working source distribution for python (the "sdist") it shouldn't be difficult to create wheels, e.g. by following https://github.com/pypa/python-manylinux-demo, or even https://github.com/joerick/cibuildwheel.

I might have time to provide a PR if this is something you do not have time for.

https://github.com/nanoporetech/medaka/issues/209

Martinsos commented 3 years ago

Hi @cjw85 , sure, thanks for taking the initiative! Let's continue the discussion about wheels in the PR :).

ayaanhossain commented 3 years ago

Hi Martin, I am extremely sorry for vanishing for the last couple of months from this thread. But I have since changed my computer, re-installed everything, and today I tested edlib on conda environments with both 3.8 and 3.9, it all works like a charm, no issues anywhere at all! Either you guys fixed something upstream, or something was changed by the folks at conda and this issue is no longer existent. Thank you so much for your help, I think we can close this issue now -- once and for all.

Martinsos commented 3 years ago

Well that is awesome @ayaanhossain ! I believe wheels might have helped with this, since it is probably not getting built from the source any more. Thanks again for all the help :).