omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

Install bgen from conda-forge? #86

Closed: jdblischak closed this issue 2 years ago

jdblischak commented 2 years ago

I got an error when trying to install the polyfun conda environment on macOS because bgen failed to be installed by pip.

https://github.com/omerwe/polyfun/blob/29c37eb8ededdda195b2706b903d89691930ff40/polyfun.yml#L28-L29

Looking at the commit that added this, I wasn't sure why it was installed with pip instead of conda.

https://github.com/omerwe/polyfun/commit/ecc80ad98b8db0a731762e14956688e5c71ac8c5#diff-fa1cae805b4b5bff921455ed78ce95e52e6e7c34f9b5e76c20a540f002787388

It's available from conda-forge, and conda is the recommended installation method in the bgen README.

https://anaconda.org/conda-forge/bgen

https://github.com/limix/bgen#install

Could polyfun.yml be updated to install bgen from conda-forge, or is there a reason it needs to be downloaded from PyPI and compiled?

omerwe commented 2 years ago

Hi @jdblischak,

There are actually two different Python packages called bgen...

The version that PolyFun uses (installed via pip install bgen) is this: https://github.com/jeremymcrae/bgen

The version that you pointed to is this: https://github.com/limix/bgen

Yes, it's confusing...

I think I tested both versions and found the first one (which is only available via pip install bgen) better suited to my needs, though I don't remember the details, sorry...

Can you please share details of why the pip install bgen command failed?
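Since both projects use the name bgen, it can help to check which one is actually installed in a given environment. The following is a minimal sketch using only the standard library; the helper name `describe_distribution` is hypothetical, and which metadata fields each project fills in is an assumption, so treat the printed links as a hint rather than proof.

```python
from importlib import metadata


def describe_distribution(name):
    """Return version and project links for an installed distribution, or None."""
    try:
        meta = metadata.metadata(name)
    except metadata.PackageNotFoundError:
        # Nothing with this distribution name is installed in this environment.
        return None
    return {
        "version": meta["Version"],
        # Older packages set Home-page; newer ones often use Project-URL only.
        "home_page": meta.get("Home-page"),
        "project_urls": meta.get_all("Project-URL") or [],
    }


if __name__ == "__main__":
    info = describe_distribution("bgen")
    print(info if info is not None else "no distribution named 'bgen' is installed")
```

If the reported links point at github.com/jeremymcrae/bgen, the environment has the package PolyFun expects; links to github.com/limix/bgen would indicate the other project.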

jdblischak commented 2 years ago

There are actually two different Python packages called bgen...

Yes, it's confusing...

Oh jeez, that is confusing. And unfortunately jeremymcrae/bgen isn't available from either conda-forge or bioconda.

Can you please share details of why the pip install bgen command failed?

Here's what I see when I try to install bgen on macOS. Looks like a compiler issue.

mamba create -y -n test-bgen python=3.10 pip
conda activate test-bgen
pip install bgen
Collecting bgen
  Using cached bgen-1.2.15.tar.gz (165 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting numpy
  Downloading numpy-1.22.2-cp310-cp310-macosx_10_14_x86_64.whl (17.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.6/17.6 MB 8.7 MB/s eta 0:00:00
Building wheels for collected packages: bgen
  Building wheel for bgen (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for bgen (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [80 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-10.9-x86_64-3.10
      creating build/lib.macosx-10.9-x86_64-3.10/bgen
      copying src/bgen/index.py -> build/lib.macosx-10.9-x86_64-3.10/bgen
      copying src/bgen/__init__.py -> build/lib.macosx-10.9-x86_64-3.10/bgen
      running build_ext
      building 'bgen.reader' extension
      creating build/temp.macosx-10.9-x86_64-3.10
      creating build/temp.macosx-10.9-x86_64-3.10/src
      creating build/temp.macosx-10.9-x86_64-3.10/src/bgen
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /usr/local/Caskroom/miniforge/base/envs/test-bgen/include -fPIC -O2 -isystem /usr/local/Caskroom/miniforge/base/envs/test-bgen/include -Isrc/ -Isrc/zstd/lib -I/usr/local/Caskroom/miniforge/base/envs/test-bgen/include/python3.10 -c src/bgen.cpp -o build/temp.macosx-10.9-x86_64-3.10/src/bgen.o -std=c++11 -I/usr/include -stdlib=libc++ -mmacosx-version-min=10.9 -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -mavx -mavx2
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:317:9: error: no member named 'signbit' in the global namespace
      using ::signbit;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:318:9: error: no member named 'fpclassify' in the global namespace
      using ::fpclassify;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:319:9: error: no member named 'isfinite' in the global namespace; did you mean 'finite'?
      using ::isfinite;
            ~~^
      /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/math.h:752:12: note: 'finite' declared here
      extern int finite(double)
                 ^
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:320:9: error: no member named 'isinf' in the global namespace
      using ::isinf;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:321:9: error: no member named 'isnan' in the global namespace
      using ::isnan;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:322:9: error: no member named 'isnormal' in the global namespace
      using ::isnormal;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:323:7: error: no member named 'isgreater' in the global namespace; did you mean '::std::greater'?
      using ::isgreater;
            ^~
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/functional:738:29: note: '::std::greater' declared here
      struct _LIBCPP_TEMPLATE_VIS greater : binary_function<_Tp, _Tp, bool>
                                  ^
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:324:7: error: no member named 'isgreaterequal' in the global namespace; did you mean '::std::greater_equal'?
      using ::isgreaterequal;
            ^~
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/functional:767:29: note: '::std::greater_equal' declared here
      struct _LIBCPP_TEMPLATE_VIS greater_equal : binary_function<_Tp, _Tp, bool>
                                  ^
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:325:9: error: no member named 'isless' in the global namespace
      using ::isless;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:326:9: error: no member named 'islessequal' in the global namespace
      using ::islessequal;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:327:9: error: no member named 'islessgreater' in the global namespace
      using ::islessgreater;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:328:9: error: no member named 'isunordered' in the global namespace
      using ::isunordered;
            ~~^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:329:9: error: no member named 'isunordered' in the global namespace
      using ::isunordered;
            ~~^
      13 errors generated.
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for bgen
Failed to build bgen
ERROR: Could not build wheels for bgen, which is required to install pyproject.toml-based projects

jdblischak commented 2 years ago

(To clarify, this isn't a huge deal for me. I typically run PolyFun in an HPC setting; yesterday I just happened to create and test my PR locally. I figured that if bgen were available from conda-forge, we should use it. We could submit a new conda-forge recipe under a slightly altered name, but I don't have the bandwidth for that at the moment.)

omerwe commented 2 years ago

Yep, the Python package ecosystem is a jungle. Sorry about this error, but I don't think I can do anything on my end. You could try opening an issue on the bgen package's GitHub page.

If you ever get around to setting up a conda version, please let me know and I'll update PolyFun to use that!

I guess we can close this issue?

jdblischak commented 2 years ago

the Python package ecosystem is a jungle. Sorry about this error, but I don't think I can do anything on my end.

Agreed. This isn't a PolyFun issue. I mainly wanted clarification on the package choice. I never install code from source on my Mac (I always use conda), so I'd need to take the time to configure my local compilers first if I wanted to compile this package locally. Since I mainly use my HPC, I have little motivation to troubleshoot it.

If you ever get around to setting up a conda version, please let me know and I'll update PolyFun to use that!

I had a try at creating a recipe, but was unsuccessful. I'm leaving my notes here below in case I or someone else wants to continue working on this.

I now better understand the bgen ecosystem of packages provided by limix:

Instead of having multiple separate repos, jeremymcrae/bgen is a mono-repo where the C++ code is wrapped with Cython to use in the Python package. Unfortunately the use of Cython makes this a more complicated recipe to create. I couldn't even get conda skeleton pypi bgen to run. I got the following error:

  from Cython.Build import cythonize
ModuleNotFoundError: No module named 'Cython'

For whatever reason, when conda creates an environment to execute setup.py, it doesn't include Cython, and thus everything crashes and burns.

https://github.com/jeremymcrae/bgen/blob/d1708e97bcff9a743866383f2bbe6907be93fd64/setup.py#L10
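For anyone picking this up: one pattern some projects use to survive a missing Cython is to avoid the unconditional top-level import and fall back to pre-generated C++ sources. The sketch below only illustrates that idea. It is not what jeremymcrae/bgen does, the file names are hypothetical, it only helps if the generated .cpp files ship in the source distribution, and whether it would get conda skeleton past this error is untested.

```python
import importlib.util
from pathlib import PurePosixPath


def cython_available():
    """True if Cython can be imported in the current environment."""
    return importlib.util.find_spec("Cython") is not None


def extension_sources(sources, use_cython):
    """Keep .pyx sources when Cython is present; otherwise point at the
    pre-generated .cpp files that would have to ship in the sdist."""
    if use_cython:
        return list(sources)
    return [
        str(PurePosixPath(s).with_suffix(".cpp")) if s.endswith(".pyx") else s
        for s in sources
    ]


# Hypothetical file names, for illustration only.
sources = extension_sources(
    ["src/bgen/reader.pyx", "src/bgen.cpp"], use_cython=cython_available()
)
```

In a setup.py, the call to cythonize would then be made only when `cython_available()` returns True, and the plain source list would be passed to the extension otherwise.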

I assumed that the use of Cython in a Python package to wrap C/C++ code would be a common use case, but I found surprisingly little advice when searching online. I could only find one other conda recipe that generated this error, and they cryptically solved it by installing the package from GitHub instead of PyPI (but it's not clear what the difference was between the GitHub dev version and published PyPI version, so I can't attempt to do the same).

https://github.com/conda-forge/staged-recipes/pull/13812#issuecomment-769350454

Looking through the conda-build repo, there are multiple Issues about this problem of executing setup.py without first installing the build dependencies. However, they are all years old, and the problem was supposedly fixed.

https://github.com/conda/conda-build/issues/2765

https://github.com/conda/conda-build/issues/1439

https://github.com/conda/conda-build/pull/1431

So with few examples and the problem supposedly already solved, I don't know what to try next. If anyone reading this is an expert in Cython and conda, your advice would be much appreciated!

bschilder commented 2 years ago

@jdblischak just FYI, I was wondering if the build errors were at all related to the Python version so I tried python=3.7, but got the same error message you did:

conda create -y -n test-bgen python=3.7 pip
conda activate test-bgen
pip install bgen
Collecting bgen
  Using cached bgen-1.2.15.tar.gz (165 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting numpy
  Downloading numpy-1.21.5-cp37-cp37m-macosx_10_9_x86_64.whl (16.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.9/16.9 MB 36.7 MB/s eta 0:00:00
Building wheels for collected packages: bgen
  Building wheel for bgen (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for bgen (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [80 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-10.9-x86_64-3.7
      creating build/lib.macosx-10.9-x86_64-3.7/bgen
      copying src/bgen/index.py -> build/lib.macosx-10.9-x86_64-3.7/bgen
      copying src/bgen/__init__.py -> build/lib.macosx-10.9-x86_64-3.7/bgen
      running build_ext
      building 'bgen.reader' extension
      creating build/temp.macosx-10.9-x86_64-3.7
      creating build/temp.macosx-10.9-x86_64-3.7/src
      creating build/temp.macosx-10.9-x86_64-3.7/src/bgen
      gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/anaconda3/envs/test-bgen/include -arch x86_64 -I/opt/anaconda3/envs/test-bgen/include -arch x86_64 -Isrc/ -Isrc/zstd/lib -I/opt/anaconda3/envs/test-bgen/include/python3.7m -c src/bgen.cpp -o build/temp.macosx-10.9-x86_64-3.7/src/bgen.o -std=c++11 -I/usr/include -stdlib=libc++ -mmacosx-version-min=10.9 -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -mavx -mavx2
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:321:9: error: no member named 'signbit' in the global namespace
      using ::signbit;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:322:9: error: no member named 'fpclassify' in the global namespace
      using ::fpclassify;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:323:9: error: no member named 'isfinite' in the global namespace; did you mean 'finite'?
      using ::isfinite;
            ~~^
      /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/math.h:752:12: note: 'finite' declared here
      extern int finite(double)
                 ^
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:324:9: error: no member named 'isinf' in the global namespace
      using ::isinf;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:325:9: error: no member named 'isnan' in the global namespace
      using ::isnan;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:326:9: error: no member named 'isnormal' in the global namespace
      using ::isnormal;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:327:7: error: no member named 'isgreater' in the global namespace; did you mean '::std::greater'?
      using ::isgreater;
            ^~
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/functional:738:29: note: '::std::greater' declared here
      struct _LIBCPP_TEMPLATE_VIS greater : binary_function<_Tp, _Tp, bool>
                                  ^
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:328:7: error: no member named 'isgreaterequal' in the global namespace; did you mean '::std::greater_equal'?
      using ::isgreaterequal;
            ^~
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/functional:767:29: note: '::std::greater_equal' declared here
      struct _LIBCPP_TEMPLATE_VIS greater_equal : binary_function<_Tp, _Tp, bool>
                                  ^
      In file included from src/bgen.cpp:2:
      In file included from src/bgen.h:8:
      In file included from src/header.h:11:
      In file included from src/utils.h:5:
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:329:9: error: no member named 'isless' in the global namespace
      using ::isless;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:330:9: error: no member named 'islessequal' in the global namespace
      using ::islessequal;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:331:9: error: no member named 'islessgreater' in the global namespace
      using ::islessgreater;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:332:9: error: no member named 'isunordered' in the global namespace
      using ::isunordered;
            ~~^
      /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/v1/cmath:333:9: error: no member named 'isunordered' in the global namespace
      using ::isunordered;
            ~~^
      13 errors generated.
      error: command '/usr/bin/gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for bgen
Failed to build bgen
ERROR: Could not build wheels for bgen, which is required to install pyproject.toml-based projects
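
Both logs above end with the same 13 errors, each for a name that `<cmath>` tries to import from the global namespace. When comparing logs from different machines or Python versions, a short script can confirm they fail on the same names. This is a minimal sketch: the function name is hypothetical, and the sample lines are abbreviated from the logs in this thread.

```python
import re
from collections import Counter

# Matches the clang diagnostic seen in both logs above.
MISSING_MEMBER = re.compile(
    r"error: no member named '(?P<name>\w+)' in the global namespace"
)


def missing_global_members(log_text):
    """Count each name the compiler could not find in the global namespace."""
    return Counter(m.group("name") for m in MISSING_MEMBER.finditer(log_text))


# Abbreviated lines from the logs above (header paths shortened).
sample = """\
cmath:317:9: error: no member named 'signbit' in the global namespace
cmath:319:9: error: no member named 'isfinite' in the global namespace; did you mean 'finite'?
cmath:328:9: error: no member named 'isunordered' in the global namespace
cmath:329:9: error: no member named 'isunordered' in the global namespace
"""
counts = missing_global_members(sample)
print(sorted(counts.items()))
```

Running it over two full logs and comparing the resulting counters shows whether the failures match, which is the case for the Python 3.10 and 3.7 attempts in this thread.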