kokkos / pykokkos-base

Python bindings for data interoperability with Kokkos (View, DynRankView)

BLD: providing binary wheels on PyPI #41

Open tylerjereddy opened 2 years ago

tylerjereddy commented 2 years ago

@namehta4 @NaderAlAwar

Since pykokkos-base is not actually needed to build pykokkos (the distinction between genuine build-time and runtime dependencies), and because PEP 518-based pip installs will build pykokkos on its own in an isolated env before installing it to a local user env, it will still be up to the user to provide a suitable version of pykokkos-base in their environment, regardless of how we install pykokkos with pip (the same would apply to providing a suitable version of NumPy when working with SciPy, for example).

So, in the pip/PyPI ecosystem, I suspect the only way for us to reduce build/install friction is to:

The latter would likely be a substantial lift, and I'm not sure how we'd handle OMP, CUDA backend library shipping, though some libs like pytorch or tensorflow could likely be used as inspiration for that.
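As an illustration of the build-time vs. runtime distinction above, here is a minimal sketch of a runtime check that reports a missing user-provided dependency clearly. The helper name `require_runtime_dependency` is hypothetical, and the import name `kokkos` for pykokkos-base is an assumption, not something stated in this thread.

```python
import importlib.util


def require_runtime_dependency(module_name, hint):
    """Raise a readable ImportError if a runtime-only dependency is absent.

    Hypothetical helper: it only checks that a top-level module can be
    located in the current environment, it does not check versions.
    """
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name!r} is required at runtime but was not found. {hint}"
        )
    return True


# Assumed usage (the import name `kokkos` is an assumption):
# require_runtime_dependency("kokkos", "Install pykokkos-base first.")
```

A check like this does not remove the need for the user to supply pykokkos-base; it only makes the failure mode easier to read.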

jrmadsen commented 2 years ago

I'm not sure how we'd handle OMP, CUDA backend library shipping, though some libs like pytorch or tensorflow could likely be used as inspiration for that.

Create conda-forge packaging.

tylerjereddy commented 2 years ago

Conda and PyPI are two completely different ecosystems, and mixing them is not really recommended.

namehta4 commented 2 years ago

Hi Jonathan, Hi Tyler

We have a conda package ready for pykokkos-base. I believe we also have a PyPI version of pykokkos-base available (albeit with some issues with flags). I think the problem here is to create a one-line pip install for pykokkos which will install pykokkos-base as a dependency, which I was unable to do. It was possible to do this with pre-compiled binaries but not otherwise.

On Thu, Jun 30, 2022 at 11:12 AM Tyler Reddy wrote:

Conda and PyPI are two completely different ecosystems, and mixing them is not really recommended.

-- Thanks and regards, Neil Mehta Performance Engineer, NERSC Lawrence Berkeley National Laboratory

jrmadsen commented 2 years ago

Conda and PyPI are two completely different ecosystems, and mixing them is not really recommended.

@tylerjereddy

Yes, but when it comes to mixing C++ and Python, pip is not designed to handle compilation and build variants well at all. Pip is a Python-specific package manager. Conda is a generic package manager. This makes a big difference when it comes to handling compilation and build variants. Thus, I've found that using pip for source installs and conda for pre-built binaries is the best thing to do.

jrmadsen commented 2 years ago

I think the problem here is to create a one-line pip install for pykokkos which will install pykokkos-base as a dependency

@namehta4

I am pretty sure that you could add pykokkos-base to your requirements.txt and then, very early on in your pykokkos setup.py, do os.environ["PYKOKKOS_BASE_SETUP_ARGS"] = "<cmake arguments>" so that when the requirements get resolved, pykokkos-base will inherit that environment variable. AFAIK, the pip build isolation just limits the scope of which packages you can import but doesn't clear the environment.
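The suggestion above could be sketched roughly as follows. This is only an illustration: the helper name `export_base_setup_args` is hypothetical, the example flag is the one mentioned elsewhere in this thread, and whether pip really propagates the variable to a dependency build is the commenter's belief, not something verified here.

```python
import os


def export_base_setup_args(cmake_args, environ=os.environ):
    """Set PYKOKKOS_BASE_SETUP_ARGS unless the user already chose a value.

    Hypothetical helper meant to be called near the top of setup.py,
    before any requirement is resolved.
    """
    # setdefault keeps a value the user exported themselves.
    environ.setdefault("PYKOKKOS_BASE_SETUP_ARGS", " ".join(cmake_args))
    return environ["PYKOKKOS_BASE_SETUP_ARGS"]


# Assumed usage at the top of a setup.py:
# export_base_setup_args(["-DENABLE_CUDA=OFF"])
```

Using `setdefault` rather than plain assignment is a design choice in this sketch so that a user's own exported value is never overwritten.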

tylerjereddy commented 2 years ago

SciPy and NumPy both mix C++ and Python (and Fortran for SciPy) and provide solutions in both ecosystems. I think tensorflow and pytorch do binaries in both ecosystems as well. If you don't want to provide a solution in the PyPI ecosystem, fair enough, but there certainly are PEPs/standards for doing these kinds of things; it just takes time, and @NaderAlAwar wanted us to look into it a bit.

I think the problem here is to create a one-line pip install for pykokkos which will install pykokkos-base as a dependency, which I was unable to do

pip install is meant to install a single package into the user env by design, so that isn't going to work really. A user may have their own pykokkos-base they won't want to supersede, etc., so that's why I'm suggesting that we use a PyPI wheel to at least simplify the manual provision of a binary. If @jrmadsen wants to document/encourage a conda install in an env that is otherwise PyPI based, fair enough, though I can't recommend that in general.

I am pretty sure that you could add pykokkos-base in your requirements.txt and then in your pykokkos setup.py (very early on) you could do os.environ["PYKOKKOS_BASE_SETUP_ARGS"] = "<cmake arguments>" so that when the requirements get resolved, pykokkos-base will inherit that environment variable. AFAIK, the pip build isolation just limits the scope of which packages you can import but doesn't clear the environment

I wouldn't recommend this--you're polluting the user environment with packages they didn't ask for, which is part of what the various PEPs/standards are designed to protect against.

jrmadsen commented 2 years ago

SciPy and NumPy both mix C++ and Python (and Fortran for SciPy) and provide solutions in both ecosystems.

Yes, but they don't have to deal with multiple backends causing build variants.

I think tensorflow and pytorch do binaries in both ecosystems as well.

Yes, but those are wheels without any acceleration. If anything, I think it would be better to just create "new packages" like pykokkos-base-openmp, which are wheels for variants.

tylerjereddy commented 2 years ago

Yes, but they don't have to deal with multiple backends causing build variants.

There are variants in the linear algebra backends the libraries are built against. At least for SciPy, that currently means choosing a variant and just shipping that as the default (i.e., OpenBLAS instead of the reference implementation or the Intel MKL stuff).

conda is indeed currently better for swapping backends, but there's still a tendency to serve both communities/ecosystems because they're both large, etc.

jrmadsen commented 2 years ago

I just did a pip install pykokkos-base and ran into an issue because it defaulted to enabling CUDA and I didn't have nvcc in my path. All I had to do was export PYKOKKOS_BASE_SETUP_ARGS="-DENABLE_CUDA=OFF". The quickest, easiest fix to make it less painful would be to just not default to CUDA, because once that was fixed, installing from source took less than 3 minutes.
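The workaround described above can be scripted. Below is a minimal sketch, assuming a source install driven from Python; the function names `source_install_env` and `install_from_source` are hypothetical, and only the environment variable and the -DENABLE_CUDA=OFF flag come from this thread.

```python
import os
import shutil
import subprocess
import sys


def source_install_env(base_env=None, have_nvcc=None):
    """Return an environment for building pykokkos-base from source.

    CUDA is switched off only when no nvcc is found and the user has not
    already set PYKOKKOS_BASE_SETUP_ARGS; otherwise nothing is changed.
    """
    env = dict(os.environ if base_env is None else base_env)
    if have_nvcc is None:
        # Look for the CUDA compiler on the current PATH.
        have_nvcc = shutil.which("nvcc") is not None
    if not have_nvcc and "PYKOKKOS_BASE_SETUP_ARGS" not in env:
        env["PYKOKKOS_BASE_SETUP_ARGS"] = "-DENABLE_CUDA=OFF"
    return env


def install_from_source(env):
    """Run `python -m pip install pykokkos-base` with the given environment."""
    cmd = [sys.executable, "-m", "pip", "install", "pykokkos-base"]
    return subprocess.run(cmd, env=env, check=True)


# Assumed usage (not executed here, since it would start a real build):
# install_from_source(source_install_env())
```

This only automates the manual export; it does not change the package's own default of enabling CUDA.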