conda-forge / cupy-feedstock

A conda-smithy repository for cupy.
BSD 3-Clause "New" or "Revised" License

Attempt to make a "CPU" only version #260

Closed hmaarrfk closed 6 months ago

hmaarrfk commented 6 months ago

The goal is to spare CPU-only users a 700 MB download when they add this package as a dependency.

conda-forge-webservices[bot] commented 6 months ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

leofang commented 6 months ago

Hmmm... Could you please just install cupy-core in a CPU-only env?

hmaarrfk commented 6 months ago

> Hmmm... Could you please just install cupy-core in a CPU-only env?

I'll try. I take it that, in your opinion, import cupy should work and the full Python namespace should exist?

leofang commented 6 months ago

Yes that's right.

hmaarrfk commented 6 months ago

> Yes that's right.

I anticipated that this would be the case. Might be a pretty big uphill battle, but I'll likely try periodically over the year.

jakirkham commented 6 months ago

Hey Mark, could you please help us understand what issues you are still encountering?

On my MacBook Pro M1 (no NVIDIA GPU), I was able to use one of our conda-forge containers, install cupy-core, and import cupy without issues. Please see details below:

```
$ docker run --rm -it quay.io/condaforge/linux-anvil-aarch64
[conda@31081928c771 ~]$ conda activate
(base) [conda@31081928c771 ~]$ conda create -n cupy python=3.11 cupy-core=13
Channels:
 - conda-forge
Platform: linux-aarch64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/cupy

  added / updated specs:
    - cupy-core=13
    - python=3.11


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    _openmp_mutex-4.5          |            2_gnu          23 KB  conda-forge
    cuda-version-12.4          |       h3060b56_3          21 KB  conda-forge
    cupy-core-13.0.0           |  py311h51108e9_3        41.9 MB  conda-forge
    fastrlock-0.8.2            |  py311h8715677_2          37 KB  conda-forge
    libblas-3.9.0              |21_linuxaarch64_openblas          14 KB  conda-forge
    libcblas-3.9.0             |21_linuxaarch64_openblas          14 KB  conda-forge
    libgfortran-ng-13.2.0      |       he9431aa_5          23 KB  conda-forge
    libgfortran5-13.2.0        |       h582850c_5         1.0 MB  conda-forge
    liblapack-3.9.0            |21_linuxaarch64_openblas          14 KB  conda-forge
    libopenblas-0.3.26         |pthreads_h5a5ec62_0         4.1 MB  conda-forge
    numpy-1.26.4               |  py311h69ead2a_0         6.9 MB  conda-forge
    python-3.11.8              |h43d1f9e_0_cpython        14.6 MB  conda-forge
    python_abi-3.11            |          4_cp311           6 KB  conda-forge
    ------------------------------------------------------------
                                           Total:        68.7 MB

The following NEW packages will be INSTALLED:

  _openmp_mutex      conda-forge/linux-aarch64::_openmp_mutex-4.5-2_gnu
  bzip2              conda-forge/linux-aarch64::bzip2-1.0.8-h31becfc_5
  ca-certificates    conda-forge/linux-aarch64::ca-certificates-2024.2.2-hcefe29a_0
  cuda-version       conda-forge/noarch::cuda-version-12.4-h3060b56_3
  cupy-core          conda-forge/linux-aarch64::cupy-core-13.0.0-py311h51108e9_3
  fastrlock          conda-forge/linux-aarch64::fastrlock-0.8.2-py311h8715677_2
  ld_impl_linux-aar~ conda-forge/linux-aarch64::ld_impl_linux-aarch64-2.40-h2d8c526_0
  libblas            conda-forge/linux-aarch64::libblas-3.9.0-21_linuxaarch64_openblas
  libcblas           conda-forge/linux-aarch64::libcblas-3.9.0-21_linuxaarch64_openblas
  libexpat           conda-forge/linux-aarch64::libexpat-2.6.2-h2f0025b_0
  libffi             conda-forge/linux-aarch64::libffi-3.4.2-h3557bc0_5
  libgcc-ng          conda-forge/linux-aarch64::libgcc-ng-13.2.0-hf8544c7_5
  libgfortran-ng     conda-forge/linux-aarch64::libgfortran-ng-13.2.0-he9431aa_5
  libgfortran5       conda-forge/linux-aarch64::libgfortran5-13.2.0-h582850c_5
  libgomp            conda-forge/linux-aarch64::libgomp-13.2.0-hf8544c7_5
  liblapack          conda-forge/linux-aarch64::liblapack-3.9.0-21_linuxaarch64_openblas
  libnsl             conda-forge/linux-aarch64::libnsl-2.0.1-h31becfc_0
  libopenblas        conda-forge/linux-aarch64::libopenblas-0.3.26-pthreads_h5a5ec62_0
  libsqlite          conda-forge/linux-aarch64::libsqlite-3.45.2-h194ca79_0
  libstdcxx-ng       conda-forge/linux-aarch64::libstdcxx-ng-13.2.0-h9a76618_5
  libuuid            conda-forge/linux-aarch64::libuuid-2.38.1-hb4cce97_0
  libxcrypt          conda-forge/linux-aarch64::libxcrypt-4.4.36-h31becfc_1
  libzlib            conda-forge/linux-aarch64::libzlib-1.2.13-h31becfc_5
  ncurses            conda-forge/linux-aarch64::ncurses-6.4.20240210-h0425590_0
  numpy              conda-forge/linux-aarch64::numpy-1.26.4-py311h69ead2a_0
  openssl            conda-forge/linux-aarch64::openssl-3.2.1-h31becfc_1
  pip                conda-forge/noarch::pip-24.0-pyhd8ed1ab_0
  python             conda-forge/linux-aarch64::python-3.11.8-h43d1f9e_0_cpython
  python_abi         conda-forge/linux-aarch64::python_abi-3.11-4_cp311
  readline           conda-forge/linux-aarch64::readline-8.2-h8fc344f_1
  setuptools         conda-forge/noarch::setuptools-69.2.0-pyhd8ed1ab_0
  tk                 conda-forge/linux-aarch64::tk-8.6.13-h194ca79_0
  tzdata             conda-forge/noarch::tzdata-2024a-h0c530f3_0
  wheel              conda-forge/noarch::wheel-0.43.0-pyhd8ed1ab_1
  xz                 conda-forge/linux-aarch64::xz-5.2.6-h9cdd2b7_0


Proceed ([y]/n)?


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate cupy
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) [conda@31081928c771 ~]$ conda activate cupy
(cupy) [conda@31081928c771 ~]$ python -c "import cupy" ; echo $?
0
```

hmaarrfk commented 6 months ago

Please see https://github.com/conda-forge/staged-recipes/pull/25925

I want, in order of priority:

  1. The final package to be installable on all platforms (Windows, OSX, Linux), with and without CUDA.
  2. cupy to be installed "by default", so that my package can use hardware acceleration.
  3. CPU-only users to avoid 700 MB of downloads for CUDA libraries.

Other packages, such as tensorflow and pytorch, have a "cpu-only" version that allows packages to depend on them without incurring the large downloads.

jakirkham commented 6 months ago

Could you please share a bit more on why macOS is needed? AFAICT that hasn't been supported by CuPy in a while (https://github.com/cupy/cupy/pull/3857).

Otherwise it sounds like cupy-core maps pretty closely to what you are looking for.

The other consideration would be to use the Array API in the library, which both NumPy and CuPy support. That way the code can seamlessly work on CPU or GPU based on the array type provided.
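
The Array API suggestion can be sketched roughly as below. This is only an illustration under stated assumptions, not code from this thread: `get_namespace` and `normalize` are hypothetical helper names, NumPy is assumed to be installed, and CuPy (when present) is assumed to be importable without a GPU, as with cupy-core.

```python
def get_namespace(x):
    """Return the array module (e.g. numpy or cupy) that matches x."""
    # Array types that implement the Array API standard expose this method.
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    try:
        import cupy  # optional dependency; may not be installed at all
    except ImportError:
        import numpy
        return numpy
    # cupy.get_array_module returns numpy for NumPy arrays, cupy for CuPy arrays.
    return cupy.get_array_module(x)


def normalize(x):
    """Hypothetical example function: zero mean, unit standard deviation."""
    xp = get_namespace(x)
    return (x - xp.mean(x)) / xp.std(x)
```

Calling `normalize` with a NumPy array keeps the work on the CPU, and calling it with a CuPy array keeps it on the GPU, without the library code naming either backend.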

hmaarrfk commented 6 months ago

The concern is not runtime, but packaging time.

> Could you please share a bit more on why macOS is needed?

I would like to avoid having OSX and CUDA variants of my package.

> Otherwise it sounds like cupy-core maps pretty closely to what you are looking for

Yes, it's close, but it doesn't hit point 1 or point 3.

hmaarrfk commented 6 months ago

> The other consideration would be to use the Array API in the library

Maybe our cupy is a little rusty, but we could not get around calls to array.get() to bring the results of the computation into CPU memory.
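
One way to keep the array.get() call in a single place is a small duck-typed helper. The sketch below is an assumption-laden illustration: `to_host` is a hypothetical name, and it relies only on the fact that CuPy arrays have a `.get()` method that copies to host memory while NumPy arrays do not.

```python
def to_host(x):
    """Return x in CPU memory, copying from the device only if needed."""
    # CuPy arrays expose .get(); NumPy arrays have no such method,
    # so they are returned unchanged.
    get = getattr(x, "get", None)
    return get() if callable(get) else x
```

This is only meant for array inputs; other objects that happen to define `get` (a dict, for instance) would need a stricter check. When CuPy is known to be importable, `cupy.asnumpy(x)` does the same job for both array types.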

jakirkham commented 6 months ago

Could you please explain why cupy-core doesn't meet point 3? It doesn't pull in any CUDA libraries. They are optional and constrained.

leofang commented 6 months ago

Sorry for my brief reply, I was on a ferry and could not expand further...

CuPy does not support macOS because neither CUDA nor ROCm (its two underlying accelerator backends) supports it. Note that CuPy does not even have a CPU backend; NumPy is meant to be used for that purpose. As John noted, CuPy's legacy macOS support had long been broken and I removed it, so you wouldn't even be able to build CuPy on macOS, and I'd suggest not wasting time trying. If you think a skeleton package (which by my definition is importable but not functional) for macOS is absolutely needed, please open an issue in CuPy to discuss. Other than this, I think cupy-core, offered starting with CuPy v13, should meet the rest of your needs exactly.
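
Since cupy-core can be importable on a machine with no usable GPU, a dependent package may want to pick its backend at runtime. The following is a minimal sketch under that assumption; `pick_backend` is a hypothetical helper, not part of CuPy.

```python
import importlib.util


def pick_backend():
    """Return "cupy" if it is importable and CUDA is usable, else "numpy"."""
    # Not installed at all (for example on macOS): fall back immediately.
    if importlib.util.find_spec("cupy") is None:
        return "numpy"
    try:
        import cupy
        # cupy.cuda.is_available() reports whether a CUDA device can be used.
        if cupy.cuda.is_available():
            return "cupy"
    except Exception:
        # Importable package but broken or missing runtime libraries.
        pass
    return "numpy"
```

A caller could then use `importlib.import_module(pick_backend())` to obtain the module to compute with.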

hmaarrfk commented 6 months ago

Software support isn't needed. It's mostly about the ease of optionally depending on it in my environments and recipes.

I'll likely make my own special package outside of conda forge for this and see what learnings I can share.

leofang commented 6 months ago

I wonder if you could simply depend on Apple MLX (to replace the role of CuPy on macOS)? It's available on conda-forge, though I haven't had a chance to try (OS version too old).