rapidsai / build-planning

Tracking for RAPIDS-wide build tasks
https://github.com/rapidsai

Standardize run exports (and make analogous choices for pip packaging) for header-only libraries #92

Open vyasr opened 1 month ago

vyasr commented 1 month ago

We are currently inconsistent with how header-only packages handle exports in conda. Consider:

With the ongoing work to add C++ RAPIDS wheels, we're going to see even more issues like this because wheels have no concept like run exports. If a header-only C++ package is a dependency of another package, any consumer building against the second package needs the first to be available, even though it is not actually a runtime requirement. Concretely, a libcuspatial wheel would require a librmm wheel via libcudf to compile, but at runtime libcuspatial would only require libcudf and not librmm.

As discussed offline, we should remove the run exports in conda and be consistent in both pip and conda packaging by saying that consumers building against a package are responsible for bringing in transitive build-time dependencies that are not runtime dependencies. In particular, this means that transitive header-only dependencies must be explicitly specified by consumers at build time. This approach keeps runtime environments as slim as possible at the expense of making package specs more verbose.
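To make the proposed rule concrete, here is a minimal Python sketch that models the example above and derives what a consumer would have to list at build time versus what ends up in its runtime environment. The `Package` class, the `header_only` flag, and the `REGISTRY` contents are hypothetical names for illustration only, not real conda or pip metadata. The sketch also assumes, as a simplification, that a header-only package is never needed at runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical model of a package; not real conda/pip metadata."""
    name: str
    header_only: bool = False
    # Packages whose headers this package's public headers include.
    build_deps: tuple = ()


def build_requirements(pkg_names, registry):
    """Return the packages a consumer must list to compile against pkg_names.

    This is the transitive closure over build_deps, i.e. what the consumer
    has to spell out explicitly once run exports are removed.
    """
    needed, stack = set(), list(pkg_names)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        stack.extend(registry[name].build_deps)
    return needed


def runtime_requirements(pkg_names, registry):
    """Return the build closure minus header-only packages (simplifying assumption)."""
    return {
        name
        for name in build_requirements(pkg_names, registry)
        if not registry[name].header_only
    }


# Assumed graph from the example: libcuspatial builds against libcudf,
# whose headers include the header-only librmm.
REGISTRY = {
    "librmm": Package("librmm", header_only=True),
    "libcudf": Package("libcudf", build_deps=("librmm",)),
}

print(sorted(build_requirements(["libcudf"], REGISTRY)))    # ['libcudf', 'librmm']
print(sorted(runtime_requirements(["libcudf"], REGISTRY)))  # ['libcudf']
```

Under these assumptions, a libcuspatial build would have to name both libcudf and librmm, while its runtime environment would contain only libcudf.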

vyasr commented 1 day ago

Some concerns raised by @wence- in https://github.com/rapidsai/rmm/pull/1673#issuecomment-2338682968 boil down to this: while header-only libraries are not required at runtime, there needs to be some way of expressing an ABI constraint so that all libraries in an environment that were built against a given header-only library used ABI-compatible versions of it. Concretely, without such a constraint a user could pull a librmm device_buffer out of a libcudf column and pass it to libcuml, only to discover that under the hood libcudf and libcuml were compiled with incompatible versions of librmm. (I'm using the C++ libraries here because in the Python case there is an implicit constraint by virtue of all the Python packages depending on rmm at runtime.)
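The failure mode described here can be sketched as a consistency check over an environment. The Python snippet below is only an illustration: the function name `find_abi_conflicts`, the shape of the input mapping, and the version strings are all hypothetical, and it conservatively assumes that any difference in the version of a header-only library is an ABI break.

```python
from collections import defaultdict


def find_abi_conflicts(built_against):
    """Find header-only dependencies that were compiled in at differing versions.

    built_against maps each installed library to the header-only packages
    (and their versions) it was compiled against. This record is hypothetical;
    nothing in pip or conda exposes it in this form today.
    """
    versions = defaultdict(dict)  # header-only dep -> {version: [libraries]}
    for lib, deps in built_against.items():
        for dep, version in deps.items():
            versions[dep].setdefault(version, []).append(lib)
    # A dependency is in conflict if more than one version was baked in.
    return {dep: by_ver for dep, by_ver in versions.items() if len(by_ver) > 1}


# Hypothetical environment: libcudf and libcuml built against different librmm versions.
mismatched_env = {
    "libcudf": {"librmm": "24.08"},
    "libcuml": {"librmm": "24.10"},
}
print(find_abi_conflicts(mismatched_env))
# {'librmm': {'24.08': ['libcudf'], '24.10': ['libcuml']}}
```

A run export, or any other shared runtime pin, has the effect of letting the solver reject an environment like `mismatched_env` before it is ever installed.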

The ideal solution to this problem would be to define an ABI version metapackage for every header-only package, so that dependents could use the metapackage as a runtime dependency to mutually constrain the ABI without needing to pull the headers into the runtime environment. However, that solution isn't a good fit for us right now for a few reasons:

  1. RAPIDS libraries don't actually promise any ABI stability, so the metapackage's existence would be somewhat misleading.
  2. If RAPIDS libraries ever did want to start promising ABI stability, using a CalVer-based metapackage now would force starting future ABI versioning from some confusingly high number.
  3. The metapackage would add extra packaging overhead with minimal benefits now since our headers are quite small.

The upshot of this is that we may want to keep the run exports for now as an implicit ABI constraint. There is no equivalent solution for pip, but that should be less problematic since pip installation is really only relevant for the Python packages, and those implicitly inherit the same constraint from the rmm Python package as mentioned above.