rapidsai / build-planning

Tracking for RAPIDS-wide build tasks
https://github.com/rapidsai
0 stars 3 forks source link

Support (selectively) preventing CPM downloads in conda builds #91

Open vyasr opened 1 month ago

vyasr commented 1 month ago

This issue is closely related to #54. We typically see clobbering in conda packages because for one reason or another CPM does not find a dependency locally, triggering a download of the dependency that is then installed into the conda environment. We generally want to use CPM because for local builds it should be possible for users to build without these dependencies preinstalled, and there are non-conda builds where we rely on this behavior (e.g. in wheel builds). CPM has an option CPM_USE_LOCAL_PACKAGES that we could set in conda builds to prevent downloads, but we can't use this option as-is because there are certain dependencies that we do want to download (in particular, we always download gtest in test builds so that we can use a specific version of the static library for gtest without introducing any other effects on the environment). Therefore, what we need is a more fine-grained option for controlling downloading on a per-package basis. Ideally, what we would want is to set CPM_USE_LOCAL_PACKAGES but allow downloading on a per-project basis, i.e. an allowlist rather than a blocklist. We could consider implementing this in rapids-cmake via having rapids_cpm_find use find_package rather than CPM(Add|Find)Package, but this approach is suboptimal since rapids-cmake would have to reimplement a lot of CPM logic in order to fully support the same combination of properties. Therefore, we should probably propose a suitable change to CPM. If others agree, I'm happy to open the corresponding upstream issue. If we do decide to do this in rapids-cmake, I can open an issue there instead.

robertmaynard commented 1 month ago

I think what we can propose is an expansion of CPM_DOWNLOAD_<package> that when it is explicitly set to OFF it also disables this conditional https://github.com/cpm-cmake/CPM.cmake/blob/master/cmake/CPM.cmake#L306 so that we don't download a package once CPMFindPackage fails.

This keeps the complexity of the CPM options in check ( e.g. how does it reconcile CPM_DOWNLOAD_<package> and CPM_LOCAL_ONLY_<package>. )

vyasr commented 1 month ago

That is a pretty good option. It has the downside of being a blocklist rather than an allowlist, which is the opposite of what I mentioned wanting in the issue above, but it does provide a pretty unambiguous reconciliation of the different CPM options which is a major point in its favor.

robertmaynard commented 1 month ago

While I agree that allowlist are the preferred approach we are trapped by my previous design choice of CPM_DOWNLOAD_<package>.

The options I see going forward are to deprecate CPM_DOWNLOAD_<package> and replace it with a trinary state variable like CPM_<PACKAGE>_BEHAVIOR which supports values like DOWNLOAD_ONLY, LOCAL_ONLY, PREFER_LOCAL... . Or we update CPM_DOWNLOAD_<package> to be both an allow and block state.

kkraus14 commented 1 month ago

I had previously raised this issue in CPM for specifically this reason: https://github.com/cpm-cmake/CPM.cmake/issues/432

Since I last poked around at this, CMake Dependency Providers (https://cmake.org/cmake/help/latest/command/cmake_language.html#dependency-providers) are now a thing where there might be a better option than CPM to handle this today?

robertmaynard commented 1 month ago

One of the major issues ( or even show stopper ) with dependency providers is outlined in this paragraph

The dependency provider can only be set while processing one of the files specified by the variable. 
Thus, dependency providers can only be set as part of the first call to project() outside of that 
context will result in an error.

So by using a dependency provider approach we won't be able to support users that fetch a rapids-cmake based project via FetchContent / CPM itself...

While I have only spend a couple of hours trying to write a dependency provider for CPM I remember running into problems with recursion ( A->B->C->A ) that https://cmake.org/cmake/help/latest/command/cmake_language.html#id3 didn't seem to help with. But that was like a year ago, and new explorations of the feature would need to happen.

vyasr commented 1 month ago

It seems like we have a few options, but the best long-term options require pretty substantial short-term pain in the form of breaking changes (hopefully with deprecations).