Open jakirkham opened 10 months ago
@jakirkham I'm a bit confused by what specifically "cached" means in your description, but I came here to report an issue that felt like the conda equivalent of #69 because that's how @vyasr described this issue in https://github.com/rapidsai/build-planning/issues/69#issuecomment-2146202627.
In cugraph
24.10, near the end of burndown, we observed a situation where 24.12 packages were getting pulled in on CI jobs targeting branch-24.10
(https://github.com/rapidsai/cugraph/pull/4690).
It seems like the root cause was this pattern, that's used across RAPIDS CI scripts:
# create an environment
rapids-dependency-file-generator \
--output conda \
--file-key test_notebooks \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee env.yaml
rapids-mamba-retry env create --yes -f env.yaml -n test
# download packages built in earlier CI jobs
CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)
# install those packages
rapids-mamba-retry install \
--channel "${CPP_CHANNEL}" \
--channel "${PYTHON_CHANNEL}" \
libcugraph pylibcugraph cugraph
Adding those locally-downloaded files with --channel
doesn't force conda install
to use the packages... it's free to choose other packages to satisfy constraints like "cugraph"
. Including those from other RAPIDS releases!
It would be safer for CI purposes to do something like this for that last install:
RAPIDS_VERSION="$(rapids-version-major-minor)"
rapids-mamba-retry install \
--channel "${CPP_CHANNEL}" \
--channel "${PYTHON_CHANNEL}" \
"libcugraph=${RAPIDS_VERSION}.*" \
"pylibcugraph=${RAPIDS_VERSION}.*" \
"cugraph=${RAPIDS_VERSION}.*"
That would prevent the solver from ever falling back to older releases (e.g. using 24.08 packages on the 24.10 branch) picking up newer releases (e.g. using 24.12 packages on the 24.10 branch).
I think it's worth making changes like that across RAPIDS in 24.12, what do you think?
cc @bdice @msarahan @KyleFromNVIDIA
Meaning the packages from S3
TBH this is something that would be solved by strict channel priority
Got it, ok.
I don't think the particular issue I just noted about would be solved by strict channel priority. It's a case where conda chose 24.12.*
packages from the rapidsai-nightly
channel when we wanted to use 24.10.*
packages from the rapidsai-nightly
channel.
Hmm...am not following
With your use case, it sounds like cuGraph packages should be in ${CPP_CHANNEL}
and ${PYTHON_CHANNEL}
correct?
If we added those channels higher in the channel order than rapidsai-nightly
and enabled strict channel priority, it would only grab the cuGraph packages from ${CPP_CHANNEL}
and ${PYTHON_CHANNEL}
and ignore rapidsai-nightly
. This is what we want, right?
What am I missing?
🙈 you're absolutely right John, the 24.10.*
are coming from those local channels, not rapidsai-nightly
, and strict channel priority would solve this. Sorry, I was not thinking about this the right way. Thanks for correcting me.
We might still want to consider something like the script changes I mentioned above, since they're a relatively small amount of effort compared to enforcing strict channel priority.
All good. Just wanted to make sure I wasn't missing something
Agree that would be a good improvement. We did the same thing in cuCIM: https://github.com/rapidsai/cucim/pull/680
Another improvement that has been helpful has been this suggestion by Charles ( https://github.com/rapidsai/build-planning/issues/22 ). It is good both for overall runtime and reliability
The reason that I likened this issue to #69 is that fundamentally --find-links
is analogous to adding a conda channel because it gives package manager an extra place to pull dependencies from. The main difference between pip and conda in this regard is that pip has no concept of channel priority, so with pip you had to introduce constraints in order to force installation of the desired file. With conda, as John pointed out the channel priority should solve the problem. In the interim, though, the suggestion that you proposed would work and would be the closest direct port possible of the pip constraints approach that you've added.
Yep yep makes sense, thanks for saying you also support the idea. I'm gonna plan on doing this (adding the constraints in scripts) next week, on 24.12. Seems like a few hours of effort, well worth it in my opinion to minimize pain like what we're experiencing in cugraph
for 24.10 and trying to fix in https://github.com/rapidsai/cugraph/pull/4690
I created a sub-issue to track the work of updating those scripts: https://github.com/rapidsai/build-planning/issues/106.
Recently we ran into an issue on a project (cuCIM) where older packages of the project (
libcucim
&cucim
from23.12
) were installed instead of the most recent packages from the PR (24.02.00a*
). This made the installation look successful. However old issues that had been fixed in the development branch (branch-24.02
) were not getting picked upThis was ultimately caused by a solver issue. However we were not able to ascertain that until we pinned the packages installed in the test phase to the latest version. Then the solver issue became clear and we could work to resolve that
Think we should take a closer look at this issue and come up with a way to guarantee the cached packages are picked up as opposed to some other packages. Attempted to do this more directly by using the
<channel>::<package>
syntax, but this didn't work well with file based channels. Maybe there is a better way to do this