lucascolley opened this issue 1 month ago
I am GPU-less at the moment but can test later if needed.
GPU-less too, but I'm fairly sure that this didn't work and things in conda-forge are still broken:
```
% pixi ls -e array-api-cuda --platform linux-64 | rg "cudnn|jax"
Environment: array-api-cuda
cudnn   8.9.7.29  h092f7fd_3                446.6 MiB  conda  cudnn-8.9.7.29-h092f7fd_3.conda
jax     0.4.31    pyhd8ed1ab_0              1.3 MiB    conda  jax-0.4.31-pyhd8ed1ab_0.conda
jaxlib  0.4.31    cuda120py312h4008524_200  89.2 MiB   conda  jaxlib-0.4.31-cuda120py312h4008524_200.conda
```
So let's not unpin unless we can confirm on a machine with CUDA that the JAX and PyTorch tests are actually passing.
@rgommers, there's likely a bug in the solver you're using (it is not supposed to pick up broken packages; it should just fail). The problem is probably triggered by the fact that pytorch is required for the array-api-cuda env but hasn't been rebuilt for the new cudnn yet (https://github.com/conda-forge/pytorch-cpu-feedstock/pull/259).
@wolfv any idea why we're picking up a broken package rather than failing here?
Did you remove the pixi.lock and re-create it from scratch, or just update it without deleting it? If I recall correctly from a discussion with @baszalmstra, he mentioned that the solver tries as much as possible not to modify packages that are already in pixi.lock; perhaps there is some kind of bug that fails to avoid a broken package if it is already in the pixi.lock file.
> Did you remove the pixi.lock and re-create it from scratch, or just update it without deleting it?
I updated it without deleting it.
@lucascolley a broken package in the sense that this package was moved to broken already? I don't think that we would ever be able to resolve one of those ... since they are not part of the repodata anymore.
Yeah. I'll try deleting the lockfile and regenerating it, rather than just `pixi update`, as per the suggestion above.
After deleting and regenerating the lockfile, a CUDA version of jaxlib 0.4.31 is still picked up:
```
jax     0.4.31  pyhd8ed1ab_0              1.3 MiB   conda  jax-0.4.31-pyhd8ed1ab_0.conda
jaxlib  0.4.31  cuda120py312h4008524_200  89.2 MiB  conda  jaxlib-0.4.31-cuda120py312h4008524_200.conda
```
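For reference, one way to check which variant a lockfile resolved to without touching a GPU machine is to parse the package filenames recorded in pixi.lock. A minimal Python sketch — the lock excerpt below is a made-up two-line illustration, not the real file, and the filename parsing is naive:

```python
# Hypothetical excerpt of a pixi.lock; real files are much larger.
lock_excerpt = """\
- conda: https://conda.anaconda.org/conda-forge/linux-64/jaxlib-0.4.31-cuda120py312h4008524_200.conda
- conda: https://conda.anaconda.org/conda-forge/noarch/jax-0.4.31-pyhd8ed1ab_0.conda
"""

def locked_builds(lock_text, package):
    """Return the build strings locked for *package* (naive name-version-build parse)."""
    builds = []
    for line in lock_text.splitlines():
        filename = line.rsplit("/", 1)[-1]  # e.g. jaxlib-0.4.31-cuda120py312h4008524_200.conda
        parts = filename.removesuffix(".conda").rsplit("-", 2)
        if len(parts) == 3 and parts[0] == package:
            builds.append(parts[2])
    return builds

print(locked_builds(lock_excerpt, "jaxlib"))  # ['cuda120py312h4008524_200']
```

A `cuda*` build string here means the CUDA variant was locked, regardless of what hardware the solve ran on.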
Is that supposed to be marked as broken and impossible to pick up, @traversaro?
Okay. I think the "mark as broken" business didn't actually work :(
This PR corrects the labeling fiasco: https://github.com/conda-forge/admin-requests/pull/1065
Can you try again?
`pixi update` now installs jaxlib 0.4.31 CPU, so the broken labelling is fixed :pray: . As @rgommers said above, it looks like we'll need some additional way to force CUDA when version conflicts occur.
@traversaro are you thinking of something like https://github.com/rgommers/pixi-dev-scipystack/pull/5/commits/fb8577b9485c913d4810596bfadb9a843d43d52e to force cuda?
Yes, but I guess you need something similar for each package whose CUDA variant you want to ensure (e.g. pytorch).
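One possible shape for that — a sketch only, not verified against this repo's manifest, and the feature name and build-string globs are assumptions — is to pin the CUDA build variant per package in the environment's dependency table:

```toml
# Hypothetical pixi.toml fragment; check the actual conda-forge build
# strings (e.g. cuda120py312h4008524_200) before relying on these globs.
[feature.array-api-cuda.dependencies]
jaxlib  = { version = "*", build = "cuda12*" }
pytorch = { version = "*", build = "cuda12*" }
```

With a build-string constraint like this, a solve that can only satisfy the spec with a CPU build should fail loudly instead of silently falling back.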
Allegedly we can revert this pin now!
I am GPU-less at the moment but can test later if needed.