Closed: hmaarrfk closed this 10 months ago
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
Thanks Mark!
It appears this is due to a change introduced in PyTorch 2.1.0's own CTK detection logic. Unfortunately, this check is problematic for us.
In the Conda case, we implement a splayed layout. This means build tools (like those listed in requirements/build) live in one path, and libraries that are linked against (like those listed in requirements/host) live in another path. This is common when supporting cross-compilation, as we do with Conda. However, this means there are cases where we do need to use some things from both paths for different reasons.
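A minimal sketch of why a single-root CUDA check breaks under a splayed layout (the prefixes and filenames here are hypothetical, purely for illustration):

```python
from pathlib import Path

# Hypothetical prefixes for illustration only: in a conda splayed layout,
# build tools (requirements/build) and link-time libraries
# (requirements/host) live under different prefixes.
build_prefix = Path("/opt/conda/build-env")
host_prefix = Path("/opt/conda/host-env")

nvcc = build_prefix / "bin" / "nvcc"             # compiler toolchain
nvtx = host_prefix / "lib" / "libnvToolsExt.so"  # link-time library

# A detection scheme that derives the library directory from nvcc's
# location (i.e. assumes <cuda_root>/bin/nvcc and <cuda_root>/lib share
# one root) guesses the wrong directory here:
guessed_lib_dir = nvcc.parent.parent / "lib"
print(guessed_lib_dir)                 # /opt/conda/build-env/lib
print(guessed_lib_dir == nvtx.parent)  # False under a splayed layout
```

Any upstream check that insists the toolchain and the libraries share one CUDA root will reject this arrangement even though the build itself is fine.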
It looks like @ax3l ran into this issue in the HPC SDK use case and proposed a fix ( https://github.com/pytorch/pytorch/pull/108932 ). Not sure whether that will work for us.
In terms of our needs here, maybe just patching out the check altogether would be reasonable to get the build working.
Perhaps this would be a good opportunity to discuss with @peterbell10 whether we can come up with a better check in PyTorch that works for splayed layout use cases.
Ok, let's try again. I took a brute-force approach because we can do a bit of meta building.
It now fails with:
CMake Error at cmake/public/cuda.cmake:64 (message):
Failed to find nvToolsExt
Call Stack (most recent call first):
cmake/Dependencies.cmake:44 (include)
CMakeLists.txt:722 (include)
Thanks Mark!
Think that refers to this check added in the same PyTorch PR.
As noted under the CUDA::nvToolsExt doc in CMake, this is a target deprecated by CMake (and NVIDIA), as it comes from NVTX 2; NVTX 3 has superseded it.
Am guessing PyTorch doesn't use NVTX 2 (as this check was new in that PR), meaning this was purely a build configuration check.
So think we can remove those lines as well.
Can I remove it too?
Ok, well I pushed my changes; feel free to push anything if you can. Going to slee....
Thanks Mark!
Have a good night
It looks like Peter added a fix upstream ( https://github.com/pytorch/pytorch/pull/113174 ). Thanks Peter!
Maybe we can give that a try
It doesn't address the nvToolsExt issue, though.
There are two upstream PRs that solve the nvToolsExt issue; I am not sure which one is preferable. See https://github.com/pytorch/pytorch/issues/101135 and PRs https://github.com/pytorch/pytorch/pull/97582 and https://github.com/pytorch/pytorch/pull/106763
ok builds are incoming:
My test was to use
import torch
a = torch.randn(1024 * 1024 * 1024, device='cuda')
a + 1
and watch the memory on my CUDA device grow using nvtop.
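For reference, that randn call should allocate roughly 4 GiB on the device (2^30 elements at 4 bytes each, torch's default float32), so the growth is easy to spot in nvtop:

```python
# Size of the tensor from the smoke test above:
# torch.randn(1024 * 1024 * 1024, device='cuda') -> float32 by default.
elements = 1024 * 1024 * 1024  # 2**30 elements
bytes_per_element = 4          # float32 is 4 bytes per element
total_bytes = elements * bytes_per_element

print(total_bytes / 2**30)  # 4.0 -> about 4 GiB of device memory
```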
Nice work Mark!
@peterbell10 do the patches here look upstreamable to you? Or are there similar approaches that upstream could take that would alleviate the need for these?
Were Linux ARM packages built as well?
Edit: Nvm, I see them.
@jakirkham the same error happened again.
The CI should show it too:
Checklist
Reset the build number to 0 (if the version changed)
Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)