Open cfhammill opened 3 months ago
Patching cupy to find libcudart_static.a in lib instead of lib64 allows the build to proceed, but it still does not succeed in building. I'm getting g++ compilation errors that I don't have context for.
Downgrading to cupy 12.3.0 builds successfully. 12.3.0 doesn't use the static lib, so the build failure above doesn't occur.
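For anyone hitting the same path problem, the lookup described above can be sketched roughly like this. It is a minimal illustration, not cupy's actual build code: the helper name `find_static_cudart` and the search order are assumptions, and only the file name `libcudart_static.a` and the two directory names come from this thread.

```python
from pathlib import Path
from typing import Optional, Sequence


def find_static_cudart(
    cuda_root: str, libdirs: Sequence[str] = ("lib64", "lib")
) -> Optional[Path]:
    """Return the first libcudart_static.a found under cuda_root, or None.

    Hypothetical helper: it tries lib64 first and falls back to lib,
    which is where the static library sits in the case reported here.
    """
    for libdir in libdirs:
        candidate = Path(cuda_root) / libdir / "libcudart_static.a"
        if candidate.is_file():
            return candidate
    return None
```

Calling it with the CUDA toolkit prefix shows which of the two directories, if either, actually holds the static runtime.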
Hi, could you gist them?
They're pretty much all the same, but with different specific variables not in scope.
cupy_backends/cuda/libs/cutensor.cpp:7219:79: error: ‘cutensorPlan_t’ was not declared in this scope; did you mean ‘cutensorAlgo_t’?
 7219 | __pyx_v_status = cutensorReduce(((cutensorHandle_t)__pyx_v_handle), ((cutensorPlan_t)__pyx_v_plan), ((void *)__pyx_v_alpha), ((void *)__pyx_v_A), ((void *)__pyx_v_beta), ((void *)__pyx_v_C), ((void *)__pyx_v_D), ((void *)__pyx_v_workspace), __pyx_v_workspaceSize, ((cudaStream_t)__pyx_v_stream));
      |                                                                       ^~~~~~~~~~~~~~
      |                                                                       cutensorAlgo_t
cupy_backends/cuda/libs/cutensor.cpp:7219:94: error: expected ‘)’ before ‘__pyx_v_plan’
 7219 | __pyx_v_status = cutensorReduce(((cutensorHandle_t)__pyx_v_handle), ((cutensorPlan_t)__pyx_v_plan), ((void *)__pyx_v_alpha), ((void *)__pyx_v_A), ((void *)__pyx_v_beta), ((void *)__pyx_v_C), ((void *)__pyx_v_D), ((void *)__pyx_v_workspace), __pyx_v_workspaceSize, ((cudaStream_t)__pyx_v_stream));
      |                                                                      ~               ^~~~~~~~~~~~
      |                                                                                      )
cupy_backends/cuda/libs/cutensor.cpp: In function ‘PyObject* __pyx_f_13cupy_backends_4cuda_4libs_8cutensor_destroyOperationDescriptor(intptr_t, int)’:
cupy_backends/cuda/libs/cutensor.cpp:7482:63: error: ‘cutensorOperationDescriptor_t’ was not declared in this scope; did you mean ‘cutensorContractionDescriptor_t’?
 7482 | __pyx_v_status = cutensorDestroyOperationDescriptor(((cutensorOperationDescriptor_t)__pyx_v_desc));
      |                                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                                       cutensorContractionDescriptor_t
cupy_backends/cuda/libs/cutensor.cpp:7482:93: error: expected ‘)’ before ‘__pyx_v_desc’
 7482 | __pyx_v_status = cutensorDestroyOperationDescriptor(((cutensorOperationDescriptor_t)__pyx_v_desc));
      |                                                      ~                              ^~~~~~~~~~~~
      |                                                                                     )
My guess is that the cutensor version isn't high enough.
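One quick way to test that guess is to check whether the installed cuTENSOR headers mention the types the compiler complains about. The sketch below is only an illustration: `undeclared_identifiers` is a hypothetical helper, and the header fragments are made up for the example rather than copied from any real cuTENSOR release.

```python
import re
from typing import List


def undeclared_identifiers(header_text: str, required: List[str]) -> List[str]:
    """Return the names in `required` that never appear as whole words in header_text."""
    return [
        name
        for name in required
        if not re.search(rf"\b{re.escape(name)}\b", header_text)
    ]


# Illustrative header fragments (assumptions, not real cuTENSOR headers).
old_header = "typedef enum { CUTENSOR_ALGO_DEFAULT = -1 } cutensorAlgo_t;"
new_header = old_header + "\ntypedef struct cutensorPlan* cutensorPlan_t;"

# The two type names come from the g++ errors quoted in this thread.
needed = ["cutensorPlan_t", "cutensorOperationDescriptor_t"]
print(undeclared_identifiers(old_header, needed))
print(undeclared_identifiers(new_header, needed))
```

In practice `header_text` would be the concatenated contents of the cuTENSOR include directory used by the build. A non-empty result means the headers predate the API that the generated cutensor.cpp expects.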
@SomeoneSerge updating to cutensor 2.0.2 fixed the build in combination with my cupy patch. Is there an established process for editing the manifest files (https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/cuda-modules/cutensor/manifests/) to add the hashes/sizes for all archs? I hacked in linux-x86_64 by hand to get it to work.
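Until someone points to the established process, the per-architecture hashes and sizes can at least be computed mechanically instead of by hand. This is a rough sketch under stated assumptions: the field names (`relative_path`, `sha256`, `md5`, `size`) and the string-typed size are guesses at the manifest layout and should be checked against the existing files in the manifests directory, and `manifest_entry` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def manifest_entry(archive: Path, relative_path: str) -> dict:
    """Compute hash and size fields for one downloaded archive.

    The key names and the string-typed size are assumptions about the
    manifest format, not something confirmed in this thread.
    """
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    size = 0
    with archive.open("rb") as fh:
        # Read in 1 MiB chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            sha256.update(chunk)
            md5.update(chunk)
            size += len(chunk)
    return {
        "relative_path": relative_path,
        "sha256": sha256.hexdigest(),
        "md5": md5.hexdigest(),
        "size": str(size),
    }
```

Running it once per downloaded archive (one per architecture) gives the values to paste into the manifest, which avoids typos when more than linux-x86_64 needs to be filled in.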
This doesn't seem to be an issue anymore on b4bc024641b3c877bd0ab7b45c34099da8279d53 for python310Packages.cupy, python311Packages.cupy, or python312Packages.cupy.
Steps To Reproduce
Build log
full log: https://gist.github.com/cfhammill/22616c79dfe5a1d19755bf0eb51cfddf
seemingly relevant sections include
and
which is interesting because replacing lib64 with lib in the path above does point to a statically compiled libcudart.
Notify maintainers
@samuela @SomeoneSerge @hyphon81