Closed RaulPPelaez closed 6 months ago
Looks good.
It looks like we are running out of disk space on the GHA worker.
We could borrow this GHA logic to free up some space on the GHA images.
Looks like the trick to save space works!
I cannot convince CMake to find CUDA 12, either on the worker or locally. I have to trick Caffe2 by setting its internal variables. I tried setting CUDA_HOME and all its friends, to no avail.
-- Caffe2: Protobuf version 25.1.0
-- Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) (found version "12.0")
CMake Warning at /home/runner/miniconda3/envs/build/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/home/runner/miniconda3/envs/build/share/cmake/Caffe2/Caffe2Config.cmake:87 (include)
/home/runner/miniconda3/envs/build/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (FIND_PACKAGE)
CMake Error at /home/runner/miniconda3/envs/build/share/cmake/Caffe2/Caffe2Config.cmake:91 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/home/runner/miniconda3/envs/build/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (FIND_PACKAGE)
-- Configuring incomplete, errors occurred!
Error: Process completed with exit code 1.
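For reference, a rough sketch of what "setting the internal variables" could look like. The thread does not say which variables were actually set; the log only reports `CUDA_INCLUDE_DIRS` as missing. `CUDA_TOOLKIT_ROOT_DIR`, `CUDA_INCLUDE_DIRS` and `CUDA_CUDART_LIBRARY` are names used by CMake's legacy FindCUDA module, while `MY_CUDA_ROOT` and the paths below are hypothetical placeholders that would need to match the actual install layout.

```cmake
# Sketch only: pre-seed the variables that the legacy FindCUDA module checks,
# so the find_package(Torch) -> Caffe2 -> CUDA lookup does not come back empty.
# MY_CUDA_ROOT is a hypothetical variable; point it at the real toolkit prefix.
set(MY_CUDA_ROOT "$ENV{CUDA_HOME}" CACHE PATH "Assumed CUDA toolkit prefix")

# The include and library locations are assumptions about the install layout.
set(CUDA_TOOLKIT_ROOT_DIR "${MY_CUDA_ROOT}" CACHE PATH "CUDA toolkit root")
set(CUDA_INCLUDE_DIRS "${MY_CUDA_ROOT}/include" CACHE PATH "CUDA headers")
set(CUDA_CUDART_LIBRARY "${MY_CUDA_ROOT}/lib/libcudart.so" CACHE FILEPATH "CUDA runtime library")

find_package(Torch REQUIRED)
```

The same values could instead be passed on the command line as `-D` options, which keeps the workaround out of CMakeLists.txt.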
OSX is failing with some linker issues in OpenCL:
3/5 Test #3: TestOpenCLTorchForceSingle .......Subprocess aborted***Exception: 2.22 sec
pocl warning: encountered incomplete implementation in /Users/runner/miniforge3/conda-bld/pocl-core_1707453645620/work/lib/CL/clGetDeviceInfo.c:98
WARNING: Using an unsupported OpenCL implementation. Results may be incorrect.
ld: dynamic main executables must link with libSystem.dylib for architecture x86_64
error: linker command failed with exit code 1 (use -v to see invocation)
Final linking of kernel determineNativeAccuracy failed.
Start 4: TestOpenCLTorchForceMixed
4/5 Test #4: TestOpenCLTorchForceMixed ........Subprocess aborted***Exception: 1.68 sec
pocl warning: encountered incomplete implementation in /Users/runner/miniforge3/conda-bld/pocl-core_1707453645620/work/lib/CL/clGetDeviceInfo.c:98
WARNING: Using an unsupported OpenCL implementation. Results may be incorrect.
ld: dynamic main executables must link with libSystem.dylib for architecture x86_64
error: linker command failed with exit code 1 (use -v to see invocation)
Final linking of kernel determineNativeAccuracy failed.
Start 5: TestOpenCLTorchForceDouble
5/5 Test #5: TestOpenCLTorchForceDouble .......Subprocess aborted***Exception: 4.68 sec
pocl warning: encountered incomplete implementation in /Users/runner/miniforge3/conda-bld/pocl-core_1707453645620/work/lib/CL/clGetDeviceInfo.c:98
WARNING: Using an unsupported OpenCL implementation. Results may be incorrect.
ld: dynamic main executables must link with libSystem.dylib for architecture x86_64
error: linker command failed with exit code 1 (use -v to see invocation)
Final linking of kernel clearThreeBuffers failed.
Building against the latest conda-forge PyTorch version (2.1.0) fails because C++17 is required. This PR sets C++17 in CMake for PyTorch versions greater than or equal to 2.1.0. Fixes this conda-forge build: https://github.com/conda-forge/openmm-torch-feedstock/pull/46
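A minimal sketch of the version check described above, not necessarily the exact change in this PR. It assumes `find_package(Torch)` populates `Torch_VERSION` from the package's version file, and it needs CMake 3.7 or newer for `VERSION_GREATER_EQUAL`.

```cmake
# Sketch only: request C++17 when building against PyTorch 2.1.0 or newer.
find_package(Torch REQUIRED)

# Torch_VERSION is assumed to be set by find_package from Torch's version file.
if(Torch_VERSION VERSION_GREATER_EQUAL "2.1.0")
  set(CMAKE_CXX_STANDARD 17)
  set(CMAKE_CXX_STANDARD_REQUIRED ON)
endif()
```

Leaving older PyTorch versions on the project's existing standard keeps builds against them unchanged.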