ComputeCanada / software-stack

Repository to host issues relative to the Compute Canada software stack

OpenCL ICD search path is bad in StdEnv/2023. #139

Closed. twhitehead closed this issue 6 months ago

twhitehead commented 6 months ago

Allocating a GPU and running clinfo reports a GPU under StdEnv/2020. Under StdEnv/2023 it reports no platforms:

[tyson@cdr2604 lammps]$ clinfo
Number of platforms                               0

ICD loader properties
  ICD loader Name                                 Khronos OpenCL ICD Loader
  ICD loader Vendor                               Khronos Group
  ICD loader Version                              3.0.6
  ICD loader Profile                              OpenCL 3.0

Running it under strace shows the ICD loader only looks under /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/etc/OpenCL:

openat(AT_FDCWD, "/cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/etc/OpenCL/vendors", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/etc/OpenCL/layers", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)

If you override the search path by setting OCL_ICD_VENDORS=/etc/OpenCL/vendors, it works again, which shows that the bad search path is the whole problem.

[tyson@cdr2604 lammps]$ OCL_ICD_VENDORS=/etc/OpenCL/vendors clinfo
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 12.2.148
...
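
Until a fix is in place, the override above could be wrapped in a small shell helper for job scripts. This is only a sketch: the function name use_icd_vendors is made up for illustration and is not part of the software stack, and it assumes the host keeps its ICD files in /etc/OpenCL/vendors, as on the node above.

```shell
# Hypothetical helper (not part of the software stack): point the ICD loader
# at a vendors directory, but only if that directory contains .icd files.
use_icd_vendors() {
    # Default to the host directory used in this thread (an assumption
    # about the node's layout); a different directory can be passed as $1.
    icd_dir="${1:-/etc/OpenCL/vendors}"
    for icd_file in "$icd_dir"/*.icd; do
        # If the glob matched nothing it stays literal and -e fails.
        if [ -e "$icd_file" ]; then
            export OCL_ICD_VENDORS="$icd_dir"
            return 0
        fi
    done
    echo "no .icd files found in $icd_dir" >&2
    return 1
}
```

Calling it as `use_icd_vendors && clinfo` should then behave like the one-off override, and it leaves OCL_ICD_VENDORS untouched when the directory has no ICD files.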

bartoldeman commented 6 months ago

This is fixed now. Upstream Gentoo started prefixing the /etc/OpenCL directories (so the loader looks under the Gentoo prefix rather than the host's /etc/OpenCL), but we don't have anything there, so I removed that change for us. Ref: https://github.com/ComputeCanada/gentoo-overlay/commit/2bfabde649d7ff2b8ba8e07cc86925a03b734224

twhitehead commented 6 months ago

Just gave it a go, and it works well. Thanks! :+1: