ProjectPhysX / FluidX3D

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs via OpenCL. Free for non-commercial use.
https://youtube.com/@ProjectPhysX

Less brittle Linux OpenCL setup instructions #187

Closed: jansol closed this issue 1 month ago

jansol commented 1 month ago

Having instructions for installing OpenCL drivers on Linux is nice... but the current ones bypass most of the duct tape that distro maintainers frantically apply in order to keep your system from imploding without warning during a future update. This is not ideal.

Seeing that the instructions appear to target Debian and/or Ubuntu, it is possible to set things up in a much easier and more robust way, without having to add third-party package repos or manually install build tools (other than the ones needed for building FluidX3D itself):

  1. Install the generic ICD loader: apt install ocl-icd-libopencl1 ocl-icd-opencl-dev (this provides a driver-agnostic libOpenCL.so; OpenCL applications should be linked against this rather than directly against a specific driver, as sketched below)
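A minimal sketch of what that looks like in practice, assuming a Debian/Ubuntu system with g++ available (main.cpp and myapp are placeholder names, not from the original instructions):

```bash
# Install the vendor-neutral ICD loader (runtime library + dev symlink).
sudo apt install ocl-icd-libopencl1 ocl-icd-opencl-dev

# Link against the generic loader, not against a vendor's libOpenCL directly.
g++ -O2 main.cpp -o myapp -lOpenCL

# Verify which drivers (ICDs) the loader can see.
sudo apt install clinfo
clinfo -l
```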

For Intel and AMD GPUs it is also possible to use the Mesa OpenCL ICD (apt install mesa-opencl-icd). This does have some caveats that make it difficult to recommend as the default, though. Depending on the distro, the package can contain one of two different implementations: the legacy Clover driver or the newer Rusticl driver.
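For illustration, installing the Mesa ICD and checking which implementation the distro shipped might look like this (a sketch; the package name is the Debian/Ubuntu one):

```bash
# Install the Mesa OpenCL ICD (Clover or Rusticl, depending on how Mesa was built).
sudo apt install mesa-opencl-icd

# The reported platform name reveals which one you got ("Clover" or "rusticl").
clinfo | grep -i "platform name"
```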

Even Debian Stable is up to date enough for this approach.

ProjectPhysX commented 1 month ago

Hi @jansol,

thanks a lot for your help! I have updated the instructions!

Some questions and comments:

Kind regards, Moritz

jansol commented 1 month ago

Yeah, Rusticl would be the most convenient option for recent GPUs, but as I said, it's still difficult to recommend. The main advantage it has over Clover is that it works at all on RDNA and (to some extent) Arc GPUs. Currently it's mostly interesting for low-friction experimenting, since it is the most "native" option for many systems.
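For anyone who does want to experiment with it: on current Mesa builds, Rusticl devices usually have to be opted into per driver via the RUSTICL_ENABLE environment variable (the driver names below are examples; check the Mesa docs for your GPU):

```bash
# Enable Rusticl for AMD's radeonsi driver for this one invocation.
RUSTICL_ENABLE=radeonsi ./FluidX3D

# Several drivers can be listed at once, e.g. Intel's iris plus radeonsi.
RUSTICL_ENABLE=iris,radeonsi clinfo -l
```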

PoCL being slower on Intel CPUs than Intel's driver is well known (although I'm surprised the difference is that drastic). This is improved somewhat by a new TBB-based scheduler that was added recently (pocl/pocl#1387). I'm guessing the Intel runtime also has a compiler that is much more aggressively tuned for their architectures, including per-uarch kernel optimizations. PoCL is somewhat conservative with the prebuilt "builtin" function optimizations in distro builds, only generating them based on vector instruction sets rather than on specific uarchs (for custom builds it does optimize them specifically for the host CPU).
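A custom build along those lines might look like this (a sketch; KERNELLIB_HOST_CPU_VARIANTS is PoCL's documented CMake switch for the builtin kernel library, but check the install docs for your PoCL version):

```bash
# Build PoCL from source so its builtin kernel library targets the host CPU.
# Distro packages are typically built with KERNELLIB_HOST_CPU_VARIANTS=distro,
# i.e. generic variants keyed on vector ISAs rather than on a specific uarch.
git clone https://github.com/pocl/pocl.git && cd pocl
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DKERNELLIB_HOST_CPU_VARIANTS=native ..
make -j"$(nproc)"
sudo make install
```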

CPU detection is done by LLVM, so you'll have to check which LLVM version your PoCL was built against; that LLVM might simply not know about Raptor Lake yet. (On Debian it looks to be still on LLVM 15, which came out before Raptor Lake.) If it should know that uarch, then it sounds like a possible bug in PoCL, LLVM, or both. There is also the possibility that PoCL puts "haswell" in the string because that is the reference uarch whose feature set matches what it detected.
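One quick way to check both things (a sketch; the versioned binary name and the libpocl path are assumptions that vary by distro and release):

```bash
# Which LLVM is the installed PoCL linked against? (path assumes Debian/Ubuntu)
ldd /usr/lib/x86_64-linux-gnu/libpocl.so.2 | grep -i llvm

# Does that LLVM version know the Raptor Lake uarch at all?
clang-15 --print-supported-cpus 2>&1 | grep -i raptorlake
```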

It's also worth noting that, thanks to the ICD loader, it is possible to have multiple drivers installed (even for the same device), since you can use OCL_ICD_VENDORS to choose which one(s) should be visible on a per-application basis if needed.
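For example (a sketch; intel.icd is a typical file name, but check what actually sits in /etc/OpenCL/vendors on your system):

```bash
# See which ICD files the loader would normally pick up.
ls /etc/OpenCL/vendors/

# Expose only one driver to a single application by pointing the loader
# at a directory containing just that driver's .icd file.
mkdir -p ~/icd-intel
cp /etc/OpenCL/vendors/intel.icd ~/icd-intel/
OCL_ICD_VENDORS=~/icd-intel ./FluidX3D
```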

ProjectPhysX commented 1 month ago

The "haswell" naming happens with the latest PoCL 5.0+debian on Ubuntu 24, which should be LLVM 18; see this opencl.gpuinfo.org entry. Closing this issue now, thanks for the help!