fknfilewalker opened this issue 2 years ago
In the case of OpenCL the answer is no, unless you're using some custom compiler driver and frontend that are able to distinguish between the CPU host code and the device code and supply them to different backends (x86 and SPIR-V, for example), which in LLVM are in fact different compiler invocations. That problem is successfully solved in other GPGPU languages like SYCL, and AFAIK MLIR allows you to have both host and device IR in a single file.
How does it work that CUDA can do this? Isn't nvcc also using Clang? If nvcc is just a wrapper around Clang plus the part that compiles CUDA code, shouldn't this also be possible here?
CUDA relies on qualifiers such as `__device__`, `__host__`, and `__global__` to drive what goes on the host or the device. OpenCL doesn't have such things (the spec even has some requirements that would make this difficult).
nvcc is a custom compiler driver which does a source-to-source split into host code (IIRC it outputs a preprocessed C++ file) and device code (which it compiles to PTX/SASS). Clang supports CUDA natively and works by having the frontend filter out what goes on the host or the device depending on the compilation mode.
But the important bit is that CUDA is designed to be single-source, while OpenCL is designed to be kernel-only (and initially to be fully compiled online). Historically SYCL was about having single source with OpenCL (but it evolved to go way beyond that with SYCL 2020).
Wouldn't it be possible to have Clang-specific annotations for GPU-only functions (entry functions) like `[[spirv::kernel]]`, `[[spirv::vert]]`, or `[[spirv::frag]]`, and `[[spirv::global]]` for functions that can be executed from both CPU and GPU code? These two kinds of functions would then be limited to the language features the GPU target can provide. The GPU-only functions would support stage-specific built-ins, which global functions would not. Isn't this doable, since all of the parts are basically within LLVM?
@fknfilewalker That's not only doable, I've done exactly that: you can embed SPIR-V semantics into a C++ frontend. I did this two years ago for OpenGL/Vulkan graphics SPIR-V, and I even chose the same attribute spellings you have there: https://github.com/seanbaxter/shaders#shader-attributes https://github.com/seanbaxter/shaders/blob/master/cube/cube.cxx (in/out/uniform variables too).
I've been looking forward to this project producing a proper SPIR-V compute target so I can embed real OpenCL compute into a C++ application. For SPIR-V graphics I had to write my own SPIR-V target, and it's somewhat buggy (due to the structured control-flow semantics, which are required for graphics but not for compute). Once this LLVM-SPIRV-Backend project lands in LLVM, it will enable what you describe.
@seanbaxter But you still have to customize the Clang driver to invoke Clang twice: first for CPU codegen and for generating the glue code that calls your kernel (typically via OpenCL API functions), and second for the actual SPIR-V (or other device ISA) generation.
@zuban32 True true. Once this LLVM target is ready, I hope it will spur more frontend work to improve the developer experience.
Does this allow having OpenCL code in the same file as the C++ code and compiling it directly into the CPU-side binary?