intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Other
1.22k stars 730 forks source link

Support .sycl file extension #15015

Open dvrogozh opened 1 month ago

dvrogozh commented 1 month ago

Please, help to support dedicated .sycl file extension. This feature is needed to automate limiting dpc++ usage to sycl files in the projects which 1) are using non-dpc++ compiler (gcc, clang) for c/c++ source and 2) not willing to switch to dpc++ for existing c++ code. This will also help projects to stand out sycl kernels for easier search and maintenance.

Pytorch is an example of such project. Specifically, this feature is needed to help support torch.utils.cpp_extension.load() pytorch API (see https://github.com/pytorch/pytorch/issues/132944). This API provides users a capability to build pytorch extensions containing device specific kernel code. Users can combine native c++, sycl code and cuda code. Example:

        module = torch.utils.cpp_extension.load(
            name="my_extension",
            sources=[
                "extension.cpp",
                "cuda_extension.cu",
                "xpu_extension.sycl",
            ]
        )

Workaround: At the moment this can be handled by passing -x c++ file.sycl to dpc++ compiler. However, we would like to make sure that dpc++ compiler has that in plans to standardize upon .sycl extension to avoid divergence in the future. We are ok to consider different sycl-dedicated extensions if .sycl does not seem right for any reason.

These are PRs on pytorch ecosystem side which are getting use of the above WA:

TApplencourt commented 2 weeks ago

dpcpp have also this -fno-sycl-rdc who help compilation time. Do you want a .sycl_no_rdc?

The same problem should appear with OpenMP Target offload (not sure if Pytorch have a OpenMP target offload, but look like they have an OpenMP one), how do you want to resolve it?

It may also cause a problem with function calls. If a sycl kernel in a.sycl calls a function which is in b.cpp, should .b.cpp be called b.sycl? So, .sycl is required for all the transitive dependencies?
But what append is b.cpp, which is also used by openmp code? My openmp compiler will need to understand .sycl?

Also, what will happen to "multiple-language" code, who have both social and OpenMP in the same file? Should they be .cpp+.sycl? I can compile sucl code with -fopenmp -fsycl without any issues currently.

Or for the kokkos portable library, which implements multiple backends. Depending on how the library was built, it may or may not use sycl. Should it have a .potentialy_sycl?

IMO, it's just a "normal" build-system problem: How to pass special flags to specific files. I don't see it as a SYCL problem.

dvrogozh commented 2 weeks ago

What I want is in the issue description - request is to have .sycl extension. No, I do not request any other extensions which you've listed in your comment. Whether it's reasonable and feasible to have dedicated extension is a thing to discuss, that's why this issue is filed. Similar discussion also needs to happen within clang/llvm community since sycl support presents there to certain degree as well with plans to extend. Having dedicated extension for files comes with certain assumptions, same as there are assumptions on the file with .cpp or .h or other extensions. Assumptions might be different, basically that's why extensions are needed - to help structure the sources and better communicate expectations over the code. Note that I do not request to remove -fsycl or any other options you've mentioned. Projects which don't want to use this for better structuring can still do what they were doing passing -fopenmp or -fsycl as they see needed.

TApplencourt commented 2 weeks ago

Similar discussion also needs to happen within clang/llvm community since sycl support presents there to certain degree as well with plans to extend.

I never happened for the OpenMP Target Offload AFAIK, who have exactly the same "problem".

kevin-harms commented 2 weeks ago

There is not basis to have .sycl. SYCL does not propose any language extensions. A sycl code can be consumed by any C++ compiler. In order to properly generate kernels and such, a SYCL aware compiler is needed but that dose not imply the need for a separate file extension. There is no .x86 extension for C++ code on x86 deices. If you want specific compiler flags for a specific file, then you need a mechanism to provide that. I'm not familiar with Torch, but it's possible they have that today to support specific optimization for specific files.

abagusetty commented 2 weeks ago

Given the paradigm of single source basis by SYCL, creating an extension sounds counter intuitive instead. My two cents, the current issue sounds more of a build-system maintenance. Renaming files to .sycl.cpp (from the idea of SYCLomatic file extensions with .dp.cpp) and treating as SYCL device-only sources with relevant SYCL-specific flags, with keeping the other .cpp files as host-only (or non SYCL sources) might suffice.

        module = torch.utils.cpp_extension.load(
            name="my_extension",
            sources=[
                "extension.cpp",
                "cuda_extension.cu",
                "xpu_extension.sycl.cpp",
            ]
        )
dvrogozh commented 2 weeks ago

I do not argue that this can be solved on build system level. Basically that's how it is solved right now in #132945.

However, I still do believe that it is reasonable to provide a way to logically separate sycl related code from everything else on the file system level. There are users, not me alone, who would prefer such approach from logical standpoint. One of the key usages for SYCL is to write kernels for GPU devices. And as such I prefer for such code to stand out and be separately configurable. Note that there are a number of SYCL and device specific flags which you typically need pass to such a code. I personally prefer to pass these flags only to the code which actually needs them. This in any case will be handled on build system level. But having sycl specific file extension helps build system to automatically recognize and configure such files. Same goes to considering SYCL as a language per-se by the way despite the fact it is based on C++. This is convenience. And this convenience does not prohibit to use original SYCL paradigm. It just provides more convenient and intuitive approach for certain domain use cases.

nliber commented 2 weeks ago

I don't understand what is keeping you from using .sycl for your own SYCL code. Could you elaborate? Better yet, get your colleagues who also prefer this to add their comments and concerns to this issue.

dvrogozh commented 2 weeks ago

I don't understand what is keeping you from using .sycl for your own SYCL code. Could you elaborate?

Compiler does not currently recognizes .sycl files as valid input and aborts with error. This can be bypassed by passing -x c++ on command line. My reasoning behind having .sycl file extension supported by default is explained in previous comments.

Better yet, get your colleagues who also prefer this to add their comments and concerns to this issue.

@gujinghui @EikanWang @fengyuan14 @guangyey @jgong5