m4rs-mt / ILGPU

ILGPU JIT Compiler for high-performance .Net GPU programs
http://www.ilgpu.net

Would ILGPU be able to take advantage of .NET 7's upcoming AOT compile feature? #847

Open LouChiSoft opened 2 years ago

LouChiSoft commented 2 years ago

Forgive my ignorance about .NET and MSIL; I am not a .NET developer and I am very unfamiliar with how it functions in the back end. Anyway, .NET 7 is adding the ability to compile projects to a native binary instead of relying on the JIT compiler, and I wondered whether ILGPU will support compiling the application in that mode. As I understand it, AOT mode disables the ability to compile MSIL on the fly, which ILGPU uses. But MS is stating that in the context of CPU code, whereas the MSIL that ILGPU uses would be converted to CUDA or OpenCL code, which does support being compiled and loaded at runtime. Given that, would it be possible to have a mechanism that enables that feature at least for the GPU accelerators, since the CPU version would almost certainly not work?

MoFtZ commented 2 years ago

Hi @LouChiSoft, I'm not very familiar with Native AOT functionality, but at this stage it looks like it would not work with ILGPU.

The first issue I see is related to ILGPU's use of Reflection. In particular, Reflection is used to read the MSIL code at runtime. This article appears to document various modes of Native AOT support for Reflection, but it was not immediately clear how to enable these modes. In theory, it should be possible to pick a mode that is compatible with ILGPU.

The second issue is likely to be the showstopper. The deployment notes, under "Limitations of Native AOT deployment", list "No runtime code generation (for example, System.Reflection.Emit)". ILGPU makes use of runtime code generation for the kernel launcher. I am not sure if this is a permanent or temporary limitation of Native AOT. If permanent, ILGPU would need to be modified to work around the limitation.
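
To make the two concerns above concrete, here is a minimal C# sketch of the kinds of operations involved. It is not ILGPU's code: `Demo.Add` is a hypothetical stand-in for a kernel method, and the dynamic method is a hypothetical stand-in for a generated launcher. Only standard `System.Reflection` and `System.Reflection.Emit` calls are used. On a JIT-based runtime both steps work; under Native AOT the second step is expected to fail because of the quoted limitation, and whether the first step returns anything useful depends on how much Reflection support is compiled in.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Step 1: read a method's MSIL through Reflection.
// This is the general mechanism a runtime IL-to-GPU compiler relies on.
var method = typeof(Demo).GetMethod(nameof(Demo.Add));
var il = method?.GetMethodBody()?.GetILAsByteArray();
Console.WriteLine($"IL bytes read: {il?.Length ?? 0}");

// Step 2: generate code at runtime with System.Reflection.Emit.
// Native AOT lists this as unsupported ("No runtime code generation").
var dynamicAdd = new DynamicMethod(
    "AddDynamic", typeof(int), new[] { typeof(int), typeof(int) });
ILGenerator gen = dynamicAdd.GetILGenerator();
gen.Emit(OpCodes.Ldarg_0);   // push first argument
gen.Emit(OpCodes.Ldarg_1);   // push second argument
gen.Emit(OpCodes.Add);       // add them
gen.Emit(OpCodes.Ret);       // return the sum
var add = (Func<int, int, int>)dynamicAdd.CreateDelegate(typeof(Func<int, int, int>));
Console.WriteLine(add(2, 3)); // prints 5 on a JIT-based runtime

// Hypothetical stand-in for a kernel body; not part of ILGPU.
static class Demo
{
    public static int Add(int a, int b) => a + b;
}
```

Step 1 only needs metadata and IL to be present in the published binary, while step 2 needs a code generator at runtime, which is why the second issue is the harder one.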

LouChiSoft commented 2 years ago

Ah, that is a shame. I didn't realise that the kernel launcher code was also dynamically generated at runtime; I assumed that only the CPU-compatible kernel code would have been generated. For our current project we use some (admittedly not very pretty) compile-time code generation that handles launching our OpenCL kernels.

I ask because we have a project (different from the one I mentioned in my other ticket) that we need to compile to machine code before release, since releasing byte-code or source code is not an option. If ILGPU worked in Native AOT mode, we could develop in an interpreted environment but publish a native machine code binary.

As for whether things like System.Reflection.Emit are a permanent limitation or not, there doesn't seem to be much information about the roadmap for such features, so I wouldn't be able to tell you. But if it is permanent, I would understand if that pushed Native AOT mode outside the scope of the project.

rcollette commented 2 years ago

I'm not all that familiar with the overall architecture of ILGPU, but I wonder if source generators might offer a path that doesn't involve runtime reflection.
https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/source-generators-overview

I know that they generate C# code rather than IL, but I'm wondering if the generated C# code could then create the CUDA/OpenCL code at runtime rather than having to reflect over IL.

LouChiSoft commented 2 years ago

There are always workarounds. I guess the issue here is how much of a workaround it has to be before it's no longer in the scope of the project. I imagine a lot of the project is tied to converting IL to kernels, and while I am sure that some of that work is transferable to other methods of creating kernels, it would still need some work and testing.

Not to talk you out of potentially supporting AOT compilation 😊 if you feel like it can be done

MoFtZ commented 2 years ago

Thinking about this a bit more, if there was a way for the application to detect that it was compiled using Native AOT, it would be reasonably straightforward to modify ILGPU to avoid using runtime code generation.

The next part is the use of Reflection. It sounds like Native AOT would provide a configuration mode to allow the use of Reflection, however, it would prevent @LouChiSoft from using it since it would likely result in the MSIL bytecode being included in the final executable. The solution for that would be source generators, as proposed by @rcollette.
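
On the detection point: .NET exposes `RuntimeFeature.IsDynamicCodeSupported` in `System.Runtime.CompilerServices`, which reports whether the current runtime can generate code at run time. It does not identify Native AOT specifically (other AOT-only platforms also report `false`), but it is the capability that matters here. The sketch below assumes a library could branch on it; `ILauncherFactory`, `EmitLauncherFactory`, and `PrecompiledLauncherFactory` are hypothetical names used for illustration, not ILGPU types.

```csharp
using System;
using System.Runtime.CompilerServices;

// Choose a launcher strategy based on what the runtime supports.
// RuntimeFeature.IsDynamicCodeSupported is a real .NET API; the
// factory types below are hypothetical placeholders.
ILauncherFactory factory;
if (RuntimeFeature.IsDynamicCodeSupported)
{
    // JIT-based runtime: System.Reflection.Emit can be used.
    factory = new EmitLauncherFactory();
}
else
{
    // AOT-only runtime: fall back to code produced at build time.
    factory = new PrecompiledLauncherFactory();
}
Console.WriteLine(factory.Describe());

// Hypothetical abstraction over "how kernel launchers are obtained".
interface ILauncherFactory
{
    string Describe();
}

sealed class EmitLauncherFactory : ILauncherFactory
{
    public string Describe() => "launchers generated at runtime";
}

sealed class PrecompiledLauncherFactory : ILauncherFactory
{
    public string Describe() => "launchers generated at build time";
}
```

A check like this only covers the runtime code generation half of the problem; it does nothing about MSIL being shipped in the executable, which is where source generators come in.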

LouChiSoft commented 1 year ago

Hi, I was just wondering if there has been any more thought about this ticket. AOT has been out for several months now, so I was wondering if there is any more information on whether full AOT compilation would be possible, or whether the GPU code could be compiled to something more obscured than MSIL.

I haven't found a true solution to this issue in almost any language in the time since I originally made the ticket, so I figured I would ask again.

MoFtZ commented 1 year ago

Hi @LouChiSoft. Unfortunately no, I have not spent any time trying to resolve this issue.

The first topic, getting ILGPU to work with Native AOT, could actually be harder than I initially thought. I originally mentioned the use of runtime code generation for the kernel launchers. However, ILGPU also creates runtime types for internal use. We would need to test to see if this is usable with Native AOT. If not, then we would need to use Source Generators (see below).

The second topic, getting ILGPU to obfuscate the GPU/MSIL code, AFAIK this can only be achieved by updating ILGPU to use Source Generators. I'm not sure if this would cause reduced functionality in ILGPU, since some functionality relies on Reflection. Also, there is potentially quite a bit of work to modify ILGPU to work with Source Generators.

@m4rs-mt Did you have any thoughts on this?

LouChiSoft commented 1 year ago

Hi @MoFtZ, thanks for getting back to me. That's totally fine, I just figured I would ask since I know that AOT is potentially a game changer for some C# projects. And even if it was something that could be achieved, I wouldn't expect it to change overnight, but rather to be something I could keep track of for my team.

LouChiSoft commented 1 year ago

Hi again, I was just wondering if there is an opportunity to further this ticket myself? With .NET 8 AOT releasing soon, it has the potential for a significant performance increase over .NET 7, at least based on the MS blog posts. As I mentioned in the first message, I am not a full-time C# dev, but I am willing to learn. If someone could help guide me, I would like to look into how ILGPU creates kernels and try to evaluate what work needs to be done to enable AOT. Thanks

MoFtZ commented 1 year ago

Hi @LouChiSoft. Since our last discussions, there was an interesting new feature that might be helpful. In C# 12, there is an experimental concept called Interceptors.

This would potentially make it possible for a Source Generator to replace the call to ILGPU LoadXxxKernel with a custom implementation, and for this to define all the additional runtime types, generate the GPU code and obfuscate it.

I have not investigated this, beyond just reading about the concept. However, I was working on a more traditional Source Generator in this PR.

LouChiSoft commented 1 year ago

Oh, that is really interesting, I didn't know about interceptors. It's nice to see MS expanding the source generation tools, which, if AOT keeps improving (.NET 8 will be adding a fair few improvements according to the blogs), will certainly come in useful more frequently.

Nice to see that there is a source gen PR. I'll have a look to see what's in it.

dellamonica commented 3 months ago

Could anybody provide some guidance as to how to work with ILGPU using AOT compilation? Is there a manual way to compile the kernels (say, as part of the build process) and load them manually without reflection/IL emit?

Of course, it is much more ergonomic to have source generators do the work, but even before having those, it should be possible to create a PoC project that works in AOT mode. Is AOT support planned for a future release?

BTW, I'm also using cuFFT, and I'm not sure if that API is AOT compatible.