Sergio0694 / ComputeSharp

A .NET library to run C# code in parallel on the GPU through DX12, D2D1, and dynamically generated HLSL compute and pixel shaders, with the goal of making GPU computing easy to use for all .NET developers! 🚀
MIT License

Cover additional scenario to replace C++ AMP #646

Closed electro-logic closed 11 months ago

electro-logic commented 1 year ago

Hello, I have used C++ AMP a lot for scientific computation, and now that it is deprecated I'm looking for an alternative. Like many others, I am using C++ AMP code with C# / WPF / UWP, so ComputeSharp could really help by allowing a single C# code base.

There are few missing features in ComputeSharp to fully replace C++ AMP:

1) CPU/GPU code Sharing. The same C# code should be able to execute on both the GPU and the CPU. For example, if the code uses the float3x3 data type to do some multiplications, it should execute on the GPU or as plain C# code on the CPU for debugging purposes or to reuse the code. In C++ AMP there is a restrict() keyword to achieve this.

2) Improve Shader code compilation. Static readonly fields should be supported. A common scenario is to have some static methods and fields already written and tested in C# and to call them from a Shader.

3) Debugging support for shaders (at least when WARP is used) and a way to check the generated shader code. More documentation is needed on the wiki about debugging.

4) Protect the shader source code. Currently the C# shader source code is compiled into the assembly too. A workaround is to wrap the code inside Execute() in #if DEBUG, but it's not an elegant solution.

5) Provide a way (custom attribute?) to expose multiple functions. The scenario here is to have a class/struct that expose multiple GPU accelerated methods instead of having multiple struct that implement IComputeShader.

Thank you

TechScribe-Deaf commented 1 year ago

On the first item you listed, this would likely fall outside the intended goal of ComputeSharp, since it means making an assumption about which resources an application should use; the onus of implementing this should be on the developer instead. Not to mention the synchronization issues that may arise from trying to run compute code on both the GPU and the CPU. If you're talking purely CPU-only and GPU-only, then sure.

But the developers of ComputeSharp would end up having to write a different generator code for CPU-only.

electro-logic commented 1 year ago

@TechScribe-Deaf yes, the code should be compatible with CPU and GPU, but not running on both at the same time.

Maybe the best approach would be to write code like this:

[Restrict(Cpu | Gpu)]
public static class MyFunctions {
  private static float3x3 M1 = new float3x3( .. );
  public static float3 DoSomething(float3 dataIn) {  return M1 * dataIn; }
  public static float[] DoSomething2(float[] dataIn)  {  ... }
}

and call this code on the CPU in the standard way var result = MyFunctions.DoSomething(new float3(1f,2f,3f));

OR on the GPU in a simple way, for example: var result = await GraphicsDevice.GetDefault().RunAsync(new float3(1f,2f,3f), MyFunctions.DoSomething);

where the attribute [Restrict(Cpu | Gpu)] constrains the code to CPU- and GPU-compatible functions/data types at compile time.

To achieve CPU compatibility we probably only need an actual implementation of the HLSL types (for example, operators for float3, float3x3, etc.).
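
To make the idea concrete, here is a minimal sketch of what a CPU-only implementation could look like. CpuFloat3 and CpuFloat3x3 are hypothetical stand-ins invented for this example, not ComputeSharp types, and the sketch assumes that M1 * dataIn is meant as an ordinary matrix-by-vector product. The matrix values are arbitrary, chosen only so the result is easy to check by hand.

```csharp
// Hypothetical CPU-side stand-ins for the HLSL float3 and float3x3 types.
// These names and operators are illustrative only, not ComputeSharp's API.
public readonly record struct CpuFloat3(float X, float Y, float Z);

public readonly record struct CpuFloat3x3(CpuFloat3 Row1, CpuFloat3 Row2, CpuFloat3 Row3)
{
    // Matrix-by-column-vector product: each output component is the
    // dot product of one matrix row with the input vector.
    public static CpuFloat3 operator *(CpuFloat3x3 m, CpuFloat3 v) =>
        new(Dot(m.Row1, v), Dot(m.Row2, v), Dot(m.Row3, v));

    private static float Dot(CpuFloat3 a, CpuFloat3 b) =>
        a.X * b.X + a.Y * b.Y + a.Z * b.Z;
}

public static class MyFunctions
{
    // Diagonal scaling matrix (assumed values, for illustration only).
    private static readonly CpuFloat3x3 M1 = new(
        new CpuFloat3(2f, 0f, 0f),
        new CpuFloat3(0f, 3f, 0f),
        new CpuFloat3(0f, 0f, 4f));

    // Plain C# code path: runs on the CPU like any other static method.
    public static CpuFloat3 DoSomething(CpuFloat3 dataIn) => M1 * dataIn;
}
```

With these definitions, MyFunctions.DoSomething(new CpuFloat3(1f, 2f, 3f)) evaluates on the CPU to (2, 6, 12). A GPU path would still need the same operators to be translated to HLSL, which this sketch does not attempt.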

Sergio0694 commented 11 months ago

"There are few missing features in ComputeSharp to fully replace C++ AMP"

"Fully replacing C++ AMP" has never been a goal for ComputeSharp in the first place. This library is meant to enable compute workloads on the GPU via DirectX 12, rendering custom content via DX12 swapchains, and implementing custom D2D pixel shader effects, with seamless integration with XAML and Win2D where applicable. These are the core supported scenarios.

"CPU/GPU code Sharing. [...] should execute on the GPU or as plain C# code on the CPU for debugging purposes"

Code sharing is achieved via WARP, which lets you run the shaders on the CPU as well if needed. Running the shaders as C# code is a non-starter. It would be a monumental amount of work to implement, it would not be feasible in scenarios where HLSL does something that's just not expressible in C# (eg. swizzling operations), and for debugging purposes it would also be pretty much useless, as you would not be debugging the same code the GPU would execute anyway.

For debugging, you can use PIX (via ComputeSharp.Pix) and other GPU profilers to help, depending on the scenario.

"Static readonly fields should be supported"

They have been supported since forever:

https://github.com/Sergio0694/ComputeSharp/blob/f6326c82490b9de369359ef2b8140e848fac7137/samples/ComputeSharp.SwapChain.Shaders.Shared/FourColorGradient.cs#L15-L18

"Protect the shader source code"

Obfuscation is a non goal (and also just generally speaking a fool's errand). People determined enough will always be able to reverse your code (eg. they can pull out the shader bytecode and reflect it). But they won't, because nobody cares.

"Provide a way (custom attribute?) to expose multiple functions. The scenario here is to have a class/struct that expose multiple GPU accelerated methods instead of having multiple struct that implement IComputeShader"

This is already available, via [ShaderMethod] (see docs on "shader metaprogramming"). But it's being removed in 3.0, because it was way too much complexity and not really worth it.


Hope this clarifies things. I'm going to close this as not planned 🙂

electro-logic commented 11 months ago

Hello Sergio,

Thank you for the reply.

In C++ AMP, code sharing is possible only when the code is compatible with both the CPU and the GPU. I was thinking of something similar, not of emulating special GPU instructions (e.g. swizzling) on the CPU. The only exception would be the float3x3 data type and similar ones, which would be nice to map to some .NET data type so that some methods can be reused on the CPU. Of course, "promoting" ComputeSharp to a kind of C++ AMP successor would need a team. I'm probably being a visionary.

About the obfuscation: I was not suggesting any obfuscation, but to avoid to include unnecessary C# ComputeShader code in the Assembly. Is the Execute method implementation needed in the final Assembly? After the shader bytecode compilation we can replace the Execute() code with a NOP (to save memory too).

About static fields: I tried to use a static class with static field in a shader in this way (to reuse some static routines)

public static class MyStaticOperations
{
    readonly static float ALPHA = 1.0f;
    public static float GetAlpha()
    {
        return ALPHA;
    }
}

[AutoConstructor]
[EmbeddedBytecode(DispatchAxis.XY)]
readonly partial struct TestStatic : IComputeShader
{
    public readonly IReadWriteNormalizedTexture2D<float4> image;
    public void Execute()
    {
        image[ThreadIds.XY].RGB = new float3(0.5f, 0.5f, 0.5f);
        image[ThreadIds.XY].A = MyStaticOperations.GetAlpha();
    }
}

But I received a compilation error

error CMPS0046: The shader of type TestStatic failed to compile due to an HLSL compiler error (Message: "The DXC compiler encountered one or more errors while trying to compile the shader: [error]: use of undeclared identifier 'MY_CONST' return MY_CONST; . Make sure to only be using supported features by checking the README file in the ComputeSharp repository: https://github.com/Sergio0694/ComputeSharp. If you're sure that your C# shader code is valid, please open an issue an include a working repro and this error message.") (https://github.com/Sergio0694/ComputeSharp)

So I thought static fields were not supported.

Thank you

Sergio0694 commented 11 months ago

"About the obfuscation: I was not suggesting any obfuscation, but to avoid to include unnecessary C# ComputeShader code in the Assembly. Is the Execute method implementation needed in the final Assembly? After the shader bytecode compilation we can replace the Execute() code with a NOP (to save memory too)."

Oh, I see. That is already done automatically by the compiler when you enable trimming. If you build with trimming enabled (or using NativeAOT, which also requires trimming), you'll see all the Execute methods are completely removed, as the linker is able to see that they're never actually executed. So they are effectively free, and just vanish when compiling 🙂

Eg. you can see this for yourself using sizoscope on the ComputeSharp.NativeLibrary sample project in the repo. You can publish with NativeAOT, open the MSTAT file and see that the Execute method is completely trimmed away and just gone.
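
As a rough sketch of the project settings involved (standard .NET SDK properties as far as I know, not copied from the sample project, so treat the exact set as an assumption), a .csproj could opt in like this:

```xml
<!-- Sketch of publish-time properties in a .csproj; adjust to your project. -->
<PropertyGroup>
  <!-- Publish with NativeAOT, which also enables trimming. -->
  <PublishAot>true</PublishAot>
  <!-- Emit the .mstat size report that sizoscope can open. -->
  <IlcGenerateMstatFile>true</IlcGenerateMstatFile>
</PropertyGroup>
```

For trimming without NativeAOT, the usual switch is PublishTrimmed set to true instead of PublishAot. Either way the effect only shows up in the output of dotnet publish, not in a regular build.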

"But I received a compilation error"

I think you probably hit either #298 or #549. I should probably find some time to address those eventually 😄

electro-logic commented 11 months ago

I tried ComputeSharp with WPF, and NativeAOT / trimming are not available there yet. Anyway, good to know that these features fix the issue.

"I think you probably hit either https://github.com/Sergio0694/ComputeSharp/issues/298 or https://github.com/Sergio0694/ComputeSharp/issues/549. I should probably find some time to address those eventually 😄"

This bug is really stopping me (and others I think) from evaluating this library. Would be cool to see this fixed in the next release.