microsoft / DirectX-Graphics-Samples

This repo contains the DirectX Graphics samples that demonstrate how to build graphics intensive applications on Windows.
MIT License
5.89k stars 2k forks

Sample Request: D3D12 Hello Compute #384

Open wallisc opened 6 years ago

wallisc commented 6 years ago

Currently the only existing D3D12 compute sample is the D3D12NBody simulation. For those interested in DX12 for purely compute reasons, particularly with no DX12 background, the sample can be more daunting than it needs to be. A simple compute sample that's similar to D3D12 Hello World would be valuable. I'm thinking something along the lines of:
• Create a Compute PSO
• Dispatch2D that does something drop dead simple to a UAV (draw a circle using threadID + equation of a circle)
• Copy from the UAV to a back buffer
• Present

Having a D3D12 Hello Compute sample has the potential of being simpler than even the Hello World sample since compute requires less setup and is more generic than the 3D pipeline.
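As a rough illustration of the "threadID + equation of a circle" step, here is a CPU-side reference in plain C++. It is only a sketch of the per-thread logic such a shader might run: the function name, the 8-bit single-channel output, and the parameters are assumptions, and nothing here is D3D12 API. Each (x, y) loop iteration stands in for one thread's dispatch thread ID.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU stand-in for the proposed shader. The vector plays the role of the UAV,
// and each (x, y) pair plays the role of one thread's ID.
// Hypothetical helper for illustration only.
std::vector<std::uint8_t> DrawCircleReference(int width, int height,
                                              float centerX, float centerY,
                                              float radius)
{
    assert(width > 0 && height > 0 && radius >= 0.0f);
    std::vector<std::uint8_t> image(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float dx = static_cast<float>(x) - centerX;
            const float dy = static_cast<float>(y) - centerY;
            // Equation of a circle: the pixel is inside when dx^2 + dy^2 <= r^2.
            if (dx * dx + dy * dy <= radius * radius) {
                image[static_cast<std::size_t>(y) * width + x] = 255;
            }
        }
    }
    return image;
}
```

A reference like this could also be used to validate the GPU output after reading the UAV back.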

walbourn commented 6 years ago

Note that we have some DirectX 12 "Introductory Graphics" samples on Xbox-ATG-Graphics as well, including SimpleCompute

krupitskas commented 5 years ago

I think I can do it if it's still relevant and would be merged into the repository. I'm currently looking into compute. The only tutorial I found is http://www.codinglabs.net/tutorial_compute_shaders_filters.aspx. I think, as an example, we can do the same as they did: load an image and desaturate it or draw circles.

krupitskas commented 5 years ago

I can also write a wiki page for that sample afterwards, for example.

stanard commented 5 years ago

How about something even simpler? Read an RGB SRV and write a Luminance UAV, i.e. convert to greyscale. A wiki would be welcome. Thanks for the offer!
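A CPU-side sketch of that greyscale conversion might look like the following. The Rec. 709 luma weights are an assumption (the comment above does not say which weights to use), and the `Rgb` struct and function name are hypothetical; the input vector stands in for the SRV and the returned vector for the UAV.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One RGB texel with channels assumed to be in [0, 1].
struct Rgb { float r, g, b; };

// CPU reference for "read RGB, write luminance".
// Weights are the Rec. 709 coefficients, chosen here as an assumption.
std::vector<float> ToLuminanceReference(const std::vector<Rgb>& rgb)
{
    std::vector<float> luminance(rgb.size());
    for (std::size_t i = 0; i < rgb.size(); ++i) {
        const Rgb& p = rgb[i];
        assert(p.r >= 0.0f && p.g >= 0.0f && p.b >= 0.0f);
        luminance[i] = 0.2126f * p.r + 0.7152f * p.g + 0.0722f * p.b;
    }
    return luminance;
}
```

In a shader, the loop body would be the whole kernel, with the thread ID used as the index.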

zezba9000 commented 4 years ago

This example should really exist. It's much, much easier to test and develop stuff on a Win32 desktop PC.

MathiasMagnus commented 3 years ago

In my read, "Hello, compute!" is SAXPY. No graphics, no window, no textures... just a console: dispatch, compute, fetch, print to console. SYCL-SAXPY, OpenCL-SAXPY, HIP-SAXPY... that's the simplest you can do.
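For reference, SAXPY computes y = a * x + y element-wise. The sketch below is a plain C++ emulation of how a dispatch would cover the array: a rounded-up number of thread groups, one element per thread, and a bounds guard for the last, partially filled group. The function names and the group-size parameter are assumptions for illustration, not part of any API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of thread groups needed to cover elementCount items when each group
// runs threadsPerGroup threads (rounds up without overflowing).
std::uint32_t GroupCount(std::uint32_t elementCount, std::uint32_t threadsPerGroup)
{
    assert(threadsPerGroup > 0);
    return elementCount / threadsPerGroup +
           (elementCount % threadsPerGroup != 0 ? 1u : 0u);
}

// CPU emulation of a 1D dispatch running y[i] = a * x[i] + y[i].
void SaxpyReference(float a, const std::vector<float>& x, std::vector<float>& y,
                    std::uint32_t threadsPerGroup)
{
    assert(x.size() == y.size());
    const std::uint32_t n = static_cast<std::uint32_t>(x.size());
    const std::uint32_t groups = GroupCount(n, threadsPerGroup);

    for (std::uint32_t group = 0; group < groups; ++group) {
        for (std::uint32_t thread = 0; thread < threadsPerGroup; ++thread) {
            const std::uint32_t i = group * threadsPerGroup + thread;
            if (i >= n) continue;  // guard: the last group may be partially filled
            y[i] = a * x[i] + y[i];
        }
    }
}
```

The same guard is what a real kernel needs whenever the element count is not a multiple of the group size.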

walbourn commented 3 years ago

The BasicCompute sample for DirectX 11 in the legacy DirectX SDK was an "A + B" computation. ComputeShaderSort11 was a bitonic sort. They were both console apps.

These samples are available on GitHub without any legacy DXSDK dependencies.

MathiasMagnus commented 3 years ago

@walbourn Thanks for the reply. I'm primarily a GPGPU dev and my graphics skills go as far as scientific visualization goes (simplest of shaders, I don't "live and die by the frame", I leave that sort of enthusiasm for compute). I'm versed in most GPGPU APIs (OpenCL, SYCL, HIP, CUDA, C++AMP) and I can draw simple stuff in OpenGL (using mostly OpenCL-OpenGL interop). Naked DirectX has eluded me thus far but I'd like to learn at least one low-level graphics API. (Vulkan is on the list too)

AFAICT DX11 and DX12 are very much different (much like OpenGL and Vulkan, but slightly less). The setup in DX12 is just as excessive as one might expect. I'm reading the docs, extracting some patterns from the graphics-compute samples of DX12 and I'm almost at the point where I dispatch my shaders. My half-baked sample can be found here. It builds using CMake, there is CLI arg parsing using TCLAP (installable via Vcpkg), but aside from that it should be standalone. I'm not against someone patching it up or giving constructive criticism. ;)

(Writing naked while loops to enumerate available devices is something I refuse to do, so I wrote an input iterator for DXGI adapters. It's not using DXCore, but I should update it.)

zezba9000 commented 3 years ago

@MathiasMagnus The project I started, https://github.com/reignstudios/Orbital-Framework, will allow super easy agnostic / cross-platform setup, and eventually agnostic compute shaders etc. can be written in C#.

One of the reasons for it is to simplify setup & basic 3D API work so you can get to what matters: making games, engines or tools.

It's not ready for use, but it sounds like something people in your boat would find useful in the future. I had a very, very close family member (a parrot) who was expected to live 60 years die recently, and I am finishing a game in their honor before I get back to my GitHub projects. But I figured I'd share, as it's meant to simplify your kind of situation without hindering critical features.

DTL2020 commented 2 years ago

> The BasicCompute sample for DirectX 11 in the legacy DirectX SDK was a "A + B" computation. ComputeShaderSort11 was a Bitonic sort. They were both console apps.
>
> These samples are available on GitHub without any legacy DXSDK dependencies.

The difference between DX12 setup and DX11 is significant. It would be great for a DX12 guru to make a Hello-Compute sample for those totally new to the DX12 environment. The Microsoft documentation shows the Direct3D 12 compute pipeline at https://docs.microsoft.com/en-us/windows/win32/direct3d12/pipelines-and-shaders-with-directx-12 . It is significantly smaller in size and object count compared with the Direct3D 12 graphics pipeline, so I assume we need a more or less simpler setup compared with the 'graphics' Hello-Texture and Hello-Triangle samples?

Currently this sample, https://github.com/microsoft/Xbox-ATG-Samples/blob/master/UWPSamples/IntroGraphics/SimpleComputeUWP12/SimpleComputeUWP12.cpp , and its function void Sample::CreateDeviceDependentResources() look the closest to the required 'Hello-Compute' sample, at least as a starting point. Unfortunately, many DirectX samples use different helper functions and syntax to call the same DX12 API, so a simple copy-paste requires editing.

DTL2020 commented 2 years ago

I have more questions on more complex use of compute shaders:

  1. How can several compute shaders with loaded resources be executed in strict sequential order? The API description at https://docs.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12commandqueue-executecommandlists says "Applications are encouraged to batch together command list executions to reduce fixed costs associated with submitted commands to the GPU." So what is the best way to combine several different compute shaders for sequential execution using the lowest number of lists/queues? Can different compute shaders be recorded in a single command list? Or in a sequence of command lists on a single command queue?

I have a set of resources Res1, Res2, Res3... ResN. Each compute shader performs some work from a list of input resources to a list of output resources (compute data from Res1 to Res2). Next I need to switch the resource state of Res2 from UAV to SRV and use it as the source for the next compute shader (compute data from Res2 to Res3), and so on. Can this be done in a single command sequence sent to the accelerator, without calling CommandList->Close(), CommandQueue->ExecuteCommandLists(1, &computeList), waiting for the first shader to finish -> switching resource state -> resetting the command allocator and list -> recording a new list with the next compute shader -> close -> queue execute -> wait...?

Can a sequence of compute shaders be sent to the accelerator for execution in guaranteed sequential order with one ExecuteCommandLists() call and one wait event? Should that be faster?

sebmerry commented 2 years ago

@DTL2020 yes, this is exactly what ResourceBarrier() is intended to do. So you can call (Dispatch 1, ResourceBarrier, Dispatch 2, Close) and the resource barrier will ensure that Dispatch 1 and Dispatch 2 do not overlap.

DTL2020 commented 2 years ago

> yes, this is exactly what ResourceBarrier() is intended to do. So you can call (Dispatch 1, ResourceBarrier, Dispatch 2, Close) and the resource barrier will ensure that Dispatch 1 and Dispatch 2 do not overlap.

Thank you for the support. But how do I set different compute shaders between the Dispatch and ResourceBarrier calls? By calling SetPipelineState() with different pipeline state objects, each with a different compute shader loaded into it?

So is this the correct way to execute many compute shaders in sequence?
SetPipelineState(PSO1 with compute shader 1);
Dispatch()
ResourceBarrier(Res2 - State UAV to SRV)
SetPipelineState(PSO2 with compute shader 2);
Dispatch()
ResourceBarrier(Res3 - State UAV to SRV)

sebmerry commented 2 years ago

Yes, that is correct
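To make the ordering concrete, here is a toy CPU model of that sequence in plain C++. It is not D3D12 code: `Resource`, `Transition`, `RunPass`, and `RunChain` are hypothetical stand-ins for a buffer, a UAV-to-SRV ResourceBarrier, and a SetPipelineState + Dispatch pair. The asserts only model the rule that a pass reads from a resource in the SRV state and writes to one in the UAV state.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

enum class State { UnorderedAccess, ShaderResource };

// Toy buffer with a tracked state (hypothetical, for illustration only).
struct Resource {
    std::vector<float> data;
    State state;
};

// Stand-in for a transition barrier: checks the "before" state, sets "after".
void Transition(Resource& r, State before, State after)
{
    assert(r.state == before);
    r.state = after;
}

// Stand-in for SetPipelineState(pso) followed by Dispatch():
// reads the input as an SRV and writes the output as a UAV.
void RunPass(const std::function<float(float)>& shader,
             const Resource& input, Resource& output)
{
    assert(input.state == State::ShaderResource);
    assert(output.state == State::UnorderedAccess);
    output.data.resize(input.data.size());
    for (std::size_t i = 0; i < input.data.size(); ++i) {
        output.data[i] = shader(input.data[i]);
    }
}

// The sequence from the thread: pass 1, barrier on Res2, pass 2, barrier on Res3.
void RunChain(const Resource& res1, Resource& res2, Resource& res3)
{
    RunPass([](float v) { return v * 2.0f; }, res1, res2);  // "compute shader 1"
    Transition(res2, State::UnorderedAccess, State::ShaderResource);
    RunPass([](float v) { return v + 1.0f; }, res2, res3);  // "compute shader 2"
    Transition(res3, State::UnorderedAccess, State::ShaderResource);
}
```

In this model, skipping the first `Transition` makes the second `RunPass` fail its precondition, which mirrors why the barrier sits between the two dispatches.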

zarsr commented 1 year ago

Hi. I come from an OpenCL and CUDA background. I'm new to DirectX compute shaders and I started with the DirectX 11 sample https://github.com/walbourn/directx-sdk-samples/tree/main/BasicCompute11

@sebmerry @DTL2020

> Next I need to switch the resource state of Res2 from UAV to SRV and use it as the source for the next compute shader (compute data from Res2 to Res3), and so on

Could you please help me with how to do this in a DirectX 11 compute shader?

I tried a convolution compute shader with one input and one output, so only one input SRV buffer and one output UAV buffer are created. But I want to launch more compute shaders in sequence, and I need some help with how to use the output UAV of the first compute shader as the input SRV of the next shader, and so on.

I also created an SRV buffer for the input arguments of the compute shader (for example, the width and height of the image). Is there a better way to set compute shader arguments (like CUDA kernel arguments) than using an SRV buffer?

XezvickGames commented 1 week ago

Should be good enough? https://github.com/XezvickGames/D3D12ComputeExamples