Hideman85 closed this issue 2 months ago
In the end I found the right way to do it, as follows:
```c++
SyncToken token = {};
BufferLoadDesc desc = {};
desc.mDesc.mDescriptors = DESCRIPTOR_TYPE_RW_BUFFER_RAW;
// Persistently mapped, host-visible and host-coherent so the CPU can access it directly
desc.mDesc.mFlags = BUFFER_CREATION_FLAG_PERSISTENT_MAP_BIT | BUFFER_CREATION_FLAG_HOST_VISIBLE | BUFFER_CREATION_FLAG_HOST_COHERENT;
desc.mDesc.mMemoryUsage = RESOURCE_MEMORY_USAGE_GPU_TO_CPU;
desc.mDesc.mStartState = RESOURCE_STATE_SHADER_RESOURCE;
desc.mDesc.mFormat = TinyImageFormat_R32_SFLOAT;
desc.mDesc.mSize = NB_ELEMENTS * sizeof(float);
desc.mDesc.mElementCount = NB_ELEMENTS;
desc.mDesc.mStructStride = sizeof(float);
desc.mDesc.mNodeIndex = pCompute->mUnlinkedRendererIndex;
desc.ppBuffer = &pComputeBuffer;
addResource(&desc, &token);
waitForToken(&token);

// The buffer stays mapped, so no map/unmap or staging copy is needed
float* data = (float*)pComputeBuffer->pCpuMappedAddress;
```
The rest of the setup (shader, root signature, pipeline) is already in my original post.
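For reference, this is roughly how the buffer gets bound and the shader dispatched. It is only a minimal sketch: pCmd, pComputeQueue and pDescriptorSet are assumed to be created during init in the usual The Forge way, and exact struct members can differ between versions of the framework.

```c++
// Bind the persistently mapped buffer to the "myData" slot of the compute root signature
DescriptorSet*    pDescriptorSet = NULL;
DescriptorSetDesc setDesc = { pRootSignature, DESCRIPTOR_UPDATE_FREQ_NONE, 1 };
addDescriptorSet(pCompute, &setDesc, &pDescriptorSet);

DescriptorData params[1] = {};
params[0].pName = "myData";            // name of the RWBuffer in double.comp.fsl
params[0].ppBuffers = &pComputeBuffer;
updateDescriptorSet(pCompute, 0, pDescriptorSet, 1, params);

// Record and submit one dispatch, then wait so the CPU can read the results
beginCmd(pCmd);
cmdBindPipeline(pCmd, pPipeline);
cmdBindDescriptorSet(pCmd, 0, pDescriptorSet);
cmdDispatch(pCmd, 1, 1, 1);            // group count depends on NB_ELEMENTS and NUM_THREADS
endCmd(pCmd);

QueueSubmitDesc submitDesc = {};
submitDesc.mCmdCount = 1;
submitDesc.ppCmds = &pCmd;
queueSubmit(pComputeQueue, &submitDesc);
waitQueueIdle(pComputeQueue);

// With PERSISTENT_MAP + HOST_COHERENT the doubled values are visible here right away
float* results = (float*)pComputeBuffer->pCpuMappedAddress;
```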
I am in the process of learning The Forge, and I'm trying to get some help with this topic because I'm getting really confused right now.
I would like to run a compute shader on my iGPU and take advantage of the memory it shares with the CPU (read/write without transfers, same memory space).
So right now I'm trying a simple example: a compute shader that doubles each float in my array/buffer.
My shader double.comp.fsl
```hlsl
RES(RWBuffer(float), myData, UPDATE_FREQ_NONE, b0, binding=0);

// Main compute shader function
NUM_THREADS(8, 8, 1)
void CS_MAIN(SV_GroupThreadID(uint3) inGroupId, SV_GroupID(uint3) groupId)
{
    INIT_MAIN;
    myData[inGroupId.x] *= 2.0; // Simple operation: double each float
    RETURN();
}
```

I'm able to find my integrated GPU:
```c++
Renderer *pRenderer = nullptr;
Renderer *pCompute = nullptr;

void MyApp::Init()
{
    RendererContextDesc contextSettings = {};
    RendererContext* pContext = NULL;
    initRendererContext(GetName(), &contextSettings, &pContext);

    RendererDesc settings = {};
    // Need one GPU for rendering and one for compute to simplify
    if (pContext && pContext->mGpuCount >= 2)
    {
        uint32_t queueFamilyCount = 0;
        VkPhysicalDeviceMemoryProperties memProperties;
        auto SHARED_FLAG = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT;
        int bestGpuIndex = -1;
        int bestProfile = -1;
        struct IntegratedComputeGPU { int idx; uint32_t mem; };
        std::vector // ... (the rest of this snippet was cut off by the issue formatting)
```
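The idea of that selection code is simply to look for a GPU that exposes a memory type which is both DEVICE_LOCAL and HOST_VISIBLE. Since the tail of the snippet above was cut off, here is a rough plain-Vulkan sketch of that check (HasUnifiedMemory is just an illustrative helper, not the original code):

```c++
#include <vulkan/vulkan.h>

// Returns true if the physical device has a memory type that is both
// device-local and host-visible, i.e. memory shared between the CPU and an iGPU.
static bool HasUnifiedMemory(VkPhysicalDevice gpu)
{
    VkPhysicalDeviceMemoryProperties memProperties = {};
    vkGetPhysicalDeviceMemoryProperties(gpu, &memProperties);

    const VkMemoryPropertyFlags sharedFlags =
        VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT;

    for (uint32_t i = 0; i < memProperties.memoryTypeCount; ++i)
    {
        if ((memProperties.memoryTypes[i].propertyFlags & sharedFlags) == sharedFlags)
            return true;
    }
    return false;
}
```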
Shader, RootSignature, Pipeline, all good:

```c++
void Compute::AddShaders()
{
    ShaderLoadDesc desc = {};
    desc.mStages[0].pFileName = "double.comp";
    addShader(pCompute, &desc, &pComputeShader);
}

void Compute::RemoveShaders() { removeShader(pCompute, pComputeShader); }

void Compute::AddRootSignatures()
{
    RootSignatureDesc desc = { &pComputeShader, 1 };
    addRootSignature(pCompute, &desc, &pRootSignature);
}

void Compute::RemoveRootSignatures() { removeRootSignature(pCompute, pRootSignature); }

void Compute::AddPipelines()
{
    PipelineDesc pipelineDesc = {};
    pipelineDesc.pName = "ComputePipeline";
    pipelineDesc.mType = PIPELINE_TYPE_COMPUTE;
    ComputePipelineDesc& computePipelineSettings = pipelineDesc.mComputeDesc;
    computePipelineSettings.pShaderProgram = pComputeShader;
    computePipelineSettings.pRootSignature = pRootSignature;
    addPipeline(pCompute, &pipelineDesc, &pPipeline);
}

void Compute::RemovePipelines() { removePipeline(pCompute, pPipeline); }
```

Now the part that I think I'm getting wrong: I'm trying to create a buffer on the GPU from the existing memory 🤔
addBuffer()
```c++
// Taken from The Forge renderer
DECLARE_RENDERER_FUNCTION(void, addBuffer, Renderer* pCompute, const BufferDesc* pDesc, Buffer** pp_buffer)
DECLARE_RENDERER_FUNCTION(void, removeBuffer, Renderer* pCompute, Buffer* pBuffer)

Buffer* buff = nullptr;
std::vector // ... (the rest of this snippet was cut off by the issue formatting)
```

I would kindly appreciate help getting a simple example working 🙏 Thanks in advance 🙏