The render and dispatch stages, as I will call them here, are currently combined into one procedure. I will give a short description of what I mean by each stage.
The render stage figures out what render state needs to be set and what data is needed in order to draw an object.
The dispatch stage is where we actually upload our resources, set render state and dispatch the draw calls.
There is also an earlier stage in Renderer, where objects and control commands are added to a command list to be sorted in the right order relating to viewports and render passes. This stage does not need to change to complete this task.
Here are some notes for the technical implementation.
The render stage would build an internal representation of the actual render commands we are going to issue.
The command representation needs to contain all data needed to submit the commands to the GPU.
The separation of stages also needs to be done for debug drawing and GUI drawing.
Any data that is uploaded to buffers or textures also needs to be copied or moved to the internal representation.
We might want to have some threshold for what size of data is copied directly into the command stream and what is stored separately.
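As a rough illustration of the size threshold idea, here is a minimal sketch of a command list that copies small payloads straight into the command stream and stores large ones separately. All names here (CommandList, UpdateBufferCmd, kInlineThreshold) and the 64-byte cutoff are made up for the example and are not taken from the existing Renderer code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cutoff in bytes; the real value would need measuring.
constexpr std::size_t kInlineThreshold = 64;

// One encoded "update buffer" command. It owns no pointers into caller memory.
struct UpdateBufferCmd {
    std::uint32_t buffer;   // target buffer ID
    std::uint32_t size;     // payload size in bytes
    bool          inlined;  // true: payload is in `stream`, false: in `blobs`
    std::uint32_t location; // byte offset into `stream`, or index into `blobs`
};

struct CommandList {
    std::vector<UpdateBufferCmd>           commands;
    std::vector<std::uint8_t>              stream; // small payloads, packed back to back
    std::vector<std::vector<std::uint8_t>> blobs;  // large payloads, one allocation each

    // Render stage: copy the data so the caller may free it right away.
    void encodeUpdateBuffer(std::uint32_t buffer, const void* data, std::size_t size) {
        const auto* bytes = static_cast<const std::uint8_t*>(data);
        UpdateBufferCmd cmd{buffer, static_cast<std::uint32_t>(size), false, 0};
        if (size <= kInlineThreshold) {
            cmd.inlined  = true;
            cmd.location = static_cast<std::uint32_t>(stream.size());
            stream.insert(stream.end(), bytes, bytes + size);
        } else {
            cmd.location = static_cast<std::uint32_t>(blobs.size());
            blobs.emplace_back(bytes, bytes + size);
        }
        commands.push_back(cmd);
    }

    // Dispatch stage: find the payload that belongs to a command.
    const std::uint8_t* payload(const UpdateBufferCmd& cmd) const {
        assert(cmd.inlined || cmd.location < blobs.size());
        return cmd.inlined ? stream.data() + cmd.location
                           : blobs[cmd.location].data();
    }
};
```

Storing offsets instead of pointers keeps the commands valid when the stream vector grows. Large payloads could also be moved rather than copied if the caller is able to hand over ownership.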
When executing commands, the executor could keep track of the previously bound state so that it does not bind the same state again unnecessarily.
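A minimal sketch of such an executor could look like the following. The DrawCmd layout and the counters are hypothetical; the counters stand in for the real bind and draw calls, which would go through the render device. The sketch assumes that ID 0 is never used for a real program or texture, so it can serve as the "nothing bound" marker.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical encoded draw command.
struct DrawCmd {
    std::uint32_t program;
    std::uint32_t texture;
    std::uint32_t vertexCount;
};

// Counters standing in for the actual GPU calls.
struct ExecutorStats {
    int programBinds = 0;
    int textureBinds = 0;
    int draws        = 0;
};

class Executor {
public:
    void execute(const std::vector<DrawCmd>& cmds) {
        for (const DrawCmd& cmd : cmds) {
            // Assumption of this sketch: 0 means "nothing bound".
            assert(cmd.program != 0 && cmd.texture != 0);
            if (cmd.program != boundProgram_) {
                boundProgram_ = cmd.program;
                ++stats_.programBinds; // the real program bind would go here
            }
            if (cmd.texture != boundTexture_) {
                boundTexture_ = cmd.texture;
                ++stats_.textureBinds; // the real texture bind would go here
            }
            ++stats_.draws; // the real draw call would go here
        }
    }

    const ExecutorStats& stats() const { return stats_; }

private:
    std::uint32_t boundProgram_ = 0;
    std::uint32_t boundTexture_ = 0;
    ExecutorStats stats_;
};
```

The cached state lives in the executor and not in the commands, so the render stage does not need to know what was bound before. If anything else touches the GPU state between two execute calls, the cache would have to be reset.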
Any information coming back from the render command calls is a complication. For example, creating resources (buffers, textures, framebuffers, samplers) returns the IDs of the created resources. These IDs need to be usable immediately after encoding the command. The normal use case is to create a resource and immediately update its properties and data. This means that we need to create some kind of ID immediately when the command is encoded and then map that ID to the actual ID assigned by the driver.
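One possible shape for this mapping is sketched below. ResourceIdTable, VirtualId and DriverId are names invented for the example. The render stage reserves an ID when the create command is encoded, and the dispatch stage fills in the driver's ID once the resource actually exists.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using VirtualId = std::uint32_t; // handed out when the command is encoded
using DriverId  = std::uint32_t; // assigned by the driver at dispatch time

class ResourceIdTable {
public:
    // Render stage: reserve an ID that later commands can refer to right away.
    VirtualId reserve() { return next_++; }

    // Dispatch stage: record the driver's ID after the create command has run.
    void bind(VirtualId id, DriverId driverId) {
        assert(id != 0 && id < next_); // must have come from reserve()
        map_[id] = driverId;
    }

    // Dispatch stage: translate an encoded ID before making the GPU call.
    // Returns false if the resource has not been created yet.
    bool resolve(VirtualId id, DriverId& out) const {
        auto it = map_.find(id);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

private:
    VirtualId next_ = 1; // 0 is kept as an invalid ID in this sketch
    std::unordered_map<VirtualId, DriverId> map_;
};
```

Because commands are executed in the order they were encoded, a create command always runs before the updates that refer to its ID, so resolve should succeed at dispatch time. If the render stage is later run on several threads, reserve would need an atomic counter.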
Another issue is shader compilation and uniform location retrieval. It might be necessary to have a method of running GPU commands synchronously, since these tasks would otherwise be very hard to accomplish. For now, it is possible to just use the existing RenderDevice interface for that. In the future, though, we would need a way to run these synchronous command blocks on the correct thread.
Separating the render and dispatch stages is a prerequisite to moving rendering to another thread. The data can be built on multiple threads if needed, since we are not yet calling any GPU commands. Then we can move the data to a separate render thread where it is a very simple job to submit the actual GPU calls from the command data. Using multiple threads to build or execute the data is not part of this task.