Open msiglreith opened 7 years ago
Excellent summary! I agree with everything except listed below:
> Async queues: Only provide one queue to the user
I'd like to see multiple queues exposed in `render`, eventually. Since we aren't doing this now, it's fine to have this limitation upon the `ll` transition, but we should keep the possible exposure of the queues on our radar. Of course, it would sorta make sense to expose them now, since you are reworking the `render` interface anyway, and we don't know when the next good moment for this will come.
A common technique used to avoid stalling is to have a ring-buffer of command pools per thread.
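For illustration, the ring-buffer idea could be sketched roughly like this (a minimal stand-alone sketch; `CommandPool`, `PoolRing`, and `next_frame` are made-up names, not the actual gfx API — a real pool reset would call into the backend, e.g. `vkResetCommandPool`):

```rust
/// Stand-in for a backend command pool.
struct CommandPool {
    in_flight: usize, // command buffers handed out since the last reset
}

impl CommandPool {
    fn reset(&mut self) {
        // A real backend would reset the underlying pool here.
        self.in_flight = 0;
    }
}

/// One ring of pools per recording thread: with N pools for N frames
/// in flight, the pool being reset is never one the GPU still reads.
struct PoolRing {
    pools: Vec<CommandPool>,
    cursor: usize,
}

impl PoolRing {
    fn new(frames_in_flight: usize) -> Self {
        PoolRing {
            pools: (0..frames_in_flight)
                .map(|_| CommandPool { in_flight: 0 })
                .collect(),
            cursor: 0,
        }
    }

    /// Advance to the next frame's pool and recycle it for reuse.
    fn next_frame(&mut self) -> &mut CommandPool {
        self.cursor = (self.cursor + 1) % self.pools.len();
        let pool = &mut self.pools[self.cursor];
        pool.reset();
        pool
    }
}
```

Each thread owns its ring, so no locking is needed during recording; only the per-frame advance has to be synchronized with frame completion.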
What concerns me here is how sending command buffers between threads will work. Perhaps, we'll send the buffer itself, but the command buffer encoder (which is what needs the command pool) will be non-sendable?
```rust
let queue = match queue {
    ... // depending on the capabilities you need (compute?)
};
```
We may have sugar to avoid this for the general case of graphics-only work. E.g. an `impl GraphicsQueue for Queue` shim, or even `impl DerefMut<Target = GraphicsQueue> for Queue`.
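For illustration, the `Deref` sugar could look roughly like this (a hedged sketch with stand-in `Queue`/`GraphicsQueue` types, not the actual gfx definitions):

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical stand-in for a graphics-capable queue.
struct GraphicsQueue;

impl GraphicsQueue {
    fn draw(&mut self) -> &'static str {
        "draw recorded"
    }
}

/// A general queue that is at least graphics-capable; compute or
/// transfer capabilities would live alongside `graphics`.
struct Queue {
    graphics: GraphicsQueue,
}

impl Deref for Queue {
    type Target = GraphicsQueue;
    fn deref(&self) -> &GraphicsQueue {
        &self.graphics
    }
}

impl DerefMut for Queue {
    fn deref_mut(&mut self) -> &mut GraphicsQueue {
        &mut self.graphics
    }
}
```

With this in place, graphics-only code can call `GraphicsQueue` methods directly on a `Queue` (e.g. `queue.draw()`) thanks to auto-deref, without matching on capabilities first.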
Good to hear! Thanks for the fast feedback!
> What concerns me here is how sending command buffers between threads will work. Perhaps, we'll send the buffer itself, but the command buffer encoder (which is what needs the command pool) will be non-sendable?
If I understand it (Vulkan) correctly, command buffers are not really supposed to be sent across threads, as access to them (i.e. `vkCmd...`) requires external synchronization of the underlying command pool, so it would probably mean a lock per call. That's why I'm also exposing command buffers as non-`Send` in `core`. Only `Submit` (generated after finishing the building of a command buffer) should be (is?) marked as `Send`.
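That split can be illustrated with a stand-alone sketch (all types here are simplified stand-ins, not the real `core` API): a raw-pointer marker makes the encoder automatically `!Send`, while the finished `Submit` is plain data and crosses threads freely.

```rust
use std::marker::PhantomData;

struct CommandPool;

/// Encoder: holds a raw-pointer marker back to its (externally
/// synchronized) pool. Raw pointers are !Send, so the compiler
/// prevents sending the encoder to another thread.
struct CommandBuffer {
    commands: Vec<u8>,
    _pool: PhantomData<*mut CommandPool>,
}

/// Finished recording: plain owned data, therefore Send.
struct Submit {
    commands: Vec<u8>,
}

impl CommandBuffer {
    fn new() -> Self {
        CommandBuffer {
            commands: Vec::new(),
            _pool: PhantomData,
        }
    }

    fn record(&mut self, op: u8) {
        self.commands.push(op);
    }

    /// Detach from the pool; the result no longer needs it.
    fn finish(self) -> Submit {
        Submit { commands: self.commands }
    }
}

/// Compile-time check helper: only accepts Send values.
fn assert_send<T: Send>(_: &T) {}
```

The point of the sketch is that the `Send` boundary falls out of the types: nothing about `Submit` references the pool, so marking it `Send` is sound.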
> We may have sugar to avoid this for a general case of graphics-only work. E.g. `impl GraphicsQueue for Queue` shim, or even `impl DerefMut<Target = GraphicsQueue> for Queue`.
Yep, already implemented! Queues can be downcast in the hierarchy if they support the required functionality.
Regarding the command queues I have to think a bit. My motivation behind this was that people who want to use multiple queues or async compute will probably go straight to the `core` and take care of all the required synchronization themselves.
Cheers!
> Only `Submit` (generated after finishing the building of a command buffer) should be (is?) marked as `Send`
Sounds good!
> people who want to use multiple queues or async compute will probably go straight to the `core` and take care of all the required synchronization themselves
There is a lot of convenience in using `render`, and it would hurt to drop all of it once you want to scale an application to multiple queues. At the very least, `render` should provide the low-level synchronization, even if it's the same as in `core`.
> There is a lot of convenience in using `render`, and it would hurt to drop all of it once you want to scale an application to multiple queues. At the very least, `render` should provide the low-level synchronization, even if it's the same as in `core`.
Fair enough! We can keep it the way it's done in `core` and try something else if it's too complicated.
Another question: how to manage resources?

- option 1: smart-pointer like (old gfx api)
- option 2: a big collection tracks resources and releases them in the correct order; the user accesses resources with keys (e.g. a `u32` ID)
@davll the pre-`ll` code used smart pointers. The problem with `u32` keys is that we'd never know when the user no longer has any :)
@kvark yes, the smart pointer approach is much safer and more friendly than pure integer IDs. However, I'm worried about its reduced flexibility and efficiency for game engine design. The reason is that smart pointers would have to hold `Device` references for their destructors and therefore could not be sent (since `Device` might not be sendable), leading to inflexible usage. IMO users are responsible for managing resources themselves, as in an entity-component system (where a world only contains entities and the user notifies the world to create or destroy entities). In a nutshell, I prefer the data oriented approach :P
To clarify: I'm not questioning how gfx_render should be. I'm trying to find out how to manage resources in my 2D rendering engine. :)
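For illustration, "option 2" from the question above could be sketched like this (all names are made up for the example): a store that owns every resource and hands out plain integer keys, with a free list for reuse. It also demonstrates the hazard of bare IDs: once a slot is reused, a stale key silently aliases the new resource.

```rust
/// Hypothetical key-based resource store (not a real gfx type).
struct Store<T> {
    slots: Vec<Option<T>>, // None = slot currently free
    free: Vec<u32>,        // indices of free slots, ready for reuse
}

impl<T> Store<T> {
    fn new() -> Self {
        Store { slots: Vec::new(), free: Vec::new() }
    }

    /// Insert a resource, reusing a freed slot when possible.
    fn insert(&mut self, value: T) -> u32 {
        match self.free.pop() {
            Some(id) => {
                self.slots[id as usize] = Some(value);
                id
            }
            None => {
                self.slots.push(Some(value));
                (self.slots.len() - 1) as u32
            }
        }
    }

    fn get(&self, id: u32) -> Option<&T> {
        self.slots.get(id as usize)?.as_ref()
    }

    /// Remove a resource; its slot becomes eligible for reuse,
    /// so any copies of `id` the user kept are now stale.
    fn remove(&mut self, id: u32) -> Option<T> {
        let value = self.slots.get_mut(id as usize)?.take();
        if value.is_some() {
            self.free.push(id);
        }
        value
    }
}
```

A common mitigation (not shown, to keep the sketch short) is to pair each index with a generation counter that is bumped on reuse, so stale keys can at least be detected.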
@davll there are multiple ways to get smart resource pointers without holding the device:

- pre-`ll` has: resources are `Arc`s, and the device also holds `Arc`s to them. When the device detects no extra references to the stored pointers, it removes the resources.
- `ll` has WIP: resources store a channel that is used by their destructors. The device receives the data and cleans up.

> where a world only contains entities and the user notifies the world to create or destroy entities

FWIW, Froggy manages components automatically, so smart resource pointers would make total sense there. It's not an ECS though.
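The channel-based approach (resources store a channel that their destructors use) could be sketched like this — hypothetical `Device`/`Texture` names, not the actual WIP code:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical resource handle: no device reference, just a channel.
struct Texture {
    id: u32,
    garbage: Sender<u32>,
}

impl Drop for Texture {
    fn drop(&mut self) {
        // If the device is already gone, cleanup is moot; ignore errors.
        let _ = self.garbage.send(self.id);
    }
}

struct Device {
    garbage: Receiver<u32>,
    handle_tx: Sender<u32>,
}

impl Device {
    fn new() -> Self {
        let (tx, rx) = channel();
        Device { garbage: rx, handle_tx: tx }
    }

    fn create_texture(&self, id: u32) -> Texture {
        Texture { id, garbage: self.handle_tx.clone() }
    }

    /// Called e.g. once per frame: drain the channel and destroy
    /// everything whose handles the user has dropped.
    fn cleanup(&self) -> Vec<u32> {
        self.garbage.try_iter().collect()
    }
}
```

Because the handles only own a `Sender`, they stay `Send` even if the device itself is not, which addresses the sendability concern raised earlier in the thread.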
## Motivation

With the current low-level API (`ll`) rewrite of the `core`, a lot of stuff changes towards a Vulkan-based API. This requires some changes to the layers above the core, in particular `app` and `render`.

- `app` is a wrapper for writing our examples more easily, which takes care of backend-specific device initialization and handling of the main loop.
- `render` provides some additional layers and macros for more safety and convenience. The `encoder` wraps the command buffer and extends it with useful implementations. The `pso` macro is used for easier definition and creation of pipeline state objects.

## Design
The low-level API misses several safety nets compared to the old core (e.g. tracking resources, memory management, ...). `ll` is not supposed to be directly used by most users due to its increased complexity, and it could more easily result in synchronization issues etc. Therefore I would propose to transform `render` into a d3d11-styled layer on top of the `ll` core, which should address the following aspects:

- Resource tracking and synchronization are handled by the `render` layer. Also, it should try to hide image layouts and resource states to avoid pipeline layouts.
- Async queues: only provide one queue to the user, either a `GeneralQueue` (if compute is supported) or a `GraphicsQueue`. This also reduces potential synchronization issues.
- Memory management is handled by the `render` layer.
- Avoiding that the `ll` API interferes with the 'legacy' backends.

Based on the concept above, we need to adjust the device setup a bit, as we only want to expose one queue and probably hide some parts of the memory management (e.g. explicit heaps). Example API:

```rust
let queue = match queue {
    ... // depending on the capabilities you need (compute?)
};
```
Other `render` parts like the PSO macros should stay, and it should be possible to use them even if the user directly targets the `ll` backend. `app` will still remain, but its main tasks will be to create the windows for each platform, handle window events, and advance to the next frame.

## Drawbacks
The design mentioned above limits the possibilities of the users a bit, but otherwise it would be too low-level and require careful handling of synchronization. Some aspects, like memory management, are probably also hard to implement if we want to reach the performance of d3d11 drivers, for example.