KhronosGroup / Vulkan-Hpp

Open-Source Vulkan C++ API
Apache License 2.0
3.14k stars 308 forks source link

DeviceDispatch - improved dynamic calling of Vulkan functions #803

Closed pc-john closed 3 years ago

pc-john commented 4 years ago

Hello,

I was considering to use dynamic loading while using more Vulkan devices. It seemed to me that vk::Device and vk::DispatchLoaderDynamic could be used for that, but it is not elegant as you have to always think about the Dispatch parameter while VULKAN_HPP_DEFAULT_DISPATCHER_ASSIGNMENT is not of a real help when using more vk::Devices.

I was thinking about a new class named DeviceDispatch (or DeviceDispatcher, or DeviceWithDispatch, or any other suggestion). The idea would be as follows:

class DeviceDispatch {
protected:
   vk::Device _device;
public:
   <all vk::Device methods without the Dispatch parameter>
   <all Vulkan function pointers>
};

All the methods would not have Dispatch parameter as the function pointers would be taken directly from the object. When considering further, I asked myself what about vk::Queue, vk::CommandList, etc. They also have methods with Dispatch parameter. Should there be vk::QueueDispatch and vk::CommandListDispatch? Shall they contain all the function pointers or just pointer to DeviceDispatch? At the end, I did not like any of the two. DeviceDispatch is the one that shall be loaded with function pointers, and I am inclined to put all the methods inside DeviceDispatch, including those serving for Vulkan queue and commandList functionality. Anyway, everybody is free to use vk::Queue and pass DeviceDispatch reference as Dispatch parameter. But surely, other ideas might be put forward here.

So, the final idea would be:

class DeviceDispatch {
protected:
   vk::Device _device;
public:
   <all vk::Device methods without the Dispatch parameter>
   <all vk::Queue+vk::CommandList+... methods without the Dispatch parameter>
   <all Vulkan-core function pointers>
   <all Vulkan-KHR function pointers>
   <all vendor function pointers>
};

The function pointers would be sorted, with core and KHR functions first, to minimize cache misses and optimize cache use.

Alternatively, we can consider deriving DeviceDispatch class from vk::DispatchLoaderDynamic. Or, we might only append methods into vk::DispatchLoaderDynamic that would allow calls through the function pointers that are already there.

Would you be open to consider such API extension for vulkan.hpp?

John

asuessenbach commented 4 years ago

I'm thinking about introducing some RAII-compliant handle classes, like vk::InstanceHandle. They would call a creation function in their constructor and a destroy function in their destructor, allowing seamless usage of std::shared_ptr and std::unique_ptr. Other than the current handle classes, like vk::Instance, they would be allowed (and would need) to hold some data besides the underlying C-type (VkInstance). Such a vk::InstanceHandle would get a mandatory Dispatcher in its constructor, and a vk::DeviceHandle would get an optional Dispatcher. All functions called via those classes would not need to get a Dispatcher anymore but would use the one given to the constructor.

pc-john commented 4 years ago

I took a little time to consider it. From the point of data-oriented design, we have two options:

class DeviceHandle {
protected:
   vk::Device m_device;
   T* m_dispatch;  //< could be (1) T as template parameter or (2) new class DeviceDispatch containing all Device functions
public:
   < all the methods sourcing function pointers through m_dispatch >
}

or another solution to slightly improve the performance and avoid the m_dispatch pointer:

class DeviceHandle {
protected:
   vk::Device m_device;
public:
   < all the methods sourcing function pointers from DeviceHandle >
   < all device function pointers >
}

The actual C++ implementation might use templates, classes or inheritance - as we decide. But underlying data structures would probably take one of these two courses, unless you propose still something else.

If considering both options, the function pointers are either in the separate object (more memory fragmentation, cache misses result in pointer-to-pointer cpu pipeline stalls). Or, we make a single object that is more performance optimized but more heavy on copy and move operations. Alternatively, we might take middle road and design some solution, possibly using templates (?), that would allow the programmer to chose between the two, depending on his needs.

Concerning RAII: I started to like vk::Device and vk::UniqueDevice. The first is really lightweight, the second is perfect for the exception safe code. Unfortunately, vk::UniqueDevice and all vk::Unique classes have a problem for me that their type is not the same but varies on dispatch type. So, programmers are often receiving vk::UniqueHandle\<vk::Device,Dispatch> instead of one immutable type. So, it is difficult to write non-templated interfaces as we might not know Dispatch type in advance. However, this can be probably fixed. We store pointer to all dispatch functions inside vk::UniqueDevice while we use just vkDestroyDevice pointer. If, instead of storing pointer to all dispatch functions, we would store either pointer to vkDestroyDevice member inside dispatch table, or directly the value of vkDestroyDevice, we would not need Dispatch template parameter for UniqueDevice and all vk::Unique classes. We would stay just with templated constructors based on Dispatch type. I believe, this could be implemented while keeping sufficient level of vulkan.hpp API compatibility, hopefully.

And why I opened vk::Device and vk::UniqueDevice topic? Because UniqueDevice is already RAII compilant class and I would rather improve it to be seamlessly usable with std::shared_ptr and std::unique_ptr than to introduce a new class (DeviceHandle). If we really come to the point of need for a new class (DeviceHandle), we should probably have clear arguments what we are missing in UniqueDevice and could not implement there. I am just giving my opinion and possibly opening further considerations and discussion on this point without claiming to be right on everything. To be superoptimal on UniqueDevice class, we might consider if vk::AllocationCallbacks pointer must be stored there in cases when the programmer decides to not use it (over 90% of applications does that, probably).

After considering all above, my inclination is to think about the solution that would use vk::Device or vk::UniqueDevice and allow the programmer to chose between them. Then the question comes how to put all the dispatch stuff there. The non-intrusive solution would be to create template DeviceDispatch\<typename BaseClass>, or any other name. The template parameter BaseClass would be vk::Device or vk::UniqueDevice, giving programmer the opportunity to choose what suits him better. DeviceDispatch would contain all the function pointers and all methods without Dispatch parameter. If we would prefer to have function pointers in external table (like DeviceHandle case 1 at the top of this text), we might have something like DeviceDispatchPointer\<typename BaseClass> template, sourcing the function pointers from external table.

Another approach would be to change vk::Device to class template with one parameter. The default parameter value would produce the same functionality as for now. When dispatch table would be placed as the argument, we would either make it the parent class of vk::Device or its member. In both cases, the function pointer would be sourced from it. Surely, I am open to think about other approaches.

The same can be implemented for vk::Instance and vk::UniqueInstance.

Let me know where you would agree and where you would see the things differently or propose other approaches. John

cdgiessen commented 3 years ago

For reference, I have been working on my own C++ vulkan bindings and have chosen an approach that is very much like the one outlined here.

https://github.com/cdgiessen/vk-module/blob/4869795b56abace0d2384585ec0cdf0740a7662c/cpp20/vk_module.h#L9899

One thing I currently don't do is handle any sort of ownership, either with device or the function pointers. This could be changed in the future, so the discussion here is helpful.

There are physical device, queue, and command buffer function pointer structs, but they only contain the handle and the pointer to the device/instance dispatch object, so its more for convenience than performance.

The thing I like about my approach (and the approach discussed here) is that it is I make it the 'default' way to use the API, so developers are using the 'fast-path' by default and can easily add multi-gpu. From my brief and not comprehensive survey of the ecosystem, other bindings do not cater towards making multi-gpu a first class citizen, which is entirely understandable since it is the very uncommon in practice.

pc-john commented 3 years ago

Hm, interesting. I gave the whole thing much more thoughts since my last post. As Vulkan.hpp should (ideally) provide robust solution for everybody (?), I gathered possible use cases:

  1. static binding - using vk::DispatchLoaderStatic - works perfectly already! All vk::Device, vk::CommandBuffer,... just works!
  2. dynamic binding - using vk::DispatchLoaderDynamic - works well for single device use; the only hassle is that you have to define VULKAN_HPP_DISPATCH_LOADER_DYNAMIC 1 while some people are not in love with project-wide defines and defines in general
  3. multi-device dynamic binding while function pointer table is stored in the same object as vk::Device handle - not supported directly. Essentially, it would be the object like vk::Device, but with the function pointer table included and with all the Vulkan device methods also included, but without dispatch parameter as we are already dispatching through the function pointer table.
  4. multi-device dynamic binding while function pointer table is stored in extra object (apart from vk::Device), vk::Device, vk::CommandBuffer holds just the pointer to the extra object that contains the function pointer table - not supported directly
  5. -8. the same as previous points but for unique* objects that own the handle and destroy it in their destructor It seems to me that it might not be the best idea to develop the direct solution to all the cases and to find eight class names, one for each case. I would rather prefer number of templates while each person would be able to compose easily its own solution according to his wishes. Surely, we can provide types for the most used cases and for backward compatibility. I gave this approach a little of coding time and this would be my first sketch: There will be DeviceFunctionPointers templated class with all device-level function pointers.
    template<typename ParentClass=NoParent>
    class DeviceFunctionPointers : public ParentClass {
    public:
    PFN_doSomething vkDoSomething;
    void init()  { vkDoSomething = doJob; }
    };

    I hope the code is self explaining. NoParent will be empty struct:

    struct NoParent {};

    The second thing we need is all device level methods:

    template<typename ParentClass=NoParent,typename PointerTableGetter=ParentClass>
    class DeviceFunctions : public ParentClass {
    public:
    void doSomething() { PointerTableGetter::get(this)->vkDoSomething(); }
    };

    I put just doSomething() test method, but all device-level methods would go there. The difficulty here is just how to get function pointers out of DeviceFunctionPointers template class. We can solve it by the second template parameter PointerTableGetter that will handle it for us. The simplest PointerTableGetter might be implemented like this:

    struct LocalTableGetter {
    template<typename T> static T* get(T* pointer)  { return pointer; }
    };

    This way, it will return just the local object. The object is expected to contain all the function pointers. This allow us to define the type of one compact Device object with all the methods and function pointers:

    typedef DeviceFunctions<DeviceFunctionPointers<vk::Device>,LocalTableGetter> DeviceWithEverything;

    We will need to develop a good name for it. I just put DeviceWithEverything for now. And testing code:

    // test vk::Device with device methods and with embedded function pointers
    DeviceWithEverything deviceWithEverything;
    deviceWithEverything.init();
    deviceWithEverything.doSomething();

    We just solved use case number 3 given above. How to solve the case 4? In case 4, we want one function table that is shared by vk::Device, vk::CommandBuffer, etc. We need function table pointer:

    template<typename TableType,typename ParentClass=NoParent>
    class FunctionTablePointer : public ParentClass {
    public:
    TableType* funcTablePtr;
    };

    We also need different getter. LocalTableGetter returned this object. Now, we need the content of funcTablePtr (this->funcTablePtr):

    template<typename TableType>
    struct ExternalTableGetter {
    template<typename T>
    static TableType* get(T* pointer)  { return pointer->funcTablePtr; }
    };

    And we are ready to define our type for the device:

    typedef DeviceFunctions<
    FunctionTablePointer<DeviceFunctionPointers<>,vk::Device>,
    ExternalTableGetter<DeviceFunctionPointers<>>
    > ThinDevice;

    Our ThinDevice contains only vk::Device member, funcTablePtr member and all the device methods that are dispatched through external function pointer table. We can create simple testing code:

    DeviceFunctionPointers<> deviceFunctionPointers;
    deviceFunctionPointers.init();
    ThinDevice thinDevice;
    thinDevice.funcTablePtr=&deviceFunctionPointers;
    thinDevice.doSomething();
    thinDevice.waitIdle();

    ThinDevice can be renamed to something more appropriate. Now, we solved use case 4. For the case 1, we just define proper getter:

    template<typename TableType>
    struct StaticDispatchGetter {
    template<typename T> static TableType* get(T*)  { return vk::DispatchLoaderStatic(); }
    };

    Creating the device with static linking can use the type like this one:

    typedef DeviceFunctions<
        vk::Device,
        StaticDispatchGetter<vk::DispatchLoaderStatic>
    > StaticDevice;
    StaticDevice staticDevice;
    staticDevice.waitIdle();

    So, this is the case 1. The case 2 is essentially the same but uses vk::DispatchLoaderDynamic. What remains are just cases with Unique* objects that own the handle and destroy it in destructor (cases 5-8.). I was not quick enough to code the solution for them this weekend so it needs to wait a little (possibly to one of coming weekends). Anyway, I guess that some templates like presented above might help us to develop a more general solution and give users of vulkan.hpp further flexibility to develop things that exactly suit their needs. Currently, vulkan.hpp provides just pre-made classes. To get something not supported in the current vulkan.hpp design means to redevelop your generator that parses vk.xml and generate whatever is required. It means a huge piece of work. These template classes would already provide prefabricated building blocks that the user would just connect in his own way as he wants. Do you think it is a good direction for consideration or would you prefer different direction?

asuessenbach commented 3 years ago

Please have a look at #886. I think, that PR resolves this issue.

pc-john commented 3 years ago

Wow. Quite a big piece of work with many samples! I took a time to go through. It seems that it solves case 3 in the discussion above (and probably also 1 and 2). It does not provide suggested templated approach with its flexibility, but provides probably more readable and nice code.

I am appending my "impressions" as I read through the code. My impressions do not mean that something is wrong, rather that I was considering why things are this way.

I like that you reused short names like Instance, Device, etc., just by putting them to another namespace. This has a downside of having two namespaces. Typing or reading "vk::raii::" is not exciting.... To get rid of it, I was considering of "using namespace vk::raii;" but this does not provide Vulkan structures from vk namespace. And going with both namespaces "using namespace vk; using namespace vk::raii;" spoils the day by name collisions between vk::Device and vk::raii::Device. Maybe, if we put vk:: structures into vk::raii namespace, the problem might be solved. I mean putting them by using, e.g. namespace vk { namespace raii { using vk::CreateDeviceInfo; }}. Thoughts?

Context, Instance and Device carries dispatcher table. Device is ok, but I asked myself, why Context and Instance should mandatorily carry the whole very large table and, even worse, initialize all its members (about 300?) by calling vkGetInstanceProcAddr(). Most applications will use only five (or so) instance functions before initializing device. And from that moment use mostly only device function pointers. If vulkan.hpp boasts with minimal overhead, this is a place where this is put a little in doubt. When considering this point, maybe, Context shall contain only functions that you can call without instance. There are about 5 of them, or only two (?). I was also considering that Context represents Vulkan loader library (vulkan-1.dll, libvulkan.so,..). I was wondering whether vk::raii::Library, or LoaderLibrary, or Loader would not better match with what it represents than Context. Context seems to me rather general name for an entity with unclear meaning in Vulkan. Any name associated with loader library seems more clear to me. Just a thought.

I see that I must make VULKAN_HPP_RAII_DISPATCHER_TYPE project wide define to get dynamic loading of function pointers. Alternatively, I have to make my own header that will make this define before including vulkan_raii.hpp. I am far from love with any of these solutions. My way to solve the problem would be to have two files - vulkan_raii.hpp and vulkan_raii_dl.hpp. The former would do static loading, the later dynamic loading. Surely, I am not the one to decide. All the decisions are on you.

I very appreciate using of std::pair instead of vk::ResultValue as std::tie works better with std::pair. I understand that nice naming of members "result" and "value" is now replaced by first and second, but we cannot have everything probably...

The last point is about the code - whatever class I studied, I looked on the class name, then scrolled down through all member functions without any interest in them at all to reach data members. When I saw data members, it was all clear what the class is all about and what all that 50 member methods are doing. I just asked myself why I have to scroll down each time. Why all the data members are not up just bellow the class name. For these reasons, I always put data members first in each class definition. Surely, my style and my way. Just for consideration. Looking on the data members, I would prefer to use protected instead of private for the sake of better extensibility on user side.

Edit: One more thought concerning vk::raii:: - I do not like so much namespace text. What about to have just one vk namespace and make distinction by included file? It means that vulkan.hpp and vulkan_raii.hpp would both carry all necessary c++ vulkan stuff and they could not be used simultaneously. But the advantage would be to have single and nice namespace vk while a user chooses whether he uses old or new api. Just thought that I would like the most, probably. It would be a kind of a new solution that breaks backward compatibility, but it is a different include file so it is not a problem.

asuessenbach commented 3 years ago

Typing or reading "vk::raii::" is not exciting....

Well, yes... you might want to alias that vk::raii, like so: namespace vkx = vk::raii; And I think, it's reasonable to keep all that stuff in a different namespace, and not create an alias of everything in vk::raii.

Context, Instance and Device carries dispatcher table.

Yep, that is one point on my list I wanted to work on next. No need to hold the complete set of function pointers in either Context or Instance. But that could be changed completely internal with no changes to user-code and thus can be adjusted later on.

Context seems to me rather general name for an entity with unclear meaning in Vulkan.

That's the week point in my change: Context. I had to invent a class name that's not in vulkan, but orchestrates vulkan functions even before any Instance is instanciated. Given that it might be reduced to just those few functions later on, Library doesn't sound appropriate either. That's why I chose that rather general name.

I see that I must make VULKAN_HPP_RAII_DISPATCHER_TYPE project wide define to get dynamic loading of function pointers.

Actually, if you don't explicitly define VULKAN_HPP_RAII_DISPATCHER_TYPE, it's default is VULKAN_HPP_DEFAULT_DISPATCHER_TYPE, whose default in turn is ::VULKAN_HPP_NAMESPACE::DispatchLoaderDynamic. That is, if you don't care, you get the dynamic dispatcher.

For these reasons, I always put data members first in each class definition.

Well, I think, that's more of a personal preference. I prefer to have the public stuff first, then the protected, and finally the private. And due to encapsulation, the data members usually are private in my classes, and probably should be kept private. I think, most C++-editors allow you to directly jump to the end of a scope, so that you don't have to scroll down manually...

What about to have just one vk namespace and make distinction by included file? ...

Interesting idea, indeed! I will think about it. But in any case, the new stuff would need to be in a different namespace, just to ease communication about the classes. After all, what e.g. vk::Device are you talking about, if there would be two completely different ones?

pc-john commented 3 years ago

What about to have just one vk namespace and make distinction by included file? ...

Interesting idea, indeed! I will think about it. But in any case, the new stuff would need to be in a different namespace, just to ease communication about the classes. After all, what e.g. vk::Device are you talking about, if there would be two completely different ones?

My idea was to make, for instance, vk::std and vk::raii namespaces. And depending on which file the user include, either vk::std or vk::raii will be mapped into vk namespace. Just to complement what I already said.

if you don't explicitly define VULKAN_HPP_RAII_DISPATCHER_TYPE, it's default is VULKAN_HPP_DEFAULT_DISPATCHER_TYPE, whose default in turn is ::VULKAN_HPP_NAMESPACE::DispatchLoaderDynamic. That is, if you don't care, you get the dynamic dispatcher.

I see the condition #if defined(VK_NO_PROTOTYPES) before #define VULKAN_HPP_DISPATCH_LOADER_DYNAMIC 1 which in turns allows what you said. But I do not see VK_NO_PROTOTYPES defined anywhere... Did I missed it?

Context seems to me rather general name for an entity with unclear meaning in Vulkan.

That's the week point in my change: Context. I had to invent a class name that's not in vulkan, but orchestrates vulkan functions even before any Instance is instanciated. Given that it might be reduced to just those few functions later on, Library doesn't sound appropriate either. That's why I chose that rather general name.

I mentioned Library and Loader names because Context currently carries vk::DynamicLoader, e.g. handle to the loader library. Also it manages loader's library/instance version through vkEnumerateInstanceVersion, also enumeration of layers and extensions supported by the loader's library/instance. And it shall provide loader's library api to get all Vulkan function pointers. So, I could not resist to suggest these names. Anyway, just my thought. You might have bigger scope for considerations.

asuessenbach commented 3 years ago

But I do not see VK_NO_PROTOTYPES defined anywhere...

VK_NO_PROTOTYPES should be defined to remove, well, the prototypes of the vulkan functions from vulkan_core.h, in order to dynamically load them. That is, that's not part of vulkan.hpp, but of vulkan.h.