webgpu-native / webgpu-headers

https://webgpu-native.github.io/webgpu-headers/
BSD 3-Clause "New" or "Revised" License
380 stars 43 forks source link

Default descriptor initialization (and discussion of C vs C++ callbacks) #158

Open mrshannon opened 2 years ago

mrshannon commented 2 years ago

Many of the descriptors in WebGPU have default values, and more importantly default values which are not 0 and so a simple memset or zero initialization is not appropriate.

An example of what does not work

WGPUTextureDataLayout TextureDataLayout = {0};
TextureDataLayout.bytesPerRow = 512;

In this case rowsPerImage = 0 instead of WGPU_COPY_STRIDE_UNDEFINED.

Therefore, I propose adding default initialization defines to webgpu.h. While it is possible for each user to create their own, there are some benefits to implementing this in webgpu.h:

One way to implement this is to add defines for each descriptor:

#define WGPUTextureDataLayout_DEFAULT { NULL, 0, WGPU_COPY_STRIDE_UNDEFINED, WGPU_COPY_STRIDE_UNDEFINED }

This would allow users to write:

WGPUTextureDataLayout TextureDataLayout = WGPUTextureDataLayout_DEFAULT;
TextureDataLayout.bytesPerRow = 512;
Kangz commented 2 years ago

This would have helped us a lot in Chromium at various time when removing / deprecating members :) I could change the generator to produce these defines at the bottom of the webgpu.h (guarded with a #ifndef WGPU_SKIP_STRUCT_DEFAULT_MACROS or something).

Other folks WDYT? In particular @cwfitzgerald @kvark @jimb?

Also @mrshannon, you could use the C++ headers generated by Dawn, they have default initialization for the various structs.

grovesNL commented 2 years ago

This sounds reasonable to me. We use defaults for a lot of types in wgpu and find it really useful, e.g. PrimitiveState https://github.com/gfx-rs/wgpu/blob/77b2c25684f971daa977596403f01c71d32ea685/wgpu/examples/msaa-line/main.rs#L65-L69

mrshannon commented 2 years ago

Also @mrshannon, you could use the C++ headers generated by Dawn, they have default initialization for the various structs.

Sort of off topic, but we were initially using the C++ headers but dropped them for the C headers as we kept having to cast everywhere because the callbacks always use the C types and not the C++ types. But that came with it's own issues (WGPUBufferUsage vs WGPUBufferUsageFlags), maybe it's time to consider switching back.

Kangz commented 2 years ago

Any idea on how to improve WGPUBufferUsage vs. WGPUBufferUsageFlags?

For the C++ header typedefs I'm not sure how to fix it because technically the signatures of the functions are different so if we want UBSan to be happy we would need to make an indirection (and allocation).

mrshannon commented 2 years ago

Any idea on how to improve WGPUBufferUsage vs. WGPUBufferUsageFlags?

You technically don't need WGPUBufferUsageFlags as WGPUBufferUsage will always be wide enough for the bitwise combination because we have WGPUBufferUsage_Force32 = 0x7FFFFFFF which is effectively doing what is done for the C++ headers with:

enum class BufferUsage : uint32_t {

For the C++ header typedefs I'm not sure how to fix it because technically the signatures of the functions are different

Correct, I don't think there is either, not without allocating a std::function on the heap and destroying it later (not a good idea). But what could be done is to make it clear (without looking at implementation files) that the C and C++ enums are safe to cast by providing a casting function. In our engine we do all kinds of seemingly "unsafe" casting but it is behind a type-trait like function which always performs the "safe" cast.

I am also not sure it makes sense to have:

using BufferMapCallback = WGPUBufferMapCallback;

Because without that it's at least clear the callbacks will always be using the C API.

But again, this is way off topic (with relation to the initial issue), so perhaps we should continue this somewhere else?

kainino0x commented 2 years ago

You technically don't need WGPUBufferUsageFlags as WGPUBufferUsage will always be wide enough for the bitwise combination because we have WGPUBufferUsage_Force32 = 0x7FFFFFFF which is effectively doing what is done for the C++ headers with:

EDIT: Actually let me split that out into a separate issue. #159

kainino0x commented 2 years ago

I split the flags discussion out to #159, since it's mostly off-topic.

Kangz commented 2 years ago

@mrshannon we could have a wgpu::FromC(...) and wgpu::ToC(...) serie of functions that convert between the C and C++ datatypes. WDYT?

mrshannon commented 2 years ago

@mrshannon we could have a wgpu::FromC(...) and wgpu::ToC(...) serie of functions that convert between the C and C++ datatypes. WDYT?

I think that's a good idea. Makes it clear without looking at the implementation that those casts are safe.

kainino0x commented 2 years ago

For the C++ header typedefs I'm not sure how to fix it because technically the signatures of the functions are different so if we want UBSan to be happy we would need to make an indirection (and allocation).

This is kind of a dumb idea, but what if the C API had two userdata pointers, and the C++ wrapper used one for the C++ callback? I can imagine this being useful for other language bindings, as well. Would look vaguely like:

namespace wgpu {
    typedef void (*RequestDeviceCallback)(RequestDeviceStatus status, Device device,
            char const * message, void * userdata);

    void RequestDeviceWrapper(WGPURequestDeviceStatus status, WGPUDevice device,
            char const * message, void * userdata, void * binding_userdata) {
        RequestDeviceCallback callback = static_cast<RequestDeviceCallback>(binding_userdata);
        callback(static_cast<RequestDeviceStatus>(status), Device{device}, message, userdata);
    }
    void Adapter::RequestDevice(DeviceDescriptor const * descriptor, RequestDeviceCallback callback,
            void * userdata) const {
        wgpuAdapterRequestDevice(Get(), reinterpret_cast<WGPUDeviceDescriptor const * >(descriptor),
                RequestDeviceWrapper, reinterpret_cast<void * >(userdata), callback);
    }
}
mrshannon commented 2 years ago

Here is an example of WebGPU INIT/NULL defines we use for our project, pulled from our WebGPU utility library. It's not quite the same as the defaults in the C++ API provided by Dawn/Emscripten as it's defaults are based on the JS API and includes NULL initialization (which is required for some of the C structs). But it shows some the reasons why upstreaming this is preferred, differences between implementations which we had to paper over with #ifdef.

kainino0x commented 2 years ago

I'm ever so slightly sad that our default structs aren't just all zeroes. I'm just spitballing - apparently this is the thread where I come up with silly C API ideas - but how awful would it be if instead of sentinel 0xFFFF_FFFF values we had a separate field to specify whether you've set the thing or not? (Basically a "type" field for whether it's undefined or int.)

This would also mean implementations must implement the same defaults that are specified in WebIDL, i.e. turning _Undefined values into their defaults, but I kinda think that is a good idea anyway.

mrshannon commented 2 years ago

apparently this is the thread where I come up with silly C API ideas

I am honored to have started it.

separate field to specify whether you've set the thing or not?

One option for this is for optional numbers the type could be changed to something like:

strict OptionalUint32 {
    bool set;
    uint32_t value;
}

This way you could do:

WGPUTextureViewDescriptor desc = {0};
desc.baseArrayLayer = {true, 4};
desc.arrayLayerCount = {true, 1};

It's interesting, and much more self documenting. But it does add quite a bit of complexity when setting optional numbers.

I think the bigger issue is with the nullable structs which are not NULL at the top but are NULL because something inside them is _Undefined. To me that is the bigger issue as it makes it very hard to understand what is going on. I think I would rather see something similar to blend in WGPUColorTargetState where the sub structure is given by pointer, as that is far more self documenting.

kainino0x commented 2 years ago

Something just occurred to me, which is that if we wanted zero initialization to give the default value then we'd need this {set, value} pattern not just for all of the number-or-undefined members, but also for all of the non-enum-type members that have non-zero default values in WebIDL (listed below). That would be kind of painful. The INIT structs are likely a better idea.

beaufortfrancois commented 1 year ago

This is causing some confusion/issues amongst developers. Shall we resume investigation work?

kainino0x commented 1 year ago

I assume you're asking about the C vs C++ callbacks specifically. I'm still working on a rework of the async APIs but the callback signature should be an orthogonal decision. I'll add it to the agenda but we won't meet until 2 weeks from now, so if folks on this thread + @rajveermalviya + @cwfitzgerald can chime in maybe we can agree on either:

kainino0x commented 1 year ago

I'm in favor of the second userdata pointer, described in https://github.com/webgpu-native/webgpu-headers/issues/158#issuecomment-1149227420

kainino0x commented 1 year ago

The malloc/free (new/delete) version would look like this:

namespace wgpu {
namespace {

struct RequestDeviceParameters {
    RequestDeviceCallback callback;
    void * userdata;
};

void RequestDeviceWrapper(WGPURequestDeviceStatus status, WGPUDevice device,
        char const * message, void * userdata) {
    RequestDeviceParameters* p = static_cast<RequestDeviceParameters*>(userdata);
    p->callback(static_cast<RequestDeviceStatus>(status), Device{device}, message, p->userdata);
    delete p;
}

}  // namespace

void Adapter::RequestDevice(DeviceDescriptor const * descriptor, RequestDeviceCallback callback,
        void * userdata) const {
    RequestDeviceParameters* p = new RequestDeviceParameters{callback, userdata};
    wgpuAdapterRequestDevice(Get(), reinterpret_cast<WGPUDeviceDescriptor const * >(descriptor),
            RequestDeviceWrapper, p);
}

}  // namespace wgpu
rajveermalviya commented 1 year ago

Other language bindings would definitely benefit from second userdata pointer, so +1 for that.

kainino0x commented 1 year ago

today's meeting: Dawn to give this some more thought, in particular if we have an API with that accepts a C++ lambda (as a std::function) instead of a userdata pointer, the second userdata most likely doesn't even help. We should prototype that if it's something we want.

kainino0x commented 1 year ago

but no one was particularly opposed to adding the second userdata so if we think it's useful we should just add it.

In particular I'm thinking the C++ API might want to keep both the lambda and raw-function interfaces, since raw-function will be faster.

kainino0x commented 1 year ago

We may also be able to make the raw-function version nicer to use by templating the userdata so it doesn't have to always be void*.

beaufortfrancois commented 1 year ago

It's great to see some activity in this space!

FYI Here's the Dawn bug.

kainino0x commented 1 year ago

I've been sketching out code in this gist: https://gist.github.com/kainino0x/4f04609d91111b360cb4be93822b2f68

and have a new idea:

    typedef void(*WGPUCallbackV)(WGPUSomething, size_t userdataSize, void* userdataPtr);
    void wgpuCallCallbackV(WGPUSomething, WGPUCallbackV, size_t userdataSize, void* userdataPtr);

The "V" in this prototype is for "variable sized userdata". Instead of a single pointer, we take a whole array of raw data of variable size and we'll pass it back to you. This way the allocation is handled by the webgpu.h implementation and it may be able to do it more intelligently than just new/delete.

kainino0x commented 1 year ago

@kangz on the dawn issue:

What about alignment constraints of the copied userdata in this case? It seems a bit overkill to have a variable sized userdata when using the API in C but it could be useful for other bindings and is a more general solution. So IDK :/

Probably we can just provide a large alignment guarantee. alignof(std::max_align_t) (or naive malloc) seems to be the way.

kainino0x commented 1 year ago

meeting: discussed this and it would be best to provide support for larger alignments than max_align_t. It's possible for the C++ bindings to detect an underaligned allocation and reallocate with logic like:

if constexpr (alignof(T) > alignof(max_align_t)) {
    if (reinterpret_cast<uintptr_t>(userdataPtr) % alignof(T) != 0) {
        reallocate
    }
}

And it's also possible to static_assert(alignof(T) <= alignof(max_align_t)). But better just make the C API do it for us. I've updated the gist.

Still, didn't come to a conclusion yet on whether this is better than just making the bindings responsible for the allocation.

kainino0x commented 1 year ago

@cwfitzgerald also raised a question about whether it's even valid to be memcpying a lambda around. I tried to research this and found that it's probably fine if it's is_trivially_copyable, but otherwise I can't find an answer. We'd need to move the object into C's raw memory and then move it back out later, which I don't know how to do in C++. And on top of that I'm not sure whether it's valid to memcpy it around so its address changes between the move-in and move-out.

It's complicated enough to know whether this is even possible that we're leaning toward just abandoning the idea for now.

The double userdata still has some value, for no-allocation entrypoints (see CallCallback2 and CallCallback2_T in the gist).

Kangz commented 1 year ago

It seems it could be possible with placement new and friends. Maybe @amaiorano would know? Here we're trying to memcpy a non-trivially copyable lambda to an allocation managed by the wgpu runtime. But maybe it could be moved?

kainino0x commented 1 year ago

It would only be possible for us to move it in C++ if the C API gave us the space to write into. If we just pass a size and pointer to the C++ object, then C is the one actually moving it. So the question becomes whether you can move the data for a non-trivially-copyable object around without calling its move constructor. It seems likely no.

kainino0x commented 1 year ago

(EDIT: migrated to #201)

kainino0x commented 1 year ago

Moved the callback thing to a new issue #201 so I can label it correctly. Better late than never :)

kainino0x commented 1 year ago

To finally return to the original topic...

webgpu.h meeting: Yes, we should add these macros. They seem better than any other options (like trying to define a static, or calling a function like POSIX).

We would like them to be well-typed to provide decent error messages to users. @rajveermalviya suggested the macro could be like (WGPUAdapterProperties) { 0 }. However I tested this and it turns out not to be allowed in standard C++ 🙁:

<source>:5:13: warning: compound literals are a C99-specific feature [-Wc99-extensions]
    5 |     A a2 = ((A) { 4 }); // Only allowed in non-standard C++

WGPUAdapterProperties { 0 } is invalid in C and only becomes valid in C++ with C++11.

So I think to do this we'd need something like this (full sample):

#if defined(__cplusplus)
#  if __cplusplus >= 201103L
#    define WGPU_STRUCT_INIT_PREFIX(T) T
#  else
#    define WGPU_STRUCT_INIT_PREFIX(T)
#  endif
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define WGPU_STRUCT_INIT_PREFIX(T) (T)
#else
#  define WGPU_STRUCT_INIT_PREFIX(T)
#endif

typedef struct A { int x; } A;
#define A_INIT WGPU_INIT_PREFIX(A) { /* .x = */ 0 }
kainino0x commented 12 months ago

webgpu.h meeting: we'll do the macro thing, but tweak it so we can wrap parentheses around the typed versions (Just In Case):

#if defined(__cplusplus)
#  if __cplusplus >= 201103L
#    define WGPU_STRUCT_INIT_INCANTATION(type, value) (type value)
#  else
#    define WGPU_STRUCT_INIT_INCANTATION(type, value) value
#  endif
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define WGPU_STRUCT_INIT_INCANTATION(type, value) ((type) value)
#else
#  define WGPU_STRUCT_INIT_INCANTATION(type, value) value
#endif

typedef struct A { int x; } A;
#define A_INIT WGPU_STRUCT_INIT_INCANTATION(A, { /* .x = */ 0 })

Strawperson names. Full sample

kainino0x commented 10 months ago

Dec 21 meeting