libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License

Feature Request: Provide machine readable API definitions with SDL3 #6337

Open ikskuh opened 1 year ago

ikskuh commented 1 year ago

Heya!

I’m the author of SDL.zig, an attempt to create a Zig binding for SDL2.

Auto-translating the headers does not convey enough information about the expected types, so a lot of APIs are hand-adjusted to actually fit the intent of the SDL API. One example: SDL_Color* colors has to be translated to colors: [*]SDL_Color (pointer to many), and not colors: *SDL_Color (pointer to one).
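To make the pointer-to-one versus pointer-to-many distinction concrete, here is a minimal Python sketch of how a generator could pick the Zig type once that information is recorded somewhere. The `Param` class and its `many` and `optional` fields are hypothetical annotations invented for illustration; nothing like them exists in the SDL headers today.

```python
from dataclasses import dataclass


@dataclass
class Param:
    name: str
    c_type: str             # pointee type as written in C, e.g. "SDL_Color"
    many: bool              # hypothetical annotation: pointer addresses an array
    optional: bool = False  # hypothetical annotation: NULL is allowed


def zig_type(p: Param) -> str:
    """Render the Zig pointer type implied by the annotations."""
    ptr = "[*]" if p.many else "*"
    prefix = "?" if p.optional else ""
    return f"{prefix}{ptr}{p.c_type}"


colors = Param("colors", "SDL_Color", many=True)
print(f"{colors.name}: {zig_type(colors)}")  # colors: [*]SDL_Color
```

The C declaration alone cannot supply the `many` flag, which is why the bindings are currently adjusted by hand.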

Now that SDL3 development is beginning: is the SDL project open to providing a machine-readable abstract definition of the SDL APIs that allows precise generation of C headers, Zig bindings and possibly bindings for other languages (C#, Rust, Nim, …), so there’s only one authoritative source for the APIs that conveys enough information to satisfy all target languages?

Regards

PS.: I'm willing to spend time and effort on this, and am also happy to write both the generator and the definitions.

madebr commented 5 months ago

My simple python generator is currently very dumb, and requires you to use ctypes to arrange the arguments.

Python code to call SDL_GetDisplays without any wrapping

```python
Uint32 = ctypes.c_uint32
SDL_DisplayID = Uint32

# Get a list of currently connected displays.
SDL_GetDisplays = SDL_LIBRARY.SDL_GetDisplays
SDL_GetDisplays.restype = ctypes.POINTER(SDL_DisplayID)
SDL_GetDisplays.argtypes = [ctypes.POINTER(ctypes.c_int32)]

display_count = ctypes.c_int32()
displays = SDL_GetDisplays(ctypes.pointer(display_count))
for i in range(display_count.value):
    print(f"display [{i: 2d}] {displays[i]} name=\"{SDL_GetDisplayName(displays[i]).decode()}\"")
SDL_free(displays)
```

Are you asking for something similar to Microsoft's SAL? Adding these to the SDL headers would make the headers safer to use, but less readable. SDL_GetDisplays would become:

extern DECLSPEC _Ret_writes_z_(*count) SDL_DisplayID *SDLCALL SDL_GetDisplays(_Out_opt_ int *count);

ikskuh commented 5 months ago

How would you handle something like https://wiki.libsdl.org/SDL3/SDL_GetDisplays? It returns a buffer and a count of elements. I would expect all high quality binding libraries to have friendly overloads of that function.

My solution with apigen would express that like this:

/// Get a list of currently connected displays.
/// Returns a 0 terminated array of display instance IDs which should be freed
/// with `SDL_free`, or `null` on error; call `SDL_GetError` for more details.
fn SDL_GetDisplays(
    /// a pointer filled in with the number of displays returned
    count: *c_int,
) ?[*:null]SDL_DisplayID;

This models both that the result might be null and that it is a pointer to a null-terminated sequence of mutable SDL_DisplayID values, while count is a non-optional pointer to a C ABI integer value.

I can't model allocation information yet, but it might be possible with something like ?[*:null] dtor(SDL_free) SDL_DisplayID

I would expect all high quality binding libraries to have friendly overloads of that function.

The type information can then be used to return something like an object with RAII for freeing, length and indexer in C++ for example.
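As a rough illustration of what such a friendly overload could look like in a dynamic language, here is a Python sketch of a helper that copies the returned elements into a list and always releases the buffer. `take_array` is a hypothetical name, and the buffer and `free` callback below are stand-ins so the snippet runs without SDL; with the ctypes setup from the earlier snippet one would pass the result of `SDL_GetDisplays` and `SDL_free` instead.

```python
import ctypes


def take_array(ptr, count, free):
    """Copy `count` elements out of a C array, then release it with `free`."""
    if not ptr:  # a NULL ctypes pointer is falsy
        raise RuntimeError("call returned NULL")
    try:
        return [ptr[i] for i in range(count)]
    finally:
        free(ptr)  # runs even if copying fails


# Stand-in for a buffer returned by the C library (assumption for the demo).
buf = (ctypes.c_uint32 * 3)(7, 8, 9)
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_uint32))

freed = []  # records calls instead of really freeing memory
print(take_array(ptr, 3, freed.append))  # [7, 8, 9]
```

A generator can only emit this kind of wrapper automatically if the definition states which function frees the result and where the length comes from.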

Lucretia commented 5 months ago

Honestly, some sort of JSON/TOML format would be best, where types can be specified with ranges (for strongly typed languages):

interface = "SDL"

[[types]]
  # <typename> = C type
  init_flags = "int"
    bitset = true    # Values are used as bitsets.
    [[values]]
      INIT_TIMER = 0x0000_0001   # Name created by <interface>_<value>
      # ...

[[functions]]
  name = "Init"    # Name created by <interface>_<name>
  return = "int"
  [[parameters]]
    # <name> = <type>
    flags = "init_flags"

Something like this, with one file per interface. Using a machine-readable syntax means it can be generated and read easily by any language. In Ada, I separate SDL.Video and other subparts such as Surfaces and Textures into their own packages. The definitions could be split the same way, and you can still combine them when generating your modules if you want, i.e.:

interface = "SDL"
subinterface = "Video"

Lucretia commented 5 months ago

init_flags = "int"

But this should not just dump the pointer types, as they can be complex, and the last thing we want is to force people to parse this stuff, e.g. int* const *.

Susko3 commented 5 months ago

My simple python generator is currently very dumb, and requires you to use ctypes to arrange the arguments. Python code to call SDL_GetDisplays without any wrapping

Are you asking for something similar to Microsoft's SAL? Adding these to the SDL headers would make the headers safer to use, but less readable. SDL_GetDisplays would become:

extern DECLSPEC _Ret_writes_z_(*count) SDL_DisplayID *SDLCALL SDL_GetDisplays(_Out_opt_ int *count);

It seems SDL is already using bits of SAL. Most of it is limited to SDL_stdinc.h, but SDL_PRINTF_FORMAT_STRING (defined as _Printf_format_string_ on MSVC) is also used in other headers.

These annotations get dumped by gendynapi.py, but they end up kinda weird:

    {
        "comment": "",
        "header": "SDL_stdinc.h",
        "name": "SDL_wcslcpy",
        "parameter": [
            "SDL_OUT_Z_CAP (SDL_OUT_Z_CAP(maxlen) wchar_t *dst *REWRITE_NAME)wchar_t *dst",
            "const wchar_t *REWRITE_NAME",
            "size_t REWRITE_NAME"
        ],
        "parameter_name": [
            "param_name_not_specified",
            "src",
            "maxlen"
        ],
        "retval": "size_t"
    },

Lucretia commented 5 months ago

@Susko3 That looks worse than C to parse.

1bsyl commented 5 months ago

@Susko3 yes, gendynapi.py isn't parsing some prototypes correctly. I wrote it, and I saw there were these strange macros around a few functions. I didn't know that was Microsoft SAL or similar. Since those dynapi entries were already written, I didn't try to fix the parser or the prototypes.

I suggest that maybe we could just fix the prototypes, so that they look standard.

About the readable API definitions:

those are my suggestions, but please double-check with @slouken

Lucretia commented 5 months ago

This SAL stuff just looks like EXTRA COMPLICATIONS. The aim here should be that the SDL3 C headers can ALSO be generated from the description, and you really want something EASY to parse.
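To show how little code the "generate the C headers from the description" direction needs when the description is easy to parse, here is a Python sketch. The dictionary layout loosely follows the TOML idea earlier in the thread, but every field name in it is an assumption, not an agreed SDL schema; `DECLSPEC` and `SDLCALL` are the macros seen in the existing prototypes.

```python
# Hypothetical description; field names are assumptions, not an SDL schema.
API = {
    "interface": "SDL",
    "types": {"init_flags": "int"},  # <typename> -> C type
    "functions": [
        {"name": "Init", "return": "int",
         "parameters": [{"name": "flags", "type": "init_flags"}]},
        {"name": "Quit", "return": "void", "parameters": []},
    ],
}


def c_prototype(api, fn):
    """Build one C declaration; abstract type names are mapped to C types."""
    types = api["types"]
    params = ", ".join(
        f'{types.get(p["type"], p["type"])} {p["name"]}' for p in fn["parameters"]
    ) or "void"
    ret = types.get(fn["return"], fn["return"])
    return f'extern DECLSPEC {ret} SDLCALL {api["interface"]}_{fn["name"]}({params});'


def c_header(api):
    return "\n".join(c_prototype(api, fn) for fn in api["functions"])


print(c_header(API))
```

The same `API` structure could feed a second function that emits Zig, Ada or C# declarations, which is the point of keeping C syntax out of the description.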

Odex64 commented 1 month ago

As stated by other folks, C does not provide enough information (especially regarding pointers) to make this possible. A possible solution is to annotate functions (see #9907) in order to know more about them, but I'm not sure if it will ever happen - another workaround would be to assert each function that takes pointers and "guess" additional information whether the function executed correctly or not.

slouken commented 1 month ago

I think the most workable plan is to create a separate API definition file (XML? something else?) that lives in src/dynapi that contains more robust annotations for the functions. gendynapi.py could even warn if the file is missing annotations for any new SDL API function.
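The warning step could be as small as a set difference. The Python sketch below is not taken from gendynapi.py; the function names and the shape of the annotation data are assumptions made for illustration, and a real run would read both inputs from files.

```python
import sys


def missing_annotations(exported, annotations):
    """Return exported function names that have no annotation entry."""
    return sorted(set(exported) - set(annotations))


def warn_missing(exported, annotations, out=sys.stderr):
    """Print one warning per unannotated function; return how many there were."""
    missing = missing_annotations(exported, annotations)
    for name in missing:
        print(f"warning: no API annotation for {name}", file=out)
    return len(missing)


# Stand-in data (assumption): names found in the headers vs. annotated names.
exported = ["SDL_GetDisplays", "SDL_GetDisplayName", "SDL_wcslcpy"]
annotations = {"SDL_GetDisplays": {"returns": "zero-terminated array"}}

warn_missing(exported, annotations)
```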

Lucretia commented 1 month ago

God! Not XML, it's a pain to parse. Just look at Khronos' mess.

Lokathor commented 1 month ago

I've written a parser for the GL and VK xml files more than once. It's really not bad at all. You can write it like once during an afternoon and then it'll just work until the shape of the xml itself updates. The only thing that makes it annoying is that GL and VK stick raw C code into parts of the xml, so anyone not using C has to try harder to interpret that part for their own language. As long as SDL doesn't try to put chunks of raw C into the xml it would be fine.

Lucretia commented 1 month ago

If you have an XML library where you DON'T have to write a state machine parser around it, then fine, it's easy. Otherwise, it's not. I've had to do that for Khronos' shit. And yes, I am correct there: they broke the number one rule of language-agnostic IDLs by embedding C macros and headers inside that XML.

Lokathor commented 1 month ago

I just wrote the state machine of the tree into the call stack.

Regardless of those details, we do seem to agree on how an XML version, if made, would need to be done to make it easy to use: no C code embedded in the XML.
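For a sense of what "no C code embedded in the XML" buys, here is a Python sketch that reads such a file with the standard library's ElementTree. The schema is entirely hypothetical: the element names and the `pointer`, `terminator`, `nullable`, `free` and `direction` attributes are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema; element and attribute names are assumptions.
XML = """
<api interface="SDL">
  <function name="SDL_GetDisplays">
    <return type="SDL_DisplayID" pointer="many" terminator="0"
            nullable="true" free="SDL_free"/>
    <param name="count" type="int" pointer="one" direction="out"/>
  </function>
</api>
"""


def load_functions(text):
    """Turn the XML into plain dictionaries keyed by function name."""
    root = ET.fromstring(text)
    functions = {}
    for fn in root.iter("function"):
        functions[fn.get("name")] = {
            "return": dict(fn.find("return").attrib),
            "params": [dict(p.attrib) for p in fn.findall("param")],
        }
    return functions


funcs = load_functions(XML)
print(funcs["SDL_GetDisplays"]["return"]["free"])       # SDL_free
print(funcs["SDL_GetDisplays"]["params"][0]["pointer"])  # one
```

Because every fact is an attribute rather than a C fragment, no consumer has to parse declarators such as `int* const *`.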

Lucretia commented 1 month ago

I would rather it was JSON; then it can be read in easily as a DB which can be queried easily.

crystalthoughts commented 2 weeks ago

JSON is the obvious one, I think; Godot does it that way, for reference. Using the AST output as a base and manually clarifying the ambiguous parts is probably the best way to get to v1?
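A minimal Python sketch of that base-plus-clarifications approach, assuming the AST dump has already been reduced to one dictionary per function. The key names (`return_kind`, `nullable`, `free_with`) are hypothetical, and `apply_overrides` is an illustrative helper, not an existing tool.

```python
import copy
import json


def apply_overrides(base, overrides):
    """Return a copy of `base` with manual per-function annotations merged in."""
    merged = copy.deepcopy(base)
    for name, patch in overrides.items():
        if name not in merged:
            # Catches overrides left behind after a function is renamed or removed.
            raise KeyError(f"override for unknown function: {name}")
        merged[name].update(patch)
    return merged


# What an AST dump can know: plain C types only (stand-in data).
base = {
    "SDL_GetDisplays": {
        "return": "SDL_DisplayID *",
        "params": [{"name": "count", "type": "int *"}],
    }
}

# What a human has to add by hand (hypothetical keys).
overrides = {
    "SDL_GetDisplays": {
        "return_kind": "zero_terminated_array",
        "nullable": True,
        "free_with": "SDL_free",
    }
}

print(json.dumps(apply_overrides(base, overrides), indent=2))
```

Keeping the overrides in their own file means the base can be regenerated from the headers at any time without losing the manual clarifications.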