libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License
9.98k stars 1.84k forks source link

SDL return code -> SDL_bool? #10575

Closed slouken closed 2 months ago

slouken commented 2 months ago

I have a terrible idea...

What if we change int SDL_Function(), returning 0 for success and -1 for failure, to SDL_bool SDL_Function(), returning... SDL_TRUE for success and SDL_FALSE for failure?

@libsdl-org/a-team, thoughts?

slouken commented 2 months ago

For context, this would affect 327 functions in the public API.

sezero commented 2 months ago

I have a terrible idea...

It is as you said...

madebr commented 2 months ago

Functions that return a count/number will still need to report failure through a negative value. So there will always be functions that return negative values.

I think negative values are not that bad. By using multiple negative values, you can even return information about the error (but that didn't work out).

slouken commented 2 months ago

This is so good...

diff --git a/test/testaudioinfo.c b/test/testaudioinfo.c
index a32268761..23a56261c 100644
--- a/test/testaudioinfo.c
+++ b/test/testaudioinfo.c
@@ -37,7 +37,7 @@ print_devices(SDL_bool recording)
                 SDL_Log("  %d Error: %s\n", i, SDL_GetError());
             }

-            if (SDL_GetAudioDeviceFormat(devices[i], &spec, &frames) == 0) {
+            if (SDL_GetAudioDeviceFormat(devices[i], &spec, &frames)) {
                 SDL_Log("     Sample Rate: %d\n", spec.freq);
                 SDL_Log("     Channels: %d\n", spec.channels);
                 SDL_Log("     SDL_AudioFormat: %X\n", spec.format);
@@ -94,24 +94,24 @@ int main(int argc, char **argv)
     print_devices(SDL_FALSE);
     print_devices(SDL_TRUE);

-    if (SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_PLAYBACK, &spec, &frames) < 0) {
-        SDL_Log("Error when calling SDL_GetAudioDeviceFormat(default playback): %s\n", SDL_GetError());
-    } else {
+    if (SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_PLAYBACK, &spec, &frames)) {
         SDL_Log("Default Playback Device:\n");
         SDL_Log("Sample Rate: %d\n", spec.freq);
         SDL_Log("Channels: %d\n", spec.channels);
         SDL_Log("SDL_AudioFormat: %X\n", spec.format);
         SDL_Log("Buffer Size: %d frames\n", frames);
+    } else {
+        SDL_Log("Error when calling SDL_GetAudioDeviceFormat(default playback): %s\n", SDL_GetError());
     }

-    if (SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_RECORDING, &spec, &frames) < 0) {
-        SDL_Log("Error when calling SDL_GetAudioDeviceFormat(default recording): %s\n", SDL_GetError());
-    } else {
+    if (SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_RECORDING, &spec, &frames)) {
         SDL_Log("Default Recording Device:\n");
         SDL_Log("Sample Rate: %d\n", spec.freq);
         SDL_Log("Channels: %d\n", spec.channels);
         SDL_Log("SDL_AudioFormat: %X\n", spec.format);
         SDL_Log("Buffer Size: %d frames\n", frames);
+    } else {
+        SDL_Log("Error when calling SDL_GetAudioDeviceFormat(default recording): %s\n", SDL_GetError());
     }

     SDL_Quit();
diff --git a/test/testsurround.c b/test/testsurround.c
index 7796001e4..23305b9c7 100644
--- a/test/testsurround.c
+++ b/test/testsurround.c
@@ -196,7 +196,7 @@ int main(int argc, char *argv[])

         SDL_Log("Testing audio device: %s\n", devname);

-        if (SDL_GetAudioDeviceFormat(devices[i], &spec, NULL) != 0) {
+        if (!SDL_GetAudioDeviceFormat(devices[i], &spec, NULL)) {
             SDL_LogError(SDL_LOG_CATEGORY_APPLICATION, "SDL_GetAudioDeviceFormat() failed: %s\n", SDL_GetError());
             continue;
         }

I'm going to take a run at this, with the matching sdl2-compat changes in tandem.

madebr commented 2 months ago

While you're working on this, I just tried applying this patch and found some issues worth addressing

--- a/include/SDL3/SDL_stdinc.h
+++ b/include/SDL3/SDL_stdinc.h
@@ -210,7 +210,8 @@ void *alloca(size_t);
  * \sa SDL_TRUE
  * \sa SDL_FALSE
  */
-typedef int SDL_bool;
+#include <stdbool.h>
+typedef bool SDL_bool;

 /**
  * A signed 8-bit integer type.
Semphriss commented 2 months ago

My 2¢: Based on the various SDL-using projects I've encountered in my life, people will use a plethora of ways of checking the return code, including but probably not limited to:

All of those have the same behavior when the only two possible return values are 0 and -1, but they react differently should there be any change to the API, e. g. returning different negative values for errors (Method 2 will fail) or returning positive values. SDL3's documentation seemed clear that it's "0 on success, negative on error", but I remember some SDL2 functions stating "0 on success, -1 on error", and many people implemented it that way.

There's also the somewhat confusing:

if (SDL_Foo()) {
    // Actually, SDL_Foo has _failed_ if it returns a truthy value!
}

... which would be, IMO, improved with the use of a boolean:

if (SDL_Foo() == SDL_FALSE) {
    // SDL_Foo failed
}

// or

if (!SDL_Foo()) {
    // SDL_Foo failed
}
sezero commented 2 months ago

While you're working on this, I just tried applying this patch and found some issues worth addressing

What are the issues?

--- a/include/SDL3/SDL_stdinc.h
+++ b/include/SDL3/SDL_stdinc.h
@@ -210,7 +210,8 @@ void *alloca(size_t);
  * \sa SDL_TRUE
  * \sa SDL_FALSE
  */
-typedef int SDL_bool;
+#include <stdbool.h>
+typedef bool SDL_bool;

 /**
  * A signed 8-bit integer type.

If this is only to experiment, then it's OK. Otherwise we're changing sizeof SDL_bool from 4 to 1: is that something we really want?

madebr commented 2 months ago

While you're working on this, I just tried applying this patch and found some issues worth addressing

What are the issues?

See https://github.com/libsdl-org/SDL/pull/10578

madebr commented 2 months ago

Should we make SDL_bool an unsigned int as part of "SDL return code -> SDL_bool? Then compilers will warn about doing a< 0` comparison against an unsigned number, which will always return false. That's how I found the vulkan issue in #10578.

slouken commented 2 months ago

Should we make SDL_bool an unsigned int as part of "SDL return code -> SDL_bool? Then compilers will warn about doing a< 0` comparison against an unsigned number, which will always return false. That's how I found the vulkan issue in #10578.

No, it need to be compatible with the C boolean expression value, which is defined to be an int.

andreasgrabher commented 2 months ago

I dislike this idea. It will require lots of unnecessary porting efforts.

slouken commented 2 months ago

@libsdl-org/a-team, I'm temporary locking the main branch while I make some wide-spread cleanups. I'll ping here when I'm done.

slouken commented 2 months ago

@libsdl-org/a-team, I'm temporary locking the main branch while I make some wide-spread cleanups. I'll ping here when I'm done.

Cleanups are done, main is unlocked. I'm going to start prototyping this and see how it goes. :)

flibitijibibo commented 2 months ago

For the internal bool/C++ comment changes, was this scripted somehow? We were able to rebase the GPU branch but we still need to do this conversion for GPU specifically.

madebr commented 2 months ago

For the internal bool/C++ comment changes, was this scripted somehow? We were able to rebase the GPU branch but we still need to do this conversion for GPU specifically.

I think so. slouken documented the commands in the commit message: https://github.com/libsdl-org/SDL/commit/6501e90018ab1d58047e5b4382c6bc6d04051191

slouken commented 2 months ago

For the internal bool/C++ comment changes, was this scripted somehow? We were able to rebase the GPU branch but we still need to do this conversion for GPU specifically.

For the bool conversion, I used sed to convert SDL_bool to bool, SDL_TRUE to true, and SDL_FALSE to false, and then went back and corrected the public facing API functions, which need to stay SDL_bool.

Akaricchi commented 2 months ago

I wish SDL returned actually meaningful error codes instead of just -1. This is a step backwards from that.

slouken commented 2 months ago

I wish SDL returned actually meaningful error codes instead of just -1. This is a step backwards from that.

The problem is that you have to choose between generic error codes (EINVAL) and go look at the code to see what that might mean, or every error site has a unique code to tell you what the low level code ran into that might be an issue (which includes all of the Windows error codes, POSIX errno values, etc.)

The SDL error string provides a flexible way of getting error messages that might be meaningful to the application without additional spelunking, and a specific string that can be searched for in the source, if it's not immediately clear.

Akaricchi commented 2 months ago

I wish SDL returned actually meaningful error codes instead of just -1. This is a step backwards from that.

The problem is that you have to choose between generic error codes (EINVAL) and go look at the code to see what that might mean, or every error site has a unique code to tell you what the low level code ran into that might be an issue (which includes all of the Windows error codes, POSIX errno values, etc.)

You can have both generic error codes and more specific ones, maybe even living under the same enum. Just add more as necessary. I believe the vast majority of cases (if not all of them) can be covered with generic codes, and you should document the meaning of the possible return values in the function's doc comment anyway — no need to look at the code to figure out what an error code means. I don't see the problem here.

The SDL error string provides a flexible way of getting error messages that might be meaningful to the application without additional spelunking, and a specific string that can be searched for in the source, if it's not immediately clear.

Those can coexist with error codes. The error string is a human-readable thing to log for debugging, the error code lets the programmer handle distinct error conditions differently.

Here's a good example from the scrapped SDL_FSops PR. The lack of distinct error codes has lead to a misguided design like this:

int SDL_SYS_FSremove(SDL_FSops *fs, const char *fullpath)
{
    int rc = remove(fullpath);
    if (rc < 0) {
        const int origerrno = errno;
        if (origerrno == ENOENT) {
            char *parent = SDL_strdup(fullpath);
            if (!parent) {
                return -1;
            }
            char *ptr = SDL_strrchr(parent, '/');
            if (ptr) {
                *ptr = '\0';  // chop off thing we were removing, see if parent is there.
                struct stat statbuf;
                rc = stat(parent, &statbuf);
                if (rc == 0) {
                    SDL_free(parent);
                    return 0;  // it's already gone, and parent exists, consider it success.
                }
            }
            SDL_free(parent);
        }
        return SDL_SetError("Can't remove path: %s", strerror(origerrno));
    }
    return 0;
}

This code is frankly terrible, and I'm very glad it didn't make it in. It does a lot more than is asked of it, including extraneous system calls and memory allocation, in an attempt to cover up a valid result just to adhere to SDL's naive error reporting style. A bad attempt at that, as this is definitely not an atomic operation. The result is a less efficient, less useful, messier implementation than what it could've been if the function was allowed to return a meaningful error code. The underlying OS API is designed like this for a good reason.

You definitely don't have to distinguish between ALL possible failure conditions; apply some discretion and simplify them down to the cases the clients are likely to care about. Maybe you'll end up with most of the functions returning one generic error code, and that's fine. But not having such a system in place at all is bad design, IMO.

icculus commented 2 months ago

This code is frankly terrible, and I'm very glad it didn't make it in.

Another satisfied customer.

This function is complicated because it's trying to recover after an error code could mean two different things, which is a strong argument against that sort of error reporting, fwiw.

We discussed this at the time that code was written, specifically about filesystem APIs: almost every error returned by SDL is unrecoverable, so there's nothing you can do but show a human-readable message and either go on without whatever, or abort, etc.

The sticking point was that filesystems are different, because sometimes you can correct for some cases--hence the tapdancing in that code--but mostly you're eventually going to have to tell the user "out of disk space," or "permission denied," and they are going to have to correct it elsewhere and try again.

slouken commented 2 months ago

So it turns out that the impact on application code isn't that big, mostly just on initialization, and there's a coccinelle patch in the commit to help people migrate.

Done!

Sackzement commented 2 months ago

I like the simplification of the return value and therefore a cleaner code.

But there are two points, which aren't that great:

I have a suggestion, which would let the new code stay similarly clean and resolve these two points:

Change the return type back to int, make the success value 0 again, BUT specify the failure value to "non-0".

Before SDL_bool change:

 * \returns 0 on success or a negative error code on failure; call
 *          SDL_GetError() for more information.

After SDL_bool change:

 * \returns SDL_TRUE on success or SDL_FALSE on failure; call
 *          SDL_GetError() for more information.

My suggestion 1:

 * \returns 0 on success or any other value on failure; call
 *          SDL_GetError() for more information.

My suggestion 2:

 * \returns 0 on success or non-0 on failure; call
 *          SDL_GetError() for more information.

Here is the pretty code but with int as a return type again: (Only exclamation mark changes)

diff --git a/test/testaudioinfo.c b/test/testaudioinfo.c
index 15b124bfb..e6f9a9941 100644
--- a/test/testaudioinfo.c
+++ b/test/testaudioinfo.c
@@ -37,7 +37,7 @@ print_devices(SDL_bool recording)
                 SDL_Log("  %d Error: %s\n", i, SDL_GetError());
             }

-            if (SDL_GetAudioDeviceFormat(devices[i], &spec, &frames)) {
+            if (!SDL_GetAudioDeviceFormat(devices[i], &spec, &frames)) {
                 SDL_Log("     Sample Rate: %d\n", spec.freq);
                 SDL_Log("     Channels: %d\n", spec.channels);
                 SDL_Log("     SDL_AudioFormat: %X\n", spec.format);
@@ -94,7 +94,7 @@ int main(int argc, char **argv)
     print_devices(SDL_FALSE);
     print_devices(SDL_TRUE);

-    if (SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_PLAYBACK, &spec, &frames)) {
+    if (!SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_PLAYBACK, &spec, &frames)) {
         SDL_Log("Default Playback Device:\n");
         SDL_Log("Sample Rate: %d\n", spec.freq);
         SDL_Log("Channels: %d\n", spec.channels);
@@ -104,7 +104,7 @@ int main(int argc, char **argv)
         SDL_Log("Error when calling SDL_GetAudioDeviceFormat(default playback): %s\n", SDL_GetError());
     }

-    if (SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_RECORDING, &spec, &frames)) {
+    if (!SDL_GetAudioDeviceFormat(SDL_AUDIO_DEVICE_DEFAULT_RECORDING, &spec, &frames)) {
         SDL_Log("Default Recording Device:\n");
         SDL_Log("Sample Rate: %d\n", spec.freq);
         SDL_Log("Channels: %d\n", spec.channels);
diff --git a/test/testsurround.c b/test/testsurround.c
index d0d8ff8a8..bc45bc35d 100644
--- a/test/testsurround.c
+++ b/test/testsurround.c
@@ -196,7 +196,7 @@ int main(int argc, char *argv[])

         SDL_Log("Testing audio device: %s\n", devname);

-        if (!SDL_GetAudioDeviceFormat(devices[i], &spec, NULL)) {
+        if (SDL_GetAudioDeviceFormat(devices[i], &spec, NULL)) {
             SDL_LogError(SDL_LOG_CATEGORY_APPLICATION, "SDL_GetAudioDeviceFormat() failed: %s\n", SDL_GetError());
             continue;
         }

Any thoughts?

slouken commented 2 months ago

This same ambiguity applied to SDL2 functions that returned an int. The whole point of this change was to make code easier to read and understand, and it's much easier for new users to read if (SDL_Function()) { // success.... The functions that return information as a boolean are named in a way that makes sense for boolean logic, e.g. SDL_WindowHasSurface(), SDL_HasRectIntersection(). If there's any doubt, they're clearly documented in the header and wiki.