godotengine / godot-proposals

Godot Improvement Proposals (GIPs)
MIT License
1.14k stars 93 forks source link

Add a way to handle system signals (SIGINT, SIGHUP, …) from a script #1361

Open Kobrar opened 4 years ago

Kobrar commented 4 years ago

Describe the project you are working on: My company has a custom built hardware that we run a hierarchy of Godot applications as a sort of "operating system" on. Its a game machine in simple terms, it has a menu that spawns child game processes.

Describe the problem or limitation you are having in your project: Nothing stops child processes from spawning more processes for whatever reason (for instance). That's a problem when the menu wants to terminate the game. The game won't know it should clean up after itself before quitting.

Describe the feature / enhancement and how it helps to overcome the problem or limitation: Some sort of way to handle SIG*** in GDScript would go a long way to solve all kinds of more complex but perfectly valid use cases like this one. Signal handling isn't at all an exotic idea in software developement. Here for instance, as long as I send the application a signal it can handle internally, I could clean up any and all child processes before the application closes. I have tested on Linux things like kill -1, kill -2, kill -3, kill -15, none of them send notifications to the engine (and it obviously closes).

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams: There are a lot of ways this could work. One way could be to have a _system_notification(int SIG_CODE) method in MainLoop class. Something like this would be the most generalized form of the solution. Simpler solution would be to hook up existing handles like _finalize() or _notification() to some of the signals, but that may be less generalized. Given the already implemented functionality, it may have been intended for NOTIFICATION_WM_QUIT_REQUEST to handle kill signals in particular.

If this enhancement will not be used often, can it be worked around with a few lines of script?: There is no way that I know of. On any SIGINT, SIGHUP, SIGTERM signals the game closes with no way to execute code. MainLoop._finalize() doesn't run. Object._notification() doesn't run.

Is there a reason why this should be core and not an add-on in the asset library?: It is a fairly fundamental problem in software engineering. It also seems some of this kind of functionality is already intended to be a part of the core, like NOTIFICATION_WM_QUIT_REQUEST.

Calinou commented 4 years ago

This could be useful for dedicated servers made with Godot, so you can run a safe exit procedure when the user presses Ctrl + C. This can be used to ensure world files are saved and so on.

Is it possible to make this notification work on Windows as well? How well do UNIX signals map to their Windows counterparts?

jonbonazza commented 4 years ago

This could be useful for dedicated servers made with Godot, so you can run a safe exit procedure when the user presses Ctrl + C.

Is it possible to make this notification work on Windows as well? How well do UNIX signals map to their Windows counterparts?

Windows had counterparts to some unix signals, but not all. Interrupt does have a counterpart though.

Dragoncraft89 commented 1 year ago

The problem with signal interrupts is that they cannot do a lot. From man 7 signal-safety:

An  async-signal-safe function is one that can be safely called from within a signal handler.  Many functions are not async-signal-safe.  In par‐
       ticular, nonreentrant functions are generally unsafe to call from a signal handler.

       The kinds of issues that render a function unsafe can be quickly understood when one considers the implementation of the stdio  library,  all  of
       whose functions are not async-signal-safe.

       When  performing  buffered I/O on a file, the stdio functions must maintain a statically allocated data buffer along with associated counters and
       indexes (or pointers) that record the amount of data and the current position in the buffer.  Suppose that the main program is in the middle of a
       call  to  a stdio function such as printf(3) where the buffer and associated variables have been partially updated.  If, at that moment, the pro‐
       gram is interrupted by a signal handler that also calls printf(3), then the second call to printf(3) will operate on inconsistent data, with  un‐
       predictable results.

       To avoid problems with unsafe functions, there are two possible choices:

       1. Ensure  that  (a)  the  signal  handler calls only async-signal-safe functions, and (b) the signal handler itself is reentrant with respect to
          global variables in the main program.

       2. Block signal delivery in the main program when calling functions that are unsafe or operating on global data that is also accessed by the sig‐
          nal handler.

       Generally, the second choice is difficult in programs of any complexity, so the first choice is taken.

To sum up, the engine may be interrupted by a signal handler at any moment, which leads to race conditions even in single threaded code. If you then try to use anything from your main program you possibly work on a intermediate/broken state, which leads to crashes (at best) or hard to trace memory corruptions and weird bugs.

This means that not even calling a gdscript function is safe, no matter what it does. Furthermore, you cannot do anything to the engine code unless you explicitly take care of being interrupted by a signal. This means that I'm not even sure that the engine debugger is working correctly here, as it intercepts SIGINT already (I haven't figured out what for yet). Although it's only assigning to int variables, these read/writes need not be atomic, therefore you'd still get a datarace if these variables are ever read/written from the main program. This is probably fine for the supported architectures, as writing an int should be doable in a single instruction. However I'm not familiar enough with the implementation to see whether any other race conditions beside the data-race are possible.

To work around this limitation, it may be possible to only store which signal arrived (and take care that this is "async-signal-safe" as described above), to later dispatch them in the main loop. However this means that the signal will not be delivered at the correct time, but with a delay. This should be fine though, as the time of interrupt mostly cannot be observed by user code anyway (except in infinite loops).

For the intended use case (cleanup on Ctrl+C/program exit), this should be fine. Note though, that the windows will never send a SIGTERM to your program. However it seems that windows fires a CTRL_CLOSE_EVENT instead, which could be used instead: https://learn.microsoft.com/en-us/windows/console/handlerroutine

However this also fires when the terminal is closed, which may be suprising. On Unix this is typically done by SIGHUP instead. Maybe on Unix, SIGHUP should also generate a SIGTERM event so that the behaviour is similar on both platforms.

SIGINT and SIGTERM also seem like the only "sane" interrupts to support. For the others - SIGFPE / SIGABRT / SIGILL / SIGSEGV - the program state is probably broken in some regard already

This would open some additional implementation problems still: How to store the signals for later dispatching? Using a queue would need to disable interrupts while accessing it (This was described in option (2) of the man page). Ignoring signals invokes a system call, which may be too expensive for the main loop.

This would need some form of lock-free queue to do it efficiently, which I think is not implemented for the codebase yet.

NathanFlurry commented 7 months ago

Needed to handle this for gracefully exiting dedicated game servers. Here's a demo of how to gracefully handle SIGTERM using GDExtension – https://github.com/NathanFlurry/demo-godot-unix-exit