janet-lang / janet

A dynamic language and bytecode vm
https://janet-lang.org
MIT License
3.43k stars 221 forks source link

Ability to execute code upon SIGINT and SIGTERM signals. #1262

Closed amano-kenji closed 1 year ago

amano-kenji commented 1 year ago

defer and edefer do not handle SIGINT and SIGTERM.

I'd like to handle these signals in my programs.

bakpakin commented 1 year ago

Related: https://github.com/janet-lang/janet/issues/1030

We don't have support for signal handlers in the core. Given the continued addition of posix only functionality, and the addition of a signal argument to os/proc-kill, this would be a good addition.

sogaiu commented 1 year ago

Perhaps known already, but for reference, there was some related work here: https://github.com/andrewchambers/janet-ctrl-c

bakpakin commented 1 year ago

Pushed to the sigaction branch with an initial implementation of (os/sigaction). This is a generalization of the work from spork/rawterm used for handling SIGWINCH.

Still hashing out the exact interface though, and signals should play well with the rest of Janet (threads, sandbox, ev, busy loops, etc.). I think a binding for sigprocmask/pthread_sigmask may also be warranted.

sogaiu commented 1 year ago

So it looks like there is an example.

Adapting that:

(defn action []
  (print "Handled SIGINT, waiting 5 seconds before...")
  (os/sleep 5)
  (os/exit 1))

(defn main
  [_]
  (os/sigaction :int action true)
  (forever))

Trying it out:

$ ./build/janet ~/scratch/sigaction-sigint.janet 

Followed by Ctrl-C:

^CHandled SIGINT, waiting 5 seconds before...

After a brief wait I got a prompt and then:

$ echo $?
1

@amano-kenji Is it feasible for you to give the code a try to see if it works for the kind of situation you have in mind?

amano-kenji commented 1 year ago

I just tried to use os/sigaction. The interface isn't perfect, but this is good enough.

There is one little problem which is lack of proper documentation.

If (doc os/sigaction) is more properly documented, I think this issue can be closed.

amano-kenji commented 1 year ago

I found one issue.

(defn action []
  (print "Handled SIGINT!")
  (os/exit 1))

(defn main [_]
  # Set the interrupt-interpreter argument to `true` to allow
  # interrupting the busy loop `(forever)`. By default, will not
  # interrupt the interpreter.
  (os/sigaction :int action true)
  (forever))

I can't close this script with Ctrl+c. The script is stuck (forever).

os/sigaction also doesn't respond to :term signal.

sogaiu commented 1 year ago

I didn't have luck with SIGTERM / :term either.

However, the posted sample script (for SIGINT / :int) seems to work here (Ubuntu Linux):

$ cat ak.janet 
(defn action []
  (print "Handled SIGINT!")
  (os/exit 1))

(defn main [_]
  # Set the interrupt-interpreter argument to `true` to allow
  # interrupting the busy loop `(forever)`. By default, will not
  # interrupt the interpreter.
  (os/sigaction :int action true)
  (forever))

$ janet ak.janet 
^CHandled SIGINT!

$ janet -v
1.30.0-ee01045d
amano-kenji commented 1 year ago

I'm on gentoo linux.

primo-ppcg commented 1 year ago

However, the posted sample script (for SIGINT / :int) seems to work here (Linux):

Also works on Debian 12.

sogaiu commented 1 year ago

The script also worked with Void Linux and NetBSD (janet commit: 70a467d4).

bakpakin commented 1 year ago

SIGINT and SIGTERM both work for me. Btw, I don't recommend using the interrupt interpreter example in most cases, instead it is better to change the busy loop to something like (forever (ev/sleep 100)). The interrupt interpreter is sort of a last resort to make sure cleanup code runs (so I guess makes sense for sigint?), but doing anything safely is hard because the current executing fiber can be interrupted at any random instruction instead of the usual places where control is yielded to the event loop

sogaiu commented 1 year ago

SIGINT and SIGTERM both work for me.

I've tried the following for SIGTERM and I haven't had success so far. The script exits and I see no message printed.

(defn action
  []
  (print "Handled SIGTERM!")
  (flush)
  (os/sleep 1000)
  (os/exit 1))

(defn main
  [_]
  (os/sigaction :term action true)
  (forever (ev/sleep 100)))

Does the code look problematic? Or perhaps the behavior is explained by:

The interrupt interpreter is sort of a last resort to make sure cleanup code runs (so I guess makes sense for sigint?), but doing anything safely is hard because the current executing fiber can be interrupted at any random instruction instead of the usual places where control is yielded to the event loop

amano-kenji commented 1 year ago

What is interrupt interpreter?

sogaiu commented 1 year ago

Possibly some hints from these lines of the example:

  # Set the interrupt-interpreter argument to `true` to allow
  # interrupting the busy loop `(forever)`. By default, will not
  # interrupt the interpreter.
amano-kenji commented 1 year ago

According to my experiment, these cannot be interrupted by signal handler.

But, os/sigaction can handle

bakpakin commented 1 year ago

So the interrupt-interpreter option will only interrupt the Janet virtual machine, not arbitrary machine code. Basically, if you have a loop like:

(var a 0)
(forever
  (++ a))

this loop will block events from being handled forever (unless you manually add something like (ev/sleep 0) inside the loop). The interrupt interpreter argument sets a flag that Janet checks on all backwards jumps and function calls that will automatically force it to yield to the event loop to handle events, like a signal. However, this might break some guarantees about your code executing atomically unless you are clever/careful to avoid backwards jumps and function calls in code that needs to be "atomic" in the interpreter.

amano-kenji commented 1 year ago

Python has it. Haskell has it. Raku has it. Why is it dangerous in janet?

Is it still safe to interrupt ev/sleep?

bakpakin commented 1 year ago

It's not "dangerous", it is perfectly memory safe either way. It's just not usually needed and can lead to confusing behavior, and things might not be handled in the order that you expect.

I've added a more complete example that has 4 cases the should help illustrate what is going on here. Python's signals do not interact well with asyncio, and I don't know anything about Raku or Haskell here.

amano-kenji commented 1 year ago

I just read https://github.com/janet-lang/janet/commit/21eab7e9ccbb6b18d7f5ca018f84810ad0431f81

main1 doesn't work when I send it TERM signal. But, it quits upon SIGINT.

It seems only ev/sleep is interruptible by os/sigaction.

bakpakin commented 1 year ago

Works on my machine :tm: . Anyway, that is strange, I'm going to let this feature cook a little more and see if I can make it work better / clean-up the scheduling code to be a bit more straight-forward. If it is not satisfactory I may need to remove the interpreter interrupt functionality but I think it is a useful tool to have.

sogaiu commented 1 year ago

I got the expected results:

So it seems to be working as advertised here.

I used kill -TERM <pid> to test. Is anyone using a different method to test?

I wasn't doing so much earlier in the issue and perhaps that explains why it seemed SIGTERM wasn't working for me (when perhaps it might have).

amano-kenji commented 1 year ago

I just read meson_options.txt which has

option('interpreter_interrupt', type : 'boolean', value : false)

So, interpreter interrupt is disabled by default.

amano-kenji commented 1 year ago

After executing

meson setup build -Dinterpreter_interrupt=true ...
ninja -C build

examples/sigaction.janet works as expected. The following code also works.

(defn handler
  []
  (print "Handled SIGINT")
  (os/exit 1))

(defn main
  [_]
  (os/sigaction :int handler true)
  (forever))

Lesson: Don't mess with meson.

bakpakin commented 1 year ago

Made some improvements to the implementation and fixed the default meson build options. Closing this as now implemented.