toland / patron

Ruby HTTP client based on libcurl
http://toland.github.com/patron/
MIT License

Respond to signals when outside the GIL #150

Closed: julik closed this issue 6 years ago

julik commented 6 years ago

Currently Patron unlocks the GIL while performing the curl call. This is really good for enabling true parallelism, but Ruby does not trap signals when a Patron call is in progress. Signals also do not get delivered once the curl call completes. This can be problematic if an application performing a long Patron call has to be stopped or restarted - kill -9 is pretty much the only option to deal with this, and it's really not very gentle. At least for the basic set of signals (SIGINT, SIGTERM) Patron should provide a way to bail out of the curl call and yield control back to Ruby. This has to be done either with an unblock function or by installing custom interrupt handlers within the patron call, obviously with care to tear them down later.

FooBarWidget commented 6 years ago

When calling rb_thread_call_without_gvl, its third parameter is rb_unblock_function_t *ubf. This is an "unblock function": Ruby will call this when a signal is sent to the app, and this function is supposed to do whatever is necessary to abort the operation that the func function is doing. You passed RUBY_UBF_IO but this is actually a no-op function, because Ruby internally has all sorts of IO cancellation mechanisms that allow it to abort even without an unblock function that does anything.

So what does your unblock function need to do? There are a couple of methods (which I haven't tested, but should work), all of which need changes beyond just writing an unblock function.

First, there is CURLOPT_READFUNCTION to set your own read function. By default libcurl uses the read() system call on a blocking socket. You can set that option to your own function which calls poll() on two file descriptors: the actual socket, as well as a "cancellation pipe". If poll() says that the actual socket is readable, read() from it. If poll() says that the cancellation pipe is readable, return CURL_READFUNC_ABORT to tell libcurl that you want to abort. The unblock function simply closes the write end of the cancellation pipe to make the read end readable. The cancellation pipe is a pipe that you create beforehand, before each curl_easy_perform executes. This is a common pattern in Unix server software for stopping an event loop.
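As a rough illustration of the cancellation pipe pattern on its own (not of the libcurl integration, which is untested here), a minimal Ruby sketch could look like the following. The helper name read_or_cancel and the pipe standing in for the real socket are assumptions made up for this example; IO.select plays the role that poll() would play in C.

```ruby
# Hypothetical helper: wait until `sock` has data or `cancel_r` becomes
# readable, whichever happens first.
def read_or_cancel(sock, cancel_r, timeout = 5)
  ready, = IO.select([sock, cancel_r], nil, nil, timeout)
  return :timeout if ready.nil?
  # The cancellation pipe wins if both are readable at once.
  return :cancelled if ready.include?(cancel_r)
  sock.read_nonblock(4096)
end

data_r, data_w = IO.pipe      # stands in for the real socket (assumption)
cancel_r, cancel_w = IO.pipe  # the cancellation pipe, created before the call

waiter = Thread.new { read_or_cancel(data_r, cancel_r) }

# What the unblock function would do: closing the write end delivers EOF,
# which makes the read end readable and wakes up the select.
cancel_w.close

puts waiter.value.inspect
```

Closing the write end is enough to wake the waiter, so the unblock function does not need to write any data or take any locks.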

The other alternative is to have an actual event loop (maybe using libev or libevent, or possibly one written by yourself) and integrate libcurl into that event loop. Your unblock function then simply cancels the event loop using whatever mechanism is appropriate for your event loop implementation. The mechanism in libcurl that allows event loop integration is libcurl-multi.

julik commented 6 years ago

@FooBarWidget thanks for your explanation! The plot thickens since I found something peculiar. I did indeed experiment with a custom ubf. If I take an implementation like this:

#include <stdio.h>
void session_ubf_abort(void* patron_state) {
  struct patron_curl_state* state = (struct patron_curl_state*) patron_state;
  printf("Aborting in unblock fun\n");
  // The next iteration of the progress callback aborts the CURL call if this
  // is set, so there is a delay, but it's not significant.
  state->interrupt = INTERRUPT_ABORT;
}

and run a test case like this:

  it "is able to terminate main thread that is running a slow request" do
    session = Patron::Session.new
    session.timeout = 40
    session.base_url = "http://localhost:9001"
    session.get("/slow")
  end

then I am able to use Ctrl+C to abort. I also do see that the unblock fun gets called (that's why the printf). The interrupt gets set for libCURL and libCURL then aborts in its progress callback function - all is nice and dandy. Doing a thread kill also works:

  it "is able to terminate the thread that is running a slow request" do
    t = Thread.new do
      session = Patron::Session.new
      session.timeout = 40
      session.base_url = "http://localhost:9001"
      session.get("/slow")
    end
    sleep 3
    t.kill
  end

I do get feedback on the terminal that the UBF did get triggered. This, however, does not work (and this most closely matches what I am after):

  it "is able to terminate the thread that is running a slow request" do
    Thread.new do
      session = Patron::Session.new
      session.timeout = 40
      session.base_url = "http://localhost:9001"
      session.get("/slow")
    end
    sleep 60 # When sleeping here Ruby is doing useful stuff, but Ctrl+C does nothing
  end

The progress callback function we set for libCURL is getting called still, so that's fine - but Ruby never calls the UBF. So whichever solution I choose from what you have outlined I don't understand how to receive the signal in the first place when I have multiple threads running :-(

FooBarWidget commented 6 years ago

A Ctrl-C only kills the main thread. If you want any other threads to be killed as well then you need to propagate that signal to those threads manually.
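One way to do that propagation, sketched here under assumptions rather than taken from Patron itself, is to trap the signal on the main thread and kill the worker threads from the handler. Since the earlier t.kill test did trigger the unblock function, this should in principle reach the curl call as well. The names slow_request and workers are hypothetical, and the self-sent SIGINT only simulates Ctrl+C on a POSIX platform.

```ruby
# Hypothetical stand-in for a blocking call such as session.get("/slow").
def slow_request
  sleep 30
end

workers = Array.new(2) { Thread.new { slow_request } }

# Ruby runs trap handlers on the main thread, so the interrupt has to be
# forwarded to every worker by hand.
Signal.trap("INT") do
  workers.each(&:kill)
end

# Simulate Ctrl+C by signalling our own process (assumption: POSIX).
Process.kill("INT", Process.pid)

# Wait for the workers to exit; the timeout is only a guard against hanging.
workers.each { |t| t.join(5) }
puts workers.map(&:alive?).inspect
```

Thread#raise with a custom exception would work in the same place if the workers need a chance to clean up instead of being killed outright.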