davechallis / ocypod

Ocypod is a Redis-backed service for orchestrating background jobs. Clients/workers can be written in any language, using HTTP/JSON to queue/fetch jobs, store results, etc.
Apache License 2.0

Notifications on new jobs / long poll #19

Open Frando opened 3 years ago

Frando commented 3 years ago

Hi, if I read the code and docs correctly, there's currently no way for a worker/client to wait or long-poll for new jobs. queue/{queue}/job always returns immediately, either with a popped job or with a No Content response (in the latter case optionally applying a per-queue delay setting). This means the worker has to issue this request in a loop, with a delay between requests. What do you think of an (optional) parameter on queue/{queue}/job that instead keeps the connection open until a new job arrives? I would guess that on the Redis side this should be possible by using BRPOPLPUSH in place of RPOPLPUSH.
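For illustration, the client-side loop described above might look roughly like the following. This is a minimal sketch, not code from Ocypod or any official client: BASE_URL, fetch_job and poll_for_job are hypothetical names, and it assumes the endpoint answers 200 with a JSON job or 204 No Content when the queue is empty.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Hypothetical server address (assumption); adjust to wherever Ocypod listens.
BASE_URL = "http://localhost:8023"


def fetch_job(queue: str) -> Optional[dict]:
    """GET queue/{queue}/job; return the job as a dict, or None on 204 No Content."""
    with urllib.request.urlopen(f"{BASE_URL}/queue/{queue}/job") as resp:
        if resp.status == 204:
            return None
        return json.loads(resp.read())


def poll_for_job(
    queue: str,
    delay: float = 1.0,
    max_attempts: Optional[int] = None,
    fetch: Callable[[str], Optional[dict]] = fetch_job,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Request a job repeatedly, sleeping `delay` seconds after each empty response."""
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        job = fetch(queue)
        if job is not None:
            return job
        attempts += 1
        sleep(delay)
    # Gave up after max_attempts empty responses.
    return None
```

Every empty response costs a full HTTP round trip plus the client-side sleep, which is the overhead a long-poll option would remove.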

davechallis commented 3 years ago

Sounds like a good idea, I'll do a bit of digging to see if there would be any issues with it. This wasn't previously possible, since Ocypod used sync Redis connections, but I think it would be preferable now.

I'll have a bit of a think on how to implement it, as a blocking pop with a timeout would probably be preferable to the current implementation anyway, so might be able to make it the default in future.

davechallis commented 3 years ago

@Frando I had a quick look into this, and realised what the issue is (turns out I'd made a note on this before). Blocking Redis commands still block the async Redis connection, so calling anything like BLPOP, BRPOPLPUSH, etc. blocks the whole executor.

There are some workarounds, but they're probably a lot more complicated (e.g. would involve running the blocking calls in another thread pool to avoid blocking the main async executor).
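The thread-pool workaround can be sketched in general terms as follows. This is an illustration in Python's asyncio rather than Ocypod's actual Rust code, and it makes loud assumptions: a queue.Queue stands in for the Redis list, its get(timeout=...) stands in for a blocking pop such as BRPOPLPUSH with a timeout, and all names (jobs, blocking_pool, blocking_pop, next_job) are hypothetical.

```python
import asyncio
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

# Stand-in for a Redis list (assumption): queue.Queue.get(timeout=...) blocks
# the calling thread much like a blocking pop with a timeout would.
jobs: "queue.Queue[str]" = queue.Queue()

# Dedicated pool so blocking pops never run on the event loop thread.
blocking_pool = ThreadPoolExecutor(max_workers=4)


def blocking_pop(timeout: float) -> Optional[str]:
    """Block the current thread until a job arrives or `timeout` elapses."""
    try:
        return jobs.get(timeout=timeout)
    except queue.Empty:
        return None


async def next_job(timeout: float) -> Optional[str]:
    """Await a blocking pop without stalling other tasks on the event loop."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(blocking_pool, blocking_pop, timeout)


async def demo() -> tuple:
    # Start waiting first, then push a job from the event loop; the loop
    # stays responsive while the pop is blocked in a worker thread.
    waiter = asyncio.create_task(next_job(timeout=2.0))
    await asyncio.sleep(0.05)
    jobs.put("job-1")
    got = await waiter
    # With nothing queued, the pop times out and yields None.
    empty = await next_job(timeout=0.05)
    return got, empty
```

The cost is that each waiting client ties up one pool thread for the duration of its pop, so the pool size caps the number of concurrent waiters.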

Another option I'd previously considered was having the server loop and retry when it finds no jobs in the queue (to save the overhead of a roundtrip back to the client), with some configurable internal poll interval. To the client, this would look a lot like the blocking approach.

We'd still probably want to always keep a timeout on those calls before No Content was returned to the clients though, mostly to keep HTTP connections fresh (very early versions of Ocypod didn't do this, and frequently ran into issues with keeping very long HTTP connections open, due to client/server disconnects, proxy/load balancer timeouts, TCP timeouts in Docker swarm, etc.).
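A server-side retry loop with an internal poll interval and an overall timeout could be sketched as below. Again this is only an illustration in Python's asyncio, not Ocypod's implementation: wait_for_job, try_pop, poll_interval and timeout are hypothetical names, and try_pop stands in for a single non-blocking pop against Redis.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def wait_for_job(
    try_pop: Callable[[], Awaitable[Optional[str]]],
    poll_interval: float,
    timeout: float,
) -> Optional[str]:
    """Retry a non-blocking pop every `poll_interval` seconds until `timeout`.

    Returns the job, or None, which an HTTP handler would map to No Content.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = await try_pop()
        if job is not None:
            return job
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        # Never sleep past the deadline, so the response time stays bounded
        # by `timeout` and the HTTP connection is not held open indefinitely.
        await asyncio.sleep(min(poll_interval, remaining))
```

Because each attempt is a short non-blocking call, nothing here blocks the executor, and the worst-case latency between a job arriving and a client receiving it is roughly one poll interval.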

joe-at-startupmedia commented 2 years ago

If I'm not mistaken, this is partially solved by next_job_delay, which has a default configuration value of 5s. As such, polling an empty queue does not return immediately; instead it returns after 5s, or whatever configuration value is specified.

https://github.com/davechallis/ocypod/blob/d57e6dd4db5433c4fc45d9d76573bacf86c45c5f/src/config.rs#L132