jondot / sneakers

A fast background processing framework for Ruby and RabbitMQ
https://github.com/jondot/sneakers
MIT License
2.25k stars, 332 forks

Pidfiles per Sneakers Process #334

Open Tensho opened 6 years ago

Tensho commented 6 years ago

At the moment sneakers relies on the serverengine gem to act as a multiprocess server (worker_type: process, in non-supervisor mode). Let's run a small experiment to better understand how it works. Consider the following configuration for serverengine (almost all of these options are described in its docs):

require 'serverengine'

# AdvancedWorker is the worker module (its definition is not shown here).
ServerEngine.create(nil, AdvancedWorker, {
  daemonize: true,
  log: 'server.log',
  pid_path: 'server.pid',
  server_process_name: 'se-server',
  worker_process_name: 'se-worker',
  worker_type: 'process',
  workers: 2,
}).run

After launch, the process list looks like this:

$ ps x | grep -v grep | grep -E "(se-server|se-worker)"
19963   ??  S      0:00.10 se-server
19964   ??  S      0:00.04 se-worker
19965   ??  S      0:00.04 se-worker

The pidfile contains the PID of the server process, 19963, as expected:

$ cat server.pid
19963

If you kill any worker, the server's ServerEngine::ProcessManager will respawn it:

$ kill -9 19965
$ ps x | grep -v grep | grep -E "(se-server|se-worker)"
19963   ??  S      0:00.23 se-server
19964   ??  S      0:00.10 se-worker
20344   ??  S      0:00.01 se-worker

As you can see, the unceremoniously killed worker process 19965 was replaced by worker process 20344. The system is rather tenacious. Everything is good, everything is clear.
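To make the respawn behaviour concrete, here is a minimal sketch of a supervisor that forks a fixed number of workers and replaces any that die. It is not ServerEngine's code; the TinySupervisor class and its method names are hypothetical and only illustrate the general fork-and-wait pattern.

```ruby
# Hypothetical sketch of a respawning supervisor (not ServerEngine's implementation).
class TinySupervisor
  attr_reader :pids

  # Fork `count` workers, each running the given block.
  def initialize(count, &work)
    @work = work
    @pids = Array.new(count) { fork(&@work) }
  end

  # Block until any worker exits, then fork a replacement.
  # Returns [dead_pid, replacement_pid].
  def reap_and_respawn
    dead, _status = Process.wait2(-1)
    @pids.delete(dead)
    fresh = fork(&@work)
    @pids << fresh
    [dead, fresh]
  end

  # Terminate and reap all remaining workers.
  def shutdown
    @pids.each { |pid| Process.kill('TERM', pid) rescue nil }
    @pids.each { |pid| Process.wait(pid) rescue nil }
    @pids.clear
  end
end
```

Calling reap_and_respawn in a loop gives the behaviour seen above: the worker count stays constant while the PIDs change. Note that the worker PIDs live only in the supervisor's memory, so nothing outside it knows about them.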

But there is one problem here, which stems from the eternal question "Who will monitor the monitor?". For example, if I decide to protect only the server process from failures and put it under monit, I can end up with obsolete (orphaned) workers. Monit will start a new server process with a new set of worker processes and will not care about the old worker processes, because it has no PID information about them. serverengine (and consequently sneakers) has only one pidfile, containing the PID of the server process. So it would be nice to give the worker processes their own pidfiles, as Sidekiq does, for example. Then it would be possible to clean up obsolete worker processes. Sidekiq, however, was designed from the start to be monitored from the outside: there is no supervising or health-checking logic inside its worker processes. Each Sidekiq process has its own pidfile with an index suffix, so it's easy to put them under monit.
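As a rough illustration of the per-worker pidfile idea, here is a sketch using only the Ruby standard library. The WorkerPidfiles module, its method names and the se-worker.N.pid naming scheme are all hypothetical; neither serverengine nor sneakers provides anything like this as far as this thread shows.

```ruby
require 'fileutils'

# Hypothetical helpers for per-worker pidfiles (not part of serverengine or sneakers).
module WorkerPidfiles
  module_function

  # Pidfile path for the worker at `index`, e.g. "<dir>/se-worker.0.pid".
  def path(dir, index)
    File.join(dir, "se-worker.#{index}.pid")
  end

  # Each worker would call this right after it is forked.
  def write(dir, index, pid = Process.pid)
    FileUtils.mkdir_p(dir)
    File.write(path(dir, index), "#{pid}\n")
  end

  # Send `signal` to `pid`; returns false if no such process exists.
  def signal_pid(pid, signal)
    Process.kill(signal, pid)
    true
  rescue Errno::ESRCH
    false
  end

  # Signal every worker recorded in `dir` and delete the pidfiles.
  # Returns the PIDs that were actually signalled.
  def reap_stale(dir, signal: 'TERM')
    pids = Dir.glob(File.join(dir, 'se-worker.*.pid')).sort.map do |file|
      pid = File.read(file).to_i
      File.delete(file)
      # Skip garbage: PID 0 would signal our own process group.
      next unless pid > 0 && signal_pid(pid, signal)
      pid
    end
    pids.compact
  end
end
```

A new server process (or a monit hook) could call WorkerPidfiles.reap_stale before starting its own workers. One caveat: a PID left in an old pidfile may have been reused by an unrelated process, so a real implementation would need to verify the process identity before signalling.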

I'm sorry that my questions are addressed mostly to serverengine rather than to sneakers. I guess the authors of sneakers had strong reasons to depend on serverengine, and know how to handle monitoring in general and the case described above in particular. Please share your thoughts 🙇

jondot commented 6 years ago

@Tensho for this specific reason I don't daemonize: I run in the foreground and use a dedicated init system to do this for me. What you're describing is the double-fork + daemonization problem, and yes, it has these exact pitfalls.

Tensho commented 6 years ago

@jondot Thank you for the reply 🙇 If I remember correctly, Unicorn has the same multiprocess architecture (with double fork), but the master process notifies the worker processes on termination (through a pipe), so no orphaned worker processes are left. Indeed, serverengine doesn't do this trick. If I understood you correctly, some init systems can manage processes running in the foreground as if they were daemons. I'm not well versed in the foreground vs. daemonization topic, so I would be happy if you could point me to a good resource to study it.
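The pipe idea can be sketched in a few lines of plain Ruby. This is only an illustration of the general technique, not Unicorn's actual code, and fork_with_lifeline is a hypothetical helper name. The master keeps the write end of a pipe open; each worker closes its inherited copy of the write end and watches the read end. When the master dies for any reason, including kill -9, the kernel closes the last write end and the worker sees EOF.

```ruby
# Hypothetical helper: fork a worker that exits as soon as its master disappears.
# `lifeline_r` and `lifeline_w` are the two ends of one pipe created by the master.
def fork_with_lifeline(lifeline_r, lifeline_w)
  fork do
    # The worker must not hold the write end, or EOF would never arrive.
    lifeline_w.close

    Thread.new do
      # Blocks until every write end is closed, i.e. until the master is gone.
      lifeline_r.read
      exit!(0) # leave immediately instead of lingering as an orphan
    end

    yield # the actual work of the worker
  end
end
```

In the master this would be used roughly as `r, w = IO.pipe` followed by `fork_with_lifeline(r, w) { run_worker }`, where run_worker stands for whatever the worker does. The master never needs to write to the pipe; holding it open is the whole signal.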

rept commented 4 years ago

I'm not well versed in the foreground vs. daemonization topic, so I would be happy if you could point me to a good resource to study it.

Same here.

Tensho commented 4 years ago

@rept "Advanced Programming in the UNIX Environment" by W. Richard Stevens