[FEATURE] Add option to write a pid file

hron84 commented 9 years ago

It would be good if Logstash could write a pid file when needed, perhaps enabled by a startup argument.

IngaFeick commented 8 years ago

+1

jordansissel commented 8 years ago

What would you use it for?

hron84 commented 8 years ago

To integrate it with Monit, which can track daemons via the pid read from a pid file. Regex matching is also an option, but it can be more difficult than a simple pid file.

IngaFeick commented 8 years ago

Same here. Monitoring / restarting.

jordansissel commented 8 years ago

With modern operating systems (Windows, major Linux distros, Solaris, Illumos, OS X, etc.), the service managers (systemd, upstart, smf, launchd, etc.) all provide a way to query the process state (alive, pid, etc.). I don't really feel we need pid files anymore. I am open to discussion, though.

hron84 commented 8 years ago

With really modern operating systems (Linux, BSD, OS X) you can change how you start your services and how you use them. Maybe the shipped start method works for you, maybe not. And if you choose not to use the recommended way, because it does not fit your requirements or something blocks you from using it, you are left on your own if you still want to track your service status.

Guys, seriously, we are talking about writing a single file containing only one number to a configurable place at startup. It is not a big feature like exposing an XML-RPC interface for queries, just writing a file as part of the bootstrap process. I am not a full-time developer, but I know the nuts and bolts of development, and this is not a very hard thing to implement (I just do not know Java and Logstash internals well enough to make a PR for this).

And to be precise: not all service managers provide a way to query the state of the process, and even when they do, they sometimes get it wrong and report invalid or stale data. That is because not all software is written to fit every service manager's requirements, and some managers are not adaptive enough. PID files help us validate the process state against the real process list, independently of what service managers, bosses, or anything else say. That's all we want to achieve, nothing more.

IngaFeick commented 8 years ago

We need it for monit. We'll find a way to work around this, but having a pid file would be really nice and the cleanest approach.

mre commented 8 years ago

+1

AlexanderThaller commented 8 years ago

You can't expect that everybody moves to a Linux distribution which already has a modern init system (especially in enterprise environments). So having an easy way to get the PID would make life so much easier.

jordansissel commented 8 years ago

@AlexanderThaller We have an API in Logstash now that could allow this; would that help? I'm not really comfortable adding a pidfile when the best contract we can offer is that "maybe the file contains the correct pid" (because if Logstash is destroyed before shutting down, the pidfile will contain a lie).

jordansissel commented 8 years ago

I confess that "write a pidfile" is a super super simple feature and easy to implement. However, it's more complex than it seems, right? I don't like that whenever Logstash isn't running, that pidfile is a lie; that permission issues would cause it to fail to write the pidfile (and what should Logstash do if that fails?); and that it also fails if the directory doesn't exist, if the filesystem is full, if someone runs two Logstash instances with the same pidfile setting, etc.

If everyone agrees that the concerns in my previous paragraph are not meaningful concerns, then I am happy to add this pidfile option and document the cases in which it will fail or produce lies. At this point, I am willing to implement it; I just don't want to be on the hook for the known cases where the pidfile concept produces errors or lies. Let me know :)

hron84 commented 8 years ago

@jordansissel Unexpectedly stopped processes leave behind pid files that are a lie. This is how pid files work, and everyone knows this issue with them. It is not a problem: everything that uses pid files also checks whether the process actually exists. The process ID in a PID file isn't treated as a statement of the process state; it is more like a pointer to how and where you can find the service's process. This is especially important for Java-based services, because many Java processes may be running, and the executable name carries no information about the process itself. So pidof, killall and other utilities have no chance of finding the process based on the executable name. Imagine a server with both Logstash and Elasticsearch on it.

The namespace of process IDs is large enough that process ID collisions are not a real worry. The chance of another process "stealing" the ID of a dead Logstash process is very low on a production system (unless someone sets off a fork bomb, but that is not a normal case). I hope this helps you understand the underlying idea.
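
For illustration, the "pointer, then verify" usage described above can be sketched in Python. This is a minimal sketch, not code from Logstash or Monit: the helper names `read_pid` and `pid_is_alive` are hypothetical, and the check relies on the POSIX convention that signal 0 tests for existence without delivering anything.

```python
import os


def read_pid(pidfile_path):
    """Return the integer PID stored in the file, or None if it is unreadable."""
    try:
        with open(pidfile_path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_alive(pid):
    """Probe a PID with signal 0, which checks existence without signalling."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the pid file is stale
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True
```

Note that a live PID only shows that some process has that number. It does not prove the process is still Logstash, which is the reuse risk raised elsewhere in this thread.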

jordansissel commented 8 years ago

everyone knows this issue with them

I do not believe everyone knows this. We don't really need to debate this, though.

It is super important in cases of Java-based services

The jps tool is quite nice for finding java processes (it only shows java processes).

killall and other utilities

Check out pgrep and its friend pkill - they are lovely :)

The namespace of process IDs is huge enough to not fear about process ID collision

I have experienced issues in production with collisions.

Downplaying the risks isn't something I want to really discuss, because the risks are real and everyone may have different experiences :)

What I'd like to know is -- for folks interested in this feature -- are you accepting the risks that:

Process IDs are reused
pidfiles are left behind after process/machine death and can be lies
Not all scenarios can allow for safe use of pidfiles (file locks via flock(2))
A pidfile might not even be written because of file system issues (permissions, invalid directory, out of space, etc).

If you accept these risks, please do a +1 on this comment using Github's new comment reactions. If there are a few +1s, I'll agree to implement the following (I am open to changing the behavior proposal below, let me know):

Add a new flag --pid-file that lets you specify the location (directory or file) to write a pidfile
The file will contain a single number that is the process id of the main logstash process.
If a pid file cannot be written for any reason, Logstash will log this, but will not consider this a fatal error.
Logstash will not read this file, only write to it.
Logstash will write this file even if the file already exists.
Logstash will try to flock(2) the file before writing it and keep the file open (and the lock).
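
The proposed behavior can be sketched roughly as follows. This is a Python illustration under stated assumptions, not Logstash code: the function name `write_pid_file` and the logger name are hypothetical, and treating a failed lock as a logged, non-fatal failure is one reading of the proposal rather than something the thread settles.

```python
import fcntl
import logging
import os

log = logging.getLogger("pidfile-sketch")  # hypothetical logger name


def write_pid_file(path):
    """Write this process's PID to path; return the open file, or None on failure."""
    try:
        # Open read-write and create if missing, without truncating before the lock.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        handle = os.fdopen(fd, "r+")
    except OSError as error:
        log.warning("could not open pid file %s: %s", path, error)
        return None
    try:
        # Non-blocking exclusive lock; raises if another open file holds it.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        handle.seek(0)
        handle.truncate()  # overwrite any existing content, never read it
        handle.write(f"{os.getpid()}\n")
        handle.flush()
    except OSError as error:
        log.warning("could not write pid file %s: %s", path, error)
        handle.close()
        return None
    return handle  # the caller keeps this open so the lock stays held
```

Returning the open handle instead of closing it is what keeps the lock held for the life of the process; closing the file releases it.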

hron84 commented 8 years ago

The jps tool is quite nice for finding java processes (it only shows java processes).

Let's say I have a server with two different Logstash instances (I do not want to debate how good or bad an idea that is). How do you tell them apart with just jps? PID files would distinguish them anyway, because a different file means a different instance (as the config can differ per instance).

Check out pgrep and its friend pkill - they are lovely :)

I know them. But they cannot distinguish between Java processes or Logstash instances.

I have experienced issues in production with collisions.

I have experienced them too. But not very often.

A pidfile might not even be written because of file system issues (permissions, invalid directory, out of space, etc).

Permissions and directories should be checked at bootstrap, just as log file locations should be. Running out of space is a danger for log files too, but we want to keep the log file feature, right?

If a pid file cannot be written for any reason, Logstash will log this, but will not consider this a fatal error.

This would be super cool. Most services do not even report when the PID file cannot be written; they just die silently or keep running silently. This is much more than other dudes do.

Logstash will write this file even if the file already exists.

This is how other services use PID files.

Only one thing: in the shutdown phase, please delete the PID file if possible; if not, report it to the log file before shutting down the server.
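
The shutdown request above could look something like this in Python. It is a sketch only: `remove_pid_file` and `register_cleanup` are hypothetical names, and the check that the file still names the current process is an added safety assumption, not something asked for in the thread.

```python
import atexit
import logging
import os

log = logging.getLogger("pidfile-sketch")  # hypothetical logger name


def remove_pid_file(path):
    """Delete the pid file if it still names this process; log instead of raising."""
    try:
        with open(path) as handle:
            recorded = handle.read().strip()
        if recorded != str(os.getpid()):
            log.warning("pid file %s names %r, not this process; leaving it", path, recorded)
            return False
        os.remove(path)
        return True
    except OSError as error:
        log.warning("could not remove pid file %s: %s", path, error)
        return False


def register_cleanup(path):
    """Run the removal on normal interpreter exit (not on SIGKILL or power loss)."""
    atexit.register(remove_pid_file, path)
```

A hook like this cannot run when the process is killed outright, so a stale file after a crash remains possible either way.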

jordansissel commented 8 years ago

I did some digging into the init script we ship with the rpm and deb (https://github.com/elastic/logstash/blob/master/pkg/logstash.sysv). It seems to write a pidfile in /var/run -

This has shipped with Logstash since v2.0.0, I think (judging from git history). So, if you're using our init script, you've already got a pidfile in /var/run. :)

hron84 commented 8 years ago

@jordansissel Two problems:

The init script does not clean up at stop. If you check status(), it will always return 2 (program is dead, but pidfile exists). This is why it is better to handle this in the daemon itself instead of working around it in an init script, because the init script will definitely lie.
As far as I can tell from a quick walkthrough, this init script does not handle multiple instances of Logstash, nor does it handle the case where the process never actually started (because of an error); instead it blindly writes a pid to the file.
Bonus: there is a big world beyond RPM- and DEB-based systems. Also take a look at systemd, where all the distros are moving.

This is why I usually do not trust shipped init scripts. Don't get me wrong, you do awesome work with Logstash, but I have seen a lot of init scripts like this, and I have a strong impression that developers do not work with Linux itself as much as they should. I absolutely agree it is not your primary focus.

This init script will drive tools like monit crazy, because it lies about starting a process by forcing an "everything is OK" return code. It can send such tools into an infinite start loop, which is hard to debug.

If Logstash itself creates the pid file, the init script would have the following workflow:

Check whether an old instance is running (status())
If yes, exit; if not, remove the stale pidfile if there is one
Start the process
Wait a moment until the pid file appears. The same waiting loop as in stop() should work
Return the result of the check.

It would be a simple, reliable workflow instead of forking and guessing blindly.
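
The five steps above can be sketched as a small Python wrapper. This is a hedged illustration, not the shipped init script: `read_live_pid` and `start_service` are hypothetical names, the timeout values are arbitrary, and the sketch assumes the started process writes its own pid file.

```python
import os
import subprocess
import time


def read_live_pid(pidfile):
    """Return the PID from pidfile if that process exists, else None."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return None
    if pid <= 0:
        return None
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence
    except ProcessLookupError:
        return None
    except PermissionError:
        pass  # the process exists but is owned by another user
    return pid


def start_service(command, pidfile, timeout=10.0, poll=0.1):
    """Check status, clear a stale pid file, start, wait, and report."""
    if read_live_pid(pidfile) is not None:
        return "already-running"          # steps 1-2: an old instance is alive
    if os.path.exists(pidfile):
        os.remove(pidfile)                # step 2: remove the stale pid file
    subprocess.Popen(command)             # step 3: start the process
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:    # step 4: wait for the pid file
        if read_live_pid(pidfile) is not None:
            return "started"              # step 5: the check succeeded
        time.sleep(poll)
    return "failed"                       # step 5: no live pid ever appeared
```

Because the wrapper only reports success once the daemon's own pid file names a live process, a start that dies immediately shows up as a failure instead of a forced "OK".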

alfmatos commented 7 years ago

Can't get systemd to generate a pid file, so right now monit is going berserk over this. This is one of the biggest problems we've had in the transition to ELK 5.0.

hron84 commented 7 years ago

@alfmatos but monit can handle the whole start-stop routine without systemd. Not an ideal solution, but it works.

alfmatos commented 7 years ago

@hron84 Since Logstash uses a gem to generate the start/stop files, if you use /etc/init.d/logstash start/stop it creates a wrapper PID file (even though settings are ignored for things like pid file location). Using systemd will bypass this, and cause monit to become very confused.

In any case, I still think this should be implemented here, if not to ensure compliance with things like monit, then to maintain consistency between Logstash and Elasticsearch, as Elasticsearch relies on these mechanics and creates its own pid files.

frameloss commented 7 years ago

Seems like a pretty standard *nix convention to write a pid file, but OTOH it's easy enough to configure monit to look at the process list ... https://mmonit.com/monit/documentation/#Process

For example:

check process logstash matching "logstash/runner.rb"

hron84 commented 7 years ago

@frameloss Correct, but this also matches "vim ./logstash/runner.rb", which is definitely not what we want.
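
To make the ambiguity concrete, here is a small Python sketch of pattern matching over command lines. The PIDs and command lines are invented for illustration (the Java invocation in particular is hypothetical), and `matching_pids` is not a Monit function; it only mimics matching a pattern against a process table.

```python
import re

# Hypothetical process table entries: PID -> full command line.
COMMAND_LINES = {
    4021: "/usr/bin/java -Xmx1g org.jruby.Main /opt/logstash/lib/logstash/runner.rb agent",
    4388: "vim ./logstash/runner.rb",
}

# The same pattern string as in the monit example above.
PATTERN = re.compile("logstash/runner.rb")


def matching_pids(pattern, command_lines):
    """Return every PID whose full command line matches the pattern."""
    return sorted(pid for pid, cmd in command_lines.items() if pattern.search(cmd))


print(matching_pids(PATTERN, COMMAND_LINES))  # both the daemon and the editor match
```

A pid file sidesteps this because it names one process by number instead of by what its command line happens to contain.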

Smithx10 commented 6 years ago

I've been working on making ELK a bit autonomous and it amazes me how much this project's ecosystem goes out of its way to not implement normal unix behavior. For example, how do you reload your configuration? Normally the answer is SIGHUP. How do you find your PID? Normally the answer is a PID file. :(

jordansissel commented 6 years ago

Normally

"Normally" is highly subjective, so I am unable to address this.

the answer is a PID File

If you are installing Logstash via deb or rpm, this will install a service for your OS. It detects the local init system (upstart, systemd, or sysv).

If your OS uses systemd, then you can find the pid via systemctl status logstash
If your OS uses upstart, then you can find the pid via initctl status logstash
If your OS uses sysv, then you can find the pid in /var/run/logstash.pid.

Hope this helps.

how do you reload your configuration?

We have documented both automatic and signal-based reloads here: https://www.elastic.co/guide/en/logstash/current/reloading-config.html

Smithx10 commented 6 years ago

Thanks for your quick reply @jordansissel. I've tried sending it signals, and have read the following issue: https://github.com/elastic/logstash/issues/6417

Does SIGHUP only reload the pipeline configuration? For example, if I update the xpack.monitoring.elasticsearch.url array, does that reload?

Does anything in logstash.yml reload for that matter?

I noticed this with filebeat clients, that if I add a node, I have to restart the process, which is fine.

This is a bit of a 'nit' post on my part, but when given an opportunity to pile on, I always do! :P :P

That said, thank you for working on an open source project. <3

nonopoubelle commented 5 years ago

@jordansissel Facing the same problem with monit. I feel like challenging your last reply.

You make the case for a pidfile so well. In your argument, 4 sentences out of 5 begin with "if"; the subjectivity could hardly be higher :-).

Whatever OS I may use, whatever init system I may use or not, however many instances I may launch, a pidfile universally allows me to target precisely the right process. I think this is why it is "normally" a quite objective best practice ;-)

Obviously, YMMV. Those were my 2 cents :-D

hron84 commented 5 years ago

@jordansissel

you can find the pid

The only mistake you made is assuming that I, as a person, want to figure out the PID. This issue exists because I personally don't give a single skunk what the actual PID of the Logstash process is. However, there are monitoring tools that need that PID semi-dynamically and cannot extract it intelligently from various outputs. They need it in its pure form: a single integer, nothing else. No grepping, no scripting, no parsing. Only integers. We expect Logstash to write this number out reliably when the process starts, as early as possible, no matter what init system, service manager, or deus ex machina started the process. It does not matter if the running Logstash instance does not manage the file afterwards, and there is no need to lock it for even a single moment (other than the moment it takes to write out the number, of course).

shawnz commented 4 years ago

I do not believe everyone knows this.

It's true that maybe not everyone understands the intricacies of the PID file convention, but I think the point they were trying to make is that it's a convention nonetheless, including all the caveats mentioned, which are par for the course when dealing with the PID files of any service.

Please add this functionality! Many monitoring systems rely on the PID file convention and don't care about the init system you are using.

I will also note that Elasticsearch already does the "right" thing here and creates the PID file whether or not you use sysvinit.

leeclemens commented 2 years ago

In case it helps anyone stumbling upon this, you can run:

RHEL/CentOS 7: systemctl show --property MainPID logstash | sed 's/MainPID=//g'
RHEL/CentOS 8: systemctl show --property MainPID --value logstash
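
For tools that want the number programmatically, the output of either command can be reduced to an integer with a few lines of Python. This is a sketch with a hypothetical helper name, `parse_main_pid`; it assumes the output is either `MainPID=<number>` or, with `--value`, the bare number, and that systemd reports 0 when the unit has no main process.

```python
def parse_main_pid(show_output):
    """Extract a PID from `systemctl show --property MainPID` style output."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() != "MainPID":
            continue  # some other property; ignore it
        # With --value there is no "MainPID=" prefix, so the whole line is the number.
        text = (value if sep else key).strip()
        if text.isdigit():
            pid = int(text)
            return pid if pid > 0 else None  # 0 is taken to mean "not running"
    return None
```

This still depends on systemd being the service manager, which is the limitation several commenters in this thread point out.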

lifesboy commented 2 years ago

+1, I'm facing the same problem. A pid file would be more compatible within an ecosystem where other services are controlled in a consistent way.

[FEATURE] Add option to write a pid file #3577