magnusbaeck / logstash-filter-verifier

Apache License 2.0

Error while accept unix socket while trying the new unix socket option #25

Closed (matejzero closed 7 years ago)

matejzero commented 7 years ago

Hello,

I'm trying to use the sockets option merged yesterday but I keep getting the following errors:

Use Unix domain sockets.
2017/02/20 10:56:46 Error while accept unix socket: accept unix /tmp/055066706/socket: use of closed network connection
2017/02/20 10:56:46 Error while accept unix socket: accept unix /tmp/150465364/socket: use of closed network connection
2017/02/20 10:56:46 Error while accept unix socket: accept unix /tmp/776328102/socket: use of closed network connection
2017/02/20 10:56:46 Error while accept unix socket: accept unix /tmp/968651720/socket: use of closed network connection
....
Write timeout error

Even if I raise the log level to DEBUG, I don't get any useful info. I'm not sure if this is a Logstash or an LFV problem.

I'm running this on CentOS7 (selinux off) and logstash 5.2.0-1.

breml commented 7 years ago

At the moment I can't help on this one, because LFV does not yet support Logstash 5.x (see #8). Therefore I did not test this feature with Logstash 5.x.

matejzero commented 7 years ago

Yea, as you saw in #8, I already opened the thread for Logstash 5.x support and so far, I'm using LFV with Logstash 5.x without a problem.

I did try running Logstash with the same parameters that the app passes to it and it looks like it's working, at least on the client side. For some reason, the server sockets don't seem to open (if I understand the error correctly). I will try to debug some more, although I'm not really a Go coder, so it's more of a trial and error :)

breml commented 7 years ago

Maybe you could try to run LFV with --logstash-output.

breml commented 7 years ago

OK, I did a quick test with Logstash 5.2.1. The problem is in the https://github.com/logstash-plugins/logstash-input-unix plugin. If you apply the changes from https://github.com/logstash-plugins/logstash-input-unix/pull/18, it does work. I just asked the logstash team to update the gem for this plugin.

breml commented 7 years ago

The new version (3.0.3) of the plugin is out (https://rubygems.org/gems/logstash-input-unix). So now it should work with bin/logstash-plugin update logstash-input-unix (did not test it).

matejzero commented 7 years ago

I can confirm the latest LFV is working on Logstash 5.x with logstash-input-unix updated to 3.0.3. My Docker setup is already happily testing.

Test time dropped from 15 min to 1 min 30 s with @breml's PRs.

Thank you VERY much both of you.

matejzero commented 7 years ago

I'm running logstash 5.0.1 + unix plugin 3.0.3 and LFV 1.1.1 and this combo works. I also tried updating logstash to 5.4.0 and it also works.

Then I tried upgrading LFV to a newer version and it stopped working with the same error we see above:

Use Unix domain sockets.
2017/06/03 17:18:04 Error while accept unix socket: accept unix /tmp/279930180/socket: use of closed network connection
2017/06/03 17:18:04 Error while accept unix socket: accept unix /tmp/522772246/socket: use of closed network connection
2017/06/03 17:18:04 Error while accept unix socket: accept unix /tmp/601451884/socket: use of closed network connection
2017/06/03 17:18:04 Error while accept unix socket: accept unix /tmp/748713656/socket: use of closed network connection

I tried LFV 1.2.0, 1.2.1 and 1.3.0, and none of them work. I also tried different versions of Logstash (5.0.1, 5.2.1, 5.4.0) with the unix input plugin at 3.0.2 and 3.0.3, and none of those combinations work either.

What could be the reason?

magnusbaeck commented 7 years ago

LFV has various known incompatibilities with Logstash 5.x so I'm surprised that you're able to get this far. I'm going to be out of town for a couple of days but I can look at it later in the week.

magnusbaeck commented 7 years ago

I just tried LFV's master branch (which is just a couple of commits ahead of 1.3.0) with Logstash 5.4.1 (unpacked tarball) and it worked fine with --sockets:

$ ~/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/logstash-filter-verifier roles/logstash-hub/templates/logstash/tests roles/logstash-hub/templates/logstash/*-filter.conf.j2 --sockets --logstash-path ~/logstash/logstash-5.4.1/bin/logstash
Use Unix domain sockets.
Comparing message 1 of 2 from client-decoration.json...
Comparing message 2 of 2 from client-decoration.json...
Comparing message 1 of 2 from host-decoration.json...
Comparing message 2 of 2 from host-decoration.json...
Comparing message 1 of 2 from http-access.json...
Comparing message 2 of 2 from http-access.json...
Comparing message 1 of 2 from syslog.json...
Comparing message 2 of 2 from syslog.json...

matejzero commented 7 years ago

I'm running inside a Docker container, so maybe that could be the reason for the problems... I will try tomorrow on a bare server just to rule out Docker.

magnusbaeck commented 7 years ago

@matejzero, did you ever dig further into this?

matejzero commented 7 years ago

No I didn't. I have some time available today and will give it a try.

Sometimes I get the same error on 1.1.1 as well, but that is maybe in 2-5% of all cases.

matejzero commented 7 years ago

I did some more testing, but nothing concrete yet... Will work on it at home.

matejzero commented 7 years ago

I think I finally found my problem. I was using a fork of LFV 1.1.1 with a few cherry-picked PRs and a raised, hardcoded socket-timeout value. When I upgraded to 1.3.0, it went back to the default timeout of 60s (since I wasn't using --socket-timeout), and Logstash failed to start within such a short time.

I now raised socket-timeout to 300s and it seems like it works. My Logstash unfortunately takes around 2-3 minutes to start, so I need a high socket-timeout.

magnusbaeck commented 7 years ago

Thanks for confirming!