Appender Intermittently Stops Sending Messages

t0xa / gelfj

Graylog Extended Log Format (GELF) implementation in Java and log4j appender without any dependencies.

https://github.com/t0xa/gelfj/wiki


Appender Intermittently Stops Sending Messages #46

Closed ghost closed 11 years ago

ghost commented 11 years ago

Hello, I'm running the latest code (as of 31 January 2013), and I've noticed that our hosts using this appender will stop sending logs to graylog2 intermittently. Once this occurs, only a restart of the service fixes the problem. This is particularly bad for our production hosts, where a restart is simply not possible most times. I've not been able to track down the root cause, but I will comment on this issue if I get any more details.

t0xa commented 11 years ago

Hi,

Sorry about the issue. Could you please give me a bit more information:

Which JRE version are you running?
Which graylog2 server version are you using?
Approximately how much time passes between hangups?
What are the client platform and target platform?

I'll try to reproduce the issue.

Thanks

ghost commented 11 years ago

$ java -version
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

Graylog2 version 0.10.0-rc3

Time between hangups is (estimating here) maybe 10 hours. It seems like it could be related to the graylog2 "stack" breaking. For instance, my Elasticsearch cluster dies a painful death, graylog2 freaks out at the sight of the ES corpse, and then when I bring everything back up I don't receive any new log events.

I'm not sure what you mean by 'Client platform and target platform', but I'll take a stab at it :-) The applications are running on the aforementioned JRE on Ubuntu 12.04 LTS servers. We generate a high volume of logging events in the application (on the order of 10-50 per second).

Could the Graylog2 server going silent possibly break the gelfj appender? I'm not too keen on the internals of log4j (do automatic restarts happen for broken appenders?).

Thanks for the speedy response!

ghost commented 11 years ago

It's worth noting that once everything has recovered, I can insert messages into graylog from the application hosts (using a quick ruby script for instance).
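For anyone wanting to run the same kind of check from the JVM side rather than with a Ruby script, here is a minimal sketch of a standalone GELF-over-UDP probe in plain Java. It is not gelfj code. The class name `GelfUdpProbe`, the field values, and the example host are assumptions made for illustration, and it assumes Java 7 or newer and a Graylog input that accepts gzip-compressed GELF on UDP.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for manual testing; not part of gelfj.
public class GelfUdpProbe {

    // Escapes only backslashes and double quotes, which is enough for simple probe text.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a minimal GELF JSON document; version and level values are illustrative.
    static String buildPayload(String host, String shortMessage) {
        String timestamp = String.format(Locale.ROOT, "%.3f", System.currentTimeMillis() / 1000.0);
        return "{\"version\":\"1.0\","
                + "\"host\":\"" + escape(host) + "\","
                + "\"short_message\":\"" + escape(shortMessage) + "\","
                + "\"timestamp\":" + timestamp + ","
                + "\"level\":6}";
    }

    // Gzip-compresses the UTF-8 bytes of the payload.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Sends one small (unchunked) GELF message as a single UDP datagram.
    static void send(String graylogHost, int port, String host, String shortMessage) throws IOException {
        byte[] data = gzip(buildPayload(host, shortMessage));
        try (DatagramSocket socket = new DatagramSocket()) {
            InetAddress address = InetAddress.getByName(graylogHost);
            socket.send(new DatagramPacket(data, data.length, address, port));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java GelfUdpProbe <graylog-host> <port>");
            return;
        }
        send(args[0], Integer.parseInt(args[1]), "app-host-01", "gelf probe from Java");
    }
}
```

Running it as `java GelfUdpProbe graylog.example.com 12201` (hypothetical host; 12201 is the port commonly used for GELF inputs) from an affected application host would show whether a fresh socket in a fresh JVM can reach the server while the long-running appender cannot.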

ghost commented 11 years ago

I'm actually not convinced this is a problem with the gelfj appender. I've swapped to a different log4j -> gelf appender and I'm still seeing this same behavior. I have more digging to do... closing this issue.

laghoule commented 10 years ago

influenza, have you found the source of your problem? I have the same issue :/

t0xa commented 10 years ago

Hi,

I can't immediately reproduce the issue. My assumption is that something is fishy with the UDP / TCP handling. Is it possible for you to try the AMQP transport?

Anton

On 19 Sep 2013 16:00, "Pascal Gauthier" wrote:

influenza, have you found the source of your problem? I have the same issue :/

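To make the UDP / TCP suspicion concrete: one way a TCP-based sender could go quiet after a server outage is by holding on to a socket that died while the server was down. Whether gelfj behaves this way was never established in this thread. The sketch below is a hypothetical `ReconnectingTcpSender` (not gelfj's API) showing the defensive pattern: drop the socket on any I/O error so the next call opens a fresh connection. It assumes Java 7 or newer and the null-byte message delimiter documented for Graylog's GELF TCP input.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of reconnect-on-failure; not taken from gelfj.
public class ReconnectingTcpSender {
    private static final int CONNECT_TIMEOUT_MS = 2000;

    private final String host;
    private final int port;
    private Socket socket; // null whenever there is no usable connection

    public ReconnectingTcpSender(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Sends one message and returns true on success. On any I/O error the
    // socket is discarded, so the next call attempts a fresh connection
    // instead of reusing a dead one.
    public synchronized boolean send(String message) {
        try {
            if (socket == null || socket.isClosed()) {
                socket = new Socket();
                socket.connect(new InetSocketAddress(host, port), CONNECT_TIMEOUT_MS);
            }
            OutputStream out = socket.getOutputStream();
            out.write(message.getBytes(StandardCharsets.UTF_8));
            out.write(0); // null byte marks the end of the message
            out.flush();
            return true;
        } catch (IOException e) {
            closeQuietly();
            return false;
        }
    }

    public synchronized void close() {
        closeQuietly();
    }

    private void closeQuietly() {
        if (socket != null) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do while discarding a broken socket
            }
            socket = null;
        }
    }
}
```

A caller would create one sender per target, for example `new ReconnectingTcpSender("graylog.example.com", 12201)` with a hypothetical host, and treat a `false` return as "message dropped, will retry the connection next time". Note that a write to a half-dead TCP connection can still appear to succeed once before the error surfaces, so this pattern limits the outage rather than guaranteeing delivery.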

ghost commented 10 years ago

I no longer work on the project that was using Graylog2 - sorry!