t0xa / gelfj

Graylog Extended Log Format (GELF) implementation in Java and log4j appender without any dependencies.
https://github.com/t0xa/gelfj/wiki
Other
185 stars 117 forks source link

error in sending GELF messages to logstash #87

Closed swethagaddam closed 9 years ago

swethagaddam commented 9 years ago

Hi, I am using gelfj module in my application log4j.xml file to sending application logs to logstash which are in turn getting indexed in Elasticsearch.

I have observed that at times when my application starts, I see below error in catalina.out. If I restart my application, GELF initializes and i don't see the error. This is happening randomly. Could you help me in figuring out what could be the issue?

We are using gelf-1.1.7.jar and sending logs to remote logstash server.

Error: log4j:ERROR Could not send GELF message

swethagaddam commented 9 years ago

Hi, could anyone respond to the query? I have been facing this issue randomly. I observed that the error occurs when I stop the logstash instance. But, when I start it back the application doesn't seem to resume sending logs.

Thanks Swetha

t0xa commented 9 years ago

Hello Swetha,

Apologies for delay. Could you please share you log4j xml?

I can try and enable more logging around "Could not send GELF message" so we can try and debug. Are you willing to update gelf 1.1.7 to newer version and test that?

A.

swethagaddam commented 9 years ago

Hi Anton,

Here is the Gelf appender related config in log4j.xml

I am fine to update to the new version, but before that can we enable more logging so that we can figure out what is causing the issue randomly?

Thanks Swetha

On Tue, Apr 21, 2015 at 1:44 AM, Anton Yakimov notifications@github.com wrote:

Hello Swetha,

Apologies for delay. Could you please share you log4j xml?

I can try and enable more logging around "Could not send GELF message" so we can try and debug. Are you willing to update gelf 1.1.7 to newer version and test that?

A.

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-94557477.

swethagaddam commented 9 years ago

Hi Anton,

This issue is happening in production and we cannot afford to restart the production applications. Could you help in enabling more logging to figure out the root cause of the issue?

Thanks

t0xa commented 9 years ago

Hi Swetha,

At the moment no logging is available thats why I propsed new version where it will be.

I will be able to publish new version around 20:00 gmt today.

Anton On 23 Apr 2015 13:49, "swethagaddam" notifications@github.com wrote:

Hi Anton,

This issue is happening in production and we cannot afford to restart the production applications. Could you help in enabling more logging to figure out the root cause of the issue?

Thanks

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-95537569.

swethagaddam commented 9 years ago

Thanks Anton. Let me know when you publish the new version so that I can update the jar in my environment to troubleshoot further.

t0xa commented 9 years ago

Hi,

I've published version 1.1.11 to maven central. It should be available shortly.

Please update your pom.xml dependencies:

 <dependency>
    <groupId>org.graylog2</groupId>
    <artifactId>gelfj</artifactId>
    <version>1.1.11</version>
    <scope>compile</scope>
  </dependency>

Try to repeat your test case. This time more info should be available in catalina log.

A.

t0xa commented 9 years ago

Hey Swetha,

Have you tried new version?

You can also can try to switch from udp to tcp by changing graylogHost to {{tcp:log1}}. You will also need to enable source on the server.

<appender name="ELK_APPENDER" class="org.graylog2.log.GelfAppender">
        <param name="graylogHost" value="tcp:log1" />
        <param name="graylogPort" value="5521" />
        <param name="facility" value="${catalina.base}" />
        <param name="extractStacktrace" value="true" />
        <param name="additionalFields"
value="{'customer':'${machine.customer}',
'stability':'${machine.stability}','appname':'${machine.appname}'}" />
        <param name="Threshold" value="INFO" />
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="[ %d [%t] %x
%C{2}.%M():%L %-5p]: %m%n" />
        </layout>
    </appender>
swethagaddam commented 9 years ago

Hi Anton,

I have used the new version of gelfj.jar in some of our applications and it is working fine. I will update you if we face the issue again.

w.r.t tcp, how will the application work if the remote host is not responding? Will it queue the logs or will it crash the application?

Thanks Sweth

t0xa commented 9 years ago

At the moment TCP doesn't retry failed call. This means you will lose log lines that cannot be send. Only AMQP transport resends message configurable amount of time. I would recommend you use AMQP transport for production critical application to ensure log delivery.

swethagaddam commented 9 years ago

Hi Anton,

I tried the new gelfj.jar (v1.1.11) to send the logs to our ELK cluster. After restarting the application, it could send logs to ELK for first 45 min before it died. FYI, I am using UDP protocol to send logs to remote logstash server. Could you help in debugging this further? The error message logged in catalina.out is below:

log4j:ERROR Error during sending GELF message. Error code: -1. java.nio.channels.ClosedChannelException at sun.nio.ch.DatagramChannelImpl.ensureOpen(DatagramChannelImpl.java:335) at sun.nio.ch.DatagramChannelImpl.write(DatagramChannelImpl.java:609) at org.graylog2.GelfUDPSender.sendDatagrams(GelfUDPSender.java:51) at org.graylog2.GelfUDPSender.sendMessage(GelfUDPSender.java:40) at org.graylog2.log.GelfAppender.append(GelfAppender.java:224) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:856) at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176) at deshaw.moss.service.dao.infra.DatabaseConnectionInterceptor.invoke(DatabaseConnectionInterceptor.java:206) at sun.reflect.GeneratedMethodAccessor251.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483)

t0xa commented 9 years ago

Hi Sweth,

I think I found bug and fixed @ 15842b8b7d0950cbf2d67dbe028fe2e4fb328122. Could you please use v1.1.12 and test?

Anton

swethagaddam commented 9 years ago

Hi Anton,

Thanks for the fix.

I would like to know what caused the issue here. Could you help in understanding the issue?

Thanks Swetha

On Tue, May 5, 2015 at 5:18 PM, Anton Yakimov notifications@github.com wrote:

Hi Sweth,

I think I found bug and fixed @ 15842b8 https://github.com/t0xa/gelfj/commit/15842b8b7d0950cbf2d67dbe028fe2e4fb328122.

Could you please use v1.1.12 and test?

Anton

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99056533.

t0xa commented 9 years ago

Looking at exception UDP Datagram channel got somehow closed. I assume this happened due to your server restart. The fix is simple, before sending the message it checks if channel is open if not it tries to reopen it.

See test https://github.com/t0xa/gelfj/blob/15842b8b7d0950cbf2d67dbe028fe2e4fb328122/src/test/java/org/graylog2/GelfUDPSenderTest.java

On Tue, May 5, 2015 at 2:59 PM, swethagaddam notifications@github.com wrote:

Hi Anton,

Thanks for the fix.

I would like to know what caused the issue here. Could you help in understanding the issue?

Thanks Swetha

On Tue, May 5, 2015 at 5:18 PM, Anton Yakimov notifications@github.com wrote:

Hi Sweth,

I think I found bug and fixed @ 15842b8 < https://github.com/t0xa/gelfj/commit/15842b8b7d0950cbf2d67dbe028fe2e4fb328122 .

Could you please use v1.1.12 and test?

Anton

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99056533.

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99059286.

Vriendelijke groeten, Anton Yakimov

swethagaddam commented 9 years ago

Thanks Anton.

Could you let me know the location of the new jar with the fix? It would be great if you could upload it to any maven repository.

Thanks Swetha On May 5, 2015 5:34 PM, "Anton Yakimov" notifications@github.com wrote:

Looking at exception UDP Datagram channel got somehow closed. I assume this happened due to your server restart. The fix is simple, before sending the message it checks if channel is open if not it tries to reopen it.

See test

https://github.com/t0xa/gelfj/blob/15842b8b7d0950cbf2d67dbe028fe2e4fb328122/src/test/java/org/graylog2/GelfUDPSenderTest.java

On Tue, May 5, 2015 at 2:59 PM, swethagaddam notifications@github.com wrote:

Hi Anton,

Thanks for the fix.

I would like to know what caused the issue here. Could you help in understanding the issue?

Thanks Swetha

On Tue, May 5, 2015 at 5:18 PM, Anton Yakimov notifications@github.com wrote:

Hi Sweth,

I think I found bug and fixed @ 15842b8 <

https://github.com/t0xa/gelfj/commit/15842b8b7d0950cbf2d67dbe028fe2e4fb328122

.

Could you please use v1.1.12 and test?

Anton

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99056533.

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99059286.

Vriendelijke groeten, Anton Yakimov

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99060182.

t0xa commented 9 years ago

Its already done. You need to wait around 1 hr before it is synced to central maven. Version name is 1.1.12. On 5 May 2015 15:12, "swethagaddam" notifications@github.com wrote:

Thanks Anton.

Could you let me know the location of the new jar with the fix? It would be great if you could upload it to any maven repository.

Thanks Swetha On May 5, 2015 5:34 PM, "Anton Yakimov" notifications@github.com wrote:

Looking at exception UDP Datagram channel got somehow closed. I assume this happened due to your server restart. The fix is simple, before sending the message it checks if channel is open if not it tries to reopen it.

See test

https://github.com/t0xa/gelfj/blob/15842b8b7d0950cbf2d67dbe028fe2e4fb328122/src/test/java/org/graylog2/GelfUDPSenderTest.java

On Tue, May 5, 2015 at 2:59 PM, swethagaddam notifications@github.com wrote:

Hi Anton,

Thanks for the fix.

I would like to know what caused the issue here. Could you help in understanding the issue?

Thanks Swetha

On Tue, May 5, 2015 at 5:18 PM, Anton Yakimov < notifications@github.com> wrote:

Hi Sweth,

I think I found bug and fixed @ 15842b8 <

https://github.com/t0xa/gelfj/commit/15842b8b7d0950cbf2d67dbe028fe2e4fb328122

.

Could you please use v1.1.12 and test?

Anton

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99056533.

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99059286.

Vriendelijke groeten, Anton Yakimov

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99060182.

— Reply to this email directly or view it on GitHub https://github.com/t0xa/gelfj/issues/87#issuecomment-99063049.

swethagaddam commented 9 years ago

Thanks a lot Anton. Will let you know how this works!

t0xa commented 9 years ago

Hi. Any updates?