Closed fupgang closed 2 years ago
In addition to a configurable socket timeout (thanks for the PR!) you can also configure logbacks AsyncAppender not to block.
We are ok with logback's AsyncAppender discarding TRACE, DEBUG and INFO messages when busy. But we don't want to loose our ERRORs, therefore we still use blocking mode. We increased the default queue size instead.
After reading https://stackoverflow.com/a/1338946 I'm wondering if the PR really helps regarding this issue as the socket timeout only seem to handle blocks for read operations not write operations.
The above stacktrace shows the blocking read operation that happens during the tls handshake. As I understand it, it helps.
This is relevant for GelfTcpTlsAppender only, as I wrote in #63:
Perhaps the socket timeout should be implemented in GelfTcpTlsAppender instead. I'm not sure, if it makes sense for a non-tls appender, since reading the socket appears when performing the tls handshake.
You're right. The PR circumvents this exact problem (from your stack trace). I think it is okay to implement the socket timeout in the TcpAppender class – it a) doesn't hurt and b) allows for an internal change to extend its use case to write operations. And the latter is exactly what I'm thinking about (but failed to express earlier). Maybe I should switch over to NIO in order to have a socket timeout for writer operations as well...
We are using logback-gelf for a while and are pretty happy with it. The following rare error happened only once. It caused our application to freeze while all threads were blocked trying to send a log message.
We are using a GelfTcpTlsAppender. When establishing the connection, the socket is read for the tls handshake. For some reason the tls handshake did never finish successfully. Since no socket timeout is configured, the appender keeps waiting forever:
Of course the network error is out of logback-gelf's scopes and fortunately it seems to be a rare edge case. We would appreciate it, if the GelfTcpTlsAppender's socket timeout can be configured, like it is already possible for the connection timeout.
Thank you!