Open jdreyesp opened 6 years ago
@jdreyesp from the error NotSslRecordException it smells like curl is sending a non-TLS HTTP message. Can you try changing your scheme in your curl to be https://localhost:8080?
Hi @mosesn, that's exactly the cause of the issue: if you consume the server through HTTP instead of HTTPS, a big warning is logged and it cannot be disabled or configured.
I've been investigating and this is managed by ChannelStatsHandler. It seems like by default it logs everything that is not an IOException/TimeoutException/Failure at WARNING level:
override def exceptionCaught(ctx: ChannelHandlerContext, cause: Throwable): Unit = {
  ...
  case _: IOException => Level.FINE
  case _: TimeoutException => Level.FINE
  case f: Failure => f.logLevel
  case _ => Level.WARNING
  ...
Is there a way to override the ChannelPipeline (managed by netty4) during server creation to set a custom ChannelStatsHandler and avoid this warning?
Thanks in advance!
@jdreyesp I'm not sure I understand why you want to suppress this warning, which seems useful to me. You can't consume the server without TLS, so I don't know why your clients are trying so hard to do it. Can you elaborate on your use case?
Hi @mosesn, thanks again for your response. We work as a framework team and we're making a non-backward-compatible change, updating this Finagle server from non-secure to secure. Since this server is consumed by polling tools (Prometheus in our case), we're expecting a large number of calls during this reconfiguration period. That's why we want to be able to disable this warning, so we don't flood our logs.
I was just wondering if there's an easy way to toggle these warnings off for our server.
Thanks
I see. You could reduce the log level for this specific component until everything migrates over in your logging configuration (like logback, or jul), but I think the better solution might be to serve traffic for both HTTP and HTTPS, and then cut the HTTP port later. This may simplify your migration story too, since clients won't need to figure out how to speak TLS immediately.
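For example, a minimal sketch of that dual-port setup, reusing the httpService and sslContext names from the reproduction steps at the bottom of this issue (ports and the exact tls overload are illustrative and depend on your Finagle version):

import com.twitter.finagle.Http
import com.twitter.util.Await

// Plain-HTTP listener, kept only for the migration window.
val plainServer = Http.server.serve(":8080", httpService)

// TLS listener on a separate port; clients move here over time.
val tlsServer = Http.server
  .withTransport.tls(sslContext)
  .serve(":8443", httpService)

// Once every client speaks TLS, retire the plain listener.
Await.result(plainServer.close())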
Hi @mosesn !
This is not only a migration thing; in general, attackers can try to flood server logs with plain HTTP requests (even when the server is configured with TLS).
This could create performance issues, for example.
Disabling Finagle logging is not an option, since the warning is logged under com.twitter.finagle, and turning that off means turning off the whole Finagle logger.
Thanks
@jdreyesp shouldn't it be logged under com.twitter.finagle.netty4.channel.ChannelStatsHandler? That's the name we create the logger under in ChannelStatsHandler. https://github.com/twitter/finagle/blob/develop/finagle-netty4/src/main/scala/com/twitter/finagle/netty4/channel/ChannelStatsHandler.scala#L16
Hi @mosesn, that was my first try, but it didn't work. Apparently some class at a higher package level is still logging this warning.
Hi @mosesn, any updates on this? Thanks
@jdreyesp did you find out if another location is also logging this message? I only see it happening from this logger.
One possible gotcha is that the classname for objects has a trailing dollar sign — "com.twitter.finagle.netty4.channel.ChannelStatsHandler$".
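For instance, if Finagle's util-logging (backed by java.util.logging) is in play, raising the threshold for just that logger looks roughly like the sketch below; with an slf4j/logback setup (as in the log lines further down) the equivalent is a logger entry for the same name in the logging configuration:

import java.util.logging.{Level, Logger}

// Raise only ChannelStatsHandler's threshold; other Finagle loggers keep their levels.
// The trailing "$" comes from the Scala object's runtime class name.
val statsLogger = Logger.getLogger("com.twitter.finagle.netty4.channel.ChannelStatsHandler$")
statsLogger.setLevel(Level.SEVERE)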
Hi!
No, I found the logger only in the location you mention, but the exception is also logged in other locations.
Indeed, that avoids the ChannelStatsHandler one (thanks :)), but there are two more, Netty-related:
08:53:35.932 [finagle/netty4-1] WARN com.twitter.finagle - Unhandled exception in connection with /127.0.0.1:58886, shutting down connection | io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: ....
08:53:35.932 [finagle/netty4-1] WARN i.n.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception. | io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: ...
I don't want to get rid of com.twitter.finagle nor DefaultChannelPipeline, since I see them as important loggers in other cases. But I don't want my log flooded by plain HTTP calls.
Yeah, I see. And I think your more general point stands — we probably should not be logging at warning level for bad user input.
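In the meantime, one stop-gap that keeps both loggers active is a per-logger filter that drops only records caused by NotSslRecordException — a sketch, assuming java.util.logging is the backend (with logback a TurboFilter would play the same role):

import java.util.logging.{Filter, LogRecord, Logger}

// Drop only records whose cause chain contains a NotSslRecordException; pass everything else.
object NotSslRecordFilter extends Filter {
  private def hasNotSslRecordCause(t: Throwable): Boolean =
    t != null &&
      (t.getClass.getName.endsWith("NotSslRecordException") || hasNotSslRecordCause(t.getCause))

  override def isLoggable(record: LogRecord): Boolean =
    !hasNotSslRecordCause(record.getThrown)
}

// Attach to the loggers that currently emit the warning; the
// io.netty.channel.DefaultChannelPipeline logger would need the same treatment.
Logger.getLogger("com.twitter.finagle").setFilter(NotSslRecordFilter)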
Fair enough.
Please let me know what your final decision on this is for the next Finagle version!
Thanks for the support!
Fwiw, I think we'd like some help to fix this, as it isn't a pressing issue for us right now. Any chance you'd be interested in starting to work on it? I'm sure we can find you help if you get the ball rolling with a PR.
In addition to making this an info-level or lower message, also consider suppressing the stack trace. It adds no value in this message.
I had a branch which addressed this specific exception a few months back, but decided against shipping it at the time, as I wasn't sure that where I was dealing with the exception was appropriate in all cases. I'll probably take another stab at this as time allows. At the very least we should be able to suppress the stack trace, as @spockz suggests.
I have a pretty bad performance issue that seems to be related to this.
Subject alternative name missing in SSL:
2020-06-11 12:41:08,508 WARN [finagle/netty4-2-7] i.n.c.DefaultChannelPipeline [DefaultChannelPipeline.java:1164] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General OpenSslEngine problem
Caused by: javax.net.ssl.SSLHandshakeException: General OpenSslEngine problem
Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching foo.bar.no found.
@kevinoliver or @spockz any pointers on where I could take a pass at fixing this? I'd either like to suppress the error, or better yet add a handler for this problem. We see this sort of error a lot as our internal security folks scan for hosts that will accept obsolete cipher suites which causes us problems.
For the original issue I would suggest looking at the log lines in com.twitter.finagle.netty4.channel.ChannelStatsHandler to suppress stack traces and/or reduce the log level for these known exceptions caused by "client errors". I don't have a concrete list of known exceptions so this might turn into a game of whack-a-mole.
When doing this it might be worthwhile to verify that these exceptions are still counted in metrics, though, so connection issues can still be troubleshot.
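A sketch of the kind of change being suggested here, not the shipped Finagle code: demote this known client-caused error in the severity match quoted above (in practice the NotSslRecordException arrives wrapped in a DecoderException, so the cause chain would need unwrapping first):

import java.io.IOException
import java.util.concurrent.TimeoutException
import java.util.logging.Level
import com.twitter.finagle.Failure
import io.netty.handler.ssl.NotSslRecordException

// Map exceptions seen in exceptionCaught to a log level; known "client errors" stay quiet.
// For the FINE-level cases the message alone could be logged, omitting the stack trace.
def severity(cause: Throwable): Level = cause match {
  case _: NotSslRecordException => Level.FINE // plain HTTP sent to a TLS port
  case _: IOException => Level.FINE
  case _: TimeoutException => Level.FINE
  case f: Failure => f.logLevel
  case _ => Level.WARNING
}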
I don't have a concrete list of known exceptions so this might turn into a game of whack-a-mole.
Thanks, that's very fair.
Is there a way for me to extend the Netty pipeline to catch these (or similar) errors? I took a brief look through the docs but didn't see an obvious answer.
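For what it's worth, in raw Netty terms catching these errors would look roughly like the handler below; whether Finagle exposes a supported hook for installing such a handler into its server pipeline is exactly the open question here, so this is illustrative only:

import io.netty.channel.{ChannelHandlerContext, ChannelInboundHandlerAdapter}
import io.netty.handler.codec.DecoderException
import io.netty.handler.ssl.NotSslRecordException
import javax.net.ssl.SSLHandshakeException

// Quietly close connections that fail the TLS handshake (plain HTTP, bad certs,
// obsolete cipher suites); let every other exception continue down the pipeline.
class QuietTlsErrorHandler extends ChannelInboundHandlerAdapter {
  override def exceptionCaught(ctx: ChannelHandlerContext, cause: Throwable): Unit =
    cause match {
      case d: DecoderException
          if d.getCause.isInstanceOf[NotSslRecordException] ||
            d.getCause.isInstanceOf[SSLHandshakeException] =>
        ctx.close()
      case other =>
        ctx.fireExceptionCaught(other)
    }
}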
Finagle version: 18.6.0
Netty logs a 'warn'-level message when a TLS server is consumed over plain HTTP. This can't be disabled and generates a lot of output that can potentially flood application logs in some cases.
Expected behavior
This should be handled by Finagle since the service is configured using TLS, or at least be configurable when creating a Finagle server with TLS transport.
Actual behavior
When a TLS server is called using HTTP, netty logs:
Steps to reproduce the behavior
val server = Http.Server().withTransport.tls(sslContext)
server.serve("localhost:8080", httpService)