Closed flozano closed 7 years ago
-r--r-----. 1 root wheel 269731 May 16 13:53 netty-buffer-4.1.11.Final.jar
-r--r-----. 1 root wheel 311824 May 16 13:53 netty-codec-4.1.11.Final.jar
-r--r-----. 1 root wheel 367915 May 16 13:52 netty-codec-http2-4.1.11.Final.jar
-r--r-----. 1 root wheel 550549 May 16 13:53 netty-codec-http-4.1.11.Final.jar
-r--r-----. 1 root wheel 119730 May 16 15:12 netty-codec-socks-4.1.11.Final.jar
-r--r-----. 1 root wheel 532341 May 16 13:53 netty-common-4.1.11.Final.jar
-r--r-----. 1 root wheel 353876 May 16 13:53 netty-handler-4.1.11.Final.jar
-r--r-----. 1 root wheel 20743 May 16 15:16 netty-handler-proxy-4.1.11.Final.jar
-r--r-----. 1 root wheel 32233 May 16 13:53 netty-resolver-4.1.11.Final.jar
-r--r-----. 1 root wheel 2393039 Apr 26 15:35 netty-tcnative-boringssl-static-2.0.1.Final.jar
-r--r-----. 1 root wheel 915772 Apr 26 15:35 netty-tcnative-boringssl-static-2.0.1.Final-linux-x86_64.jar
-r--r-----. 1 root wheel 432522 May 16 13:53 netty-transport-4.1.11.Final.jar
-r--r-----. 1 root wheel 102351 May 25 08:00 pushy-0.10.jar
uname -a
Linux myserver.com 2.6.32-573.18.1.el6.x86_64 #1 SMP Tue Feb 9 22:46:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
We have lot of handshake errors, maybe that could be related:
[2017-06-09T04:08:27.309Z] [WARN] [nioEventLoopGroup-2-3] io.netty.handler.ssl.ApplicationProtocolNegotiationHandler:? [id: 0x869dade6, L:/172.31.17.75:43440 - R:api.development.push.apple.com/17.188.138.73:443]
TLS handshake failed:
javax.net.ssl.SSLHandshakeException: error:10000438:SSL routines:OPENSSL_internal:TLSV1_ALERT_INTERNAL_ERROR
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.shutdownWithError(ReferenceCountedOpenSslEngine.java:751)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.sslReadErrorResult(ReferenceCountedOpenSslEngine.java:959)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:914)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:978)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1021)
at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:205)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1156)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1078)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:745)
[2017-06-09T04:08:27.310Z] [INFO] [nioEventLoopGroup-2-3] com.turo.pushy.apns.ApnsClient:? Failed to connect.
java.lang.IllegalStateException: Channel closed before HTTP/2 preface completed.
at com.turo.pushy.apns.ApnsClient$3.operationComplete(ApnsClient.java:395)
at com.turo.pushy.apns.ApnsClient$3.operationComplete(ApnsClient.java:387)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1058)
at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:686)
at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:664)
at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:607)
at io.netty.channel.DefaultChannelPipeline$HeadContext.close(DefaultChannelPipeline.java:1276)
at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:624)
at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:608)
at io.netty.handler.ssl.SslHandler$8.operationComplete(SslHandler.java:1657)
at io.netty.handler.ssl.SslHandler$8.operationComplete(SslHandler.java:1646)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:163)
at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:93)
at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:28)
at io.netty.handler.ssl.SslHandler.safeClose(SslHandler.java:1646)
at io.netty.handler.ssl.SslHandler.closeOutboundAndChannel(SslHandler.java:1457)
at io.netty.handler.ssl.SslHandler.close(SslHandler.java:613)
at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:624)
at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:608)
at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:465)
at io.netty.handler.ssl.ApplicationProtocolNegotiationHandler.handshakeFailure(ApplicationProtocolNegotiationHandler.java:116)
at io.netty.handler.ssl.ApplicationProtocolNegotiationHandler.userEventTriggered(ApplicationProtocolNegotiationHandler.java:94)
at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329)
at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315)
at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:307)
at io.netty.handler.ssl.SslUtils.notifyHandshakeFailure(SslUtils.java:246)
at io.netty.handler.ssl.SslHandler.notifyHandshakeFailure(SslHandler.java:1415)
at io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1406)
at io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1380)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1088)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:745)
This sounds a little like #425, but more conveniently, it also it sounds a LOT like this problem for which a fix was released about an hour ago:
These releases update to netty-tcnative 2.0.3.Final which fixes a small native memory leak which happened when you destroy an OpenSSLContext. So if you create a lot and destroy a lot of these its important to update.
I'll put out a new version of Pushy shortly, but in the meantime, I suspect that using the latest version of Netty/netty-tcnative should solve your problem.
OK, thanks for the info! we're going to rebuild/redeploy with the newer netty and see how it goes.
with netty 4.1.12 and tcnative 2.0.3, we reproduced:
io.netty.util.ResourceLeakDetector:? LEAK: ReferenceCountedOpenSslContext.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 0
Created at:
io.netty.util.ResourceLeakDetector.track(ResourceLeakDetector.java:237)
io.netty.handler.ssl.ReferenceCountedOpenSslContext.<init>(ReferenceCountedOpenSslContext.java:239)
io.netty.handler.ssl.ReferenceCountedOpenSslContext.<init>(ReferenceCountedOpenSslContext.java:219)
io.netty.handler.ssl.ReferenceCountedOpenSslClientContext.<init>(ReferenceCountedOpenSslClientContext.java:59)
io.netty.handler.ssl.SslContext.newClientContextInternal(SslContext.java:781)
io.netty.handler.ssl.SslContextBuilder.build(SslContextBuilder.java:446)
com.turo.pushy.apns.ApnsClientBuilder.build(ApnsClientBuilder.java:417)
To be clear, is this happening as soon as you build a client?
I'm not sure, as in the same thread and in almost same time, we shutdown the 'old' client and create a new one. This is an edited 'log' from the component that shows the leak alert:
[2017-06-12T06:36:06.429Z] [INFO] [EventProcessing-8-push] com.mycompany.MyAPNSClientCode:? Sending push message to APNS server
[2017-06-12T06:36:06.430Z] [INFO] [EventProcessing-8-push] com.mycompany.MyAPNSClientCode:? Certificate changed (development=true, oldVersion=26176, newVersion=26186)
[2017-06-12T06:36:06.430Z] [INFO] [EventProcessing-8-push] com.mycompany.MyAPNSClientCode:? Releasing APNS client (development=true, version=26176, hadProblems=false)
[2017-06-12T06:36:06.430Z] [INFO] [EventProcessing-8-push] com.mycompany.MyAPNSClientCode:? Shutting down APNS client (development=true)
[2017-06-12T06:36:06.430Z] [INFO] [EventProcessing-8-push] com.turo.pushy.apns.ApnsClient:? Disconnecting.
[2017-06-12T06:36:06.430Z] [INFO] [EventProcessing-8-push] com.mycompany.MyAPNSClientCode:? Initializing APNS client (development=true)
[2017-06-12T06:36:06.432Z] [INFO] [nioEventLoopGroup-2-4] com.mycompany.MyAPNSClientCode:? ApnsClient shutdown properly (development=true)
[2017-06-12T06:36:06.437Z] [INFO] [EventProcessing-8-push] com.turo.pushy.apns.SslUtil:? Native SSL provider is available and supports ALPN; will use native provider.
[2017-06-12T06:36:06.437Z] [ERROR] [EventProcessing-8-push] io.netty.util.ResourceLeakDetector:? LEAK: ReferenceCountedOpenSslContext.release() was not called before it's garbage-collected. Enable advanced leak reporting to find out where the leak occurred. To enable advanced leak reporting, specify the JVM option '-Dio.netty.leakDetection.level=advanced' or call ResourceLeakDetector.setLevel() See http://netty.io/wiki/reference-counted-objects.html for more information.
(context: the certificate configuration is updated, so we close the old client and create a new one).
@flozano I'm struggling to reproduce this in production, and have not been able to find a minimal reproduction case. I wrote a simple test app:
public static void main(final String[] args) throws Exception {
final ApnsClient client = new ApnsClientBuilder()
.setSigningKey(ApnsSigningKey.loadFromPkcs8File(new File("REDACTED"), "REDACTED", "REDACTED"))
.build();
client.connect(ApnsClient.DEVELOPMENT_APNS_HOST).await();
Thread.sleep(5000);
client.disconnect().await();
}
…but that doesn't turn anything up even with paranoid leak detection. How long are clients in service before you rotate them out? Does this happen every time you "retire" a client?
I emphasize that I absolutely believe this is a Real Thing™, but am trying to figure out what you're doing that I'm not. Thanks for your patience!
EDIT: To be clear, the most likely thing you're doing that I'm not doing in this test app is sending a bazillion notifications.
Oh boy! Just kidding! I cranked the log output on unit tests and applied paranoid leak detection there and now have a leak detection volcano. I think it's safe to say I can reproduce the problem.
I think the problem here is very simple: I just misunderstood how/when reference-counted SSL contexts need to be released. This will require more thought during daylight hours, but I'm confident we can fix the problem quickly.
@jorcasso still have this issue in v0.11.3
LEAK: ReferenceCountedOpenSslContext.release() was not called before it's garbage-collected.
at io.netty.util.internal.logging.Slf4JLogger.error(Slf4JLogger.java:176)
at io.netty.util.ResourceLeakDetector.reportUntracedLeak(ResourceLeakDetector.java:318)
at io.netty.util.ResourceLeakDetector.reportLeak(ResourceLeakDetector.java:294)
at io.netty.util.ResourceLeakDetector.track0(ResourceLeakDetector.java:253)
at io.netty.util.ResourceLeakDetector.track(ResourceLeakDetector.java:242)
at io.netty.handler.ssl.ReferenceCountedOpenSslContext.<init>(ReferenceCountedOpenSslContext.java:227)
at io.netty.handler.ssl.ReferenceCountedOpenSslContext.<init>(ReferenceCountedOpenSslContext.java:207)
at io.netty.handler.ssl.ReferenceCountedOpenSslClientContext.<init>(ReferenceCountedOpenSslClientContext.java:59)
We have found this in one of our components which uses latest pushy:
We'll try to enable the advanced leak detection, but anyhow... anyone else has seen this in latest pushy?