Closed mielientiev closed 8 years ago
I get OOM:
play.api.Application$$anon$1: Execution exception[[OutOfMemoryError: Direct buffer memory]]
	at play.api.Application$class.handleError(Application.scala:296) [play_2.10-2.3.9.jar:2.3.9]
	at play.api.DefaultApplication.handleError(Application.scala:402) [play_2.10-2.3.9.jar:2.3.9]
	at play.api.Application$class.handleError(Application.scala:291) [play_2.10-2.3.9.jar:2.3.9]
	at play.api.DefaultApplication.handleError(Application.scala:402) [play_2.10-2.3.9.jar:2.3.9]
	at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$14$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:205) [play_2.10-2.3.9.jar:2.3.9]
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.8.0_60]
at java.nio.DirectByteBuffer.
Please provide a pure AHC reproducer. No Play.
Sure. I've written a simple main application: https://gist.github.com/mielientiev/319b2c0d9794ff7cb854
I also have the following FeedableBodyGenerator: https://gist.github.com/mielientiev/04e672692b0d230f7724 This is just a copy of org.asynchttpclient.request.body.generator.SimpleFeedableBodyGenerator (ConcurrentLinkedQueue was replaced by ArrayBlockingQueue with an upper bound to prevent OOM)
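The swap described above (unbounded ConcurrentLinkedQueue replaced by a bounded ArrayBlockingQueue) can be illustrated with a minimal JDK-only sketch. The `BoundedFeeder` class and its `feed`/`poll` methods are hypothetical stand-ins, not AHC's actual FeedableBodyGenerator API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the bounded-queue idea: feed(...) blocks the producer once the
// queue holds `capacity` chunks, instead of growing without limit the way an
// unbounded ConcurrentLinkedQueue would.
public class BoundedFeeder {
    private final BlockingQueue<byte[]> queue;

    public BoundedFeeder(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side: blocks when the queue is full (back-pressure on the producer).
    public void feed(byte[] chunk) throws InterruptedException {
        queue.put(chunk);
    }

    // Consumer side (what a writer thread would call): null if nothing is buffered.
    public byte[] poll() {
        return queue.poll();
    }

    public int buffered() {
        return queue.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedFeeder feeder = new BoundedFeeder(2);
        feeder.feed(new byte[8192]);
        feeder.feed(new byte[8192]);
        // A third feed() would now block until poll() drains a slot.
        System.out.println(feeder.buffered()); // prints 2
        feeder.poll();
        System.out.println(feeder.buffered()); // prints 1
    }
}
```

The bound only caps what sits in *this* queue; as discussed later in the thread, it cannot cap what the consumer buffers downstream.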
Here is the profiling info:
Full stacktrace:
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.
Which version of AHC did you use? Did you check latest 2.0.0-alpha23?
I checked the 2.0.0-alpha22 version. I believe that 2.0.0-alpha23 has the same problem.
I tried the 2.0.0-alpha23. And now I get:
java.util.concurrent.TimeoutException: Read timeout to localhost/127.0.0.1:9000 of 60000 ms
	at org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:48)
	at org.asynchttpclient.netty.timeout.ReadTimeoutTimerTask.run(ReadTimeoutTimerTask.java:57)
	at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
	at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:655)
	at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
	at java.lang.Thread.run(Thread.java:745)
Here is the profiler info:
Could you share your server too, please? It looks to me like you no longer have the ByteBuf leak, and that you're just writing too fast for your server.
Here is a simple Play endpoint: https://gist.github.com/mielientiev/38fec15428e8cd0b6713 I use a blocking queue (BlockingSimpleFeedableBodyGenerator) to prevent uploading too fast
I suspect your timeout issue comes from your blocking queue. You absolutely must never block Netty's IO threads!
But in this case I only block the main thread (the producer of the data), by blocking on the "feed" call
As I expected, the leak issue is already fixed in alpha23.
Your timeout is that you simply can't flush 10,000,000 * 8kB = 80GB in 2 minutes.
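The arithmetic behind that claim can be checked directly. A back-of-the-envelope sketch, assuming decimal 8 kB chunks and the 2-minute window mentioned above:

```java
// Back-of-the-envelope check of the flush volume and the sustained
// throughput it would require.
public class ThroughputCheck {
    public static void main(String[] args) {
        long chunks = 10_000_000L;
        long chunkBytes = 8_000L;              // 8 kB per chunk (decimal kB)
        long totalBytes = chunks * chunkBytes; // 80,000,000,000 bytes = 80 GB
        double windowSeconds = 120.0;          // the 2-minute window
        double requiredMBps = totalBytes / windowSeconds / 1_000_000.0;
        System.out.println(totalBytes);        // 80000000000
        System.out.println(requiredMBps);      // ~666.7 MB/s sustained
    }
}
```

Roughly 667 MB/s sustained end-to-end, which the test server clearly isn't keeping up with, so the read timeout fires.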
But after the timeout, the buffers aren't released
Here is the profiler info after sending 100,000 * 8 kB of data with a 200 OK response
And I get the OOM error on alpha23 too. I just set the logging level to WARN and it blows up in my console. (added logback.xml)
12:47:24.208 [AsyncHttpClient-2-1] WARN i.n.channel.DefaultChannelPipeline - An exception was thrown by a user handler's exceptionCaught() method while handling the following exception:
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.8.0_60]
at java.nio.DirectByteBuffer.
Here is why it didn't print anything before: https://github.com/netty/netty/blob/4.0/transport/src/main/java/io/netty/channel/AbstractChannelHandlerContext.java#L258
I wrote a standalone pure Java sample and I really can't reproduce: https://gist.github.com/slandelle/b63f459d51b710fc8d91
Strange, I took your code and I have the same problem... here is the project: https://www.sendspace.com/file/6wxbmz
What's your OS and JDK version?
XUbuntu 15.04 and java 1.8.0_60-b27
Sorry, I really can't reproduce.
I took a screen video. I don't know how it can help you, but still: https://youtu.be/27-5uz2xf9Y I added the log4j dependency + log4j.xml, and set 500,000 iterations
Here is the debug info (maybe it can help you reproduce):
2015-11-14 00:06:58 DEBUG InternalLoggerFactory:71 - Using SLF4J as the default logging framework
2015-11-14 00:06:58 DEBUG ResourceLeakDetector:81 - -Dio.netty.leakDetection.level: simple
2015-11-14 00:06:58 DEBUG ResourceLeakDetector:81 - -Dio.netty.leakDetection.maxRecords: 4
2015-11-14 00:06:58 DEBUG MultithreadEventLoopGroup:76 - -Dio.netty.eventLoopThreads: 8
2015-11-14 00:06:58 DEBUG PlatformDependent0:76 - java.nio.Buffer.address: available
2015-11-14 00:06:58 DEBUG PlatformDependent0:76 - sun.misc.Unsafe.theUnsafe: available
2015-11-14 00:06:58 DEBUG PlatformDependent0:71 - sun.misc.Unsafe.copyMemory: available
2015-11-14 00:06:58 DEBUG PlatformDependent0:76 - java.nio.Bits.unaligned: true
2015-11-14 00:06:58 DEBUG PlatformDependent:76 - Java version: 8
2015-11-14 00:06:58 DEBUG PlatformDependent:76 - -Dio.netty.noUnsafe: false
2015-11-14 00:06:58 DEBUG PlatformDependent:76 - sun.misc.Unsafe: available
2015-11-14 00:06:58 DEBUG PlatformDependent:76 - -Dio.netty.noJavassist: false
2015-11-14 00:06:58 DEBUG PlatformDependent:71 - Javassist: available
2015-11-14 00:06:58 DEBUG PlatformDependent:76 - -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
2015-11-14 00:06:58 DEBUG PlatformDependent:76 - -Dio.netty.bitMode: 64 (sun.arch.data.model)
2015-11-14 00:06:58 DEBUG PlatformDependent:76 - -Dio.netty.noPreferDirect: false
2015-11-14 00:06:58 DEBUG NioEventLoop:76 - -Dio.netty.noKeySetOptimization: false
2015-11-14 00:06:58 DEBUG NioEventLoop:76 - -Dio.netty.selectorAutoRebuildThreshold: 512
2015-11-14 00:06:58 DEBUG ThreadLocalRandom:71 - -Dio.netty.initialSeedUniquifier: 0xc0728fcb1386aeab (took 24 ms)
2015-11-14 00:06:58 DEBUG ByteBufUtil:76 - -Dio.netty.allocator.type: unpooled
2015-11-14 00:06:58 DEBUG ByteBufUtil:76 - -Dio.netty.threadLocalDirectBufferSize: 65536
2015-11-14 00:06:58 DEBUG ByteBufUtil:76 - -Dio.netty.maxThreadLocalCharBufferSize: 16384
2015-11-14 00:06:58 DEBUG NetUtil:86 - Loopback interface: lo (lo, 0:0:0:0:0:0:0:1%lo)
2015-11-14 00:06:58 DEBUG NetUtil:81 - /proc/sys/net/core/somaxconn: 128
2015-11-14 00:06:59 DEBUG JdkSslContext:76 - Default protocols (JDK): [TLSv1.2, TLSv1.1, TLSv1]
2015-11-14 00:06:59 DEBUG JdkSslContext:76 - Default cipher suites (JDK): [TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_3DES_EDE_CBC_SHA]
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.numHeapArenas: 8
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.numDirectArenas: 8
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.pageSize: 8192
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.maxOrder: 11
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.chunkSize: 16777216
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.tinyCacheSize: 512
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.smallCacheSize: 256
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.normalCacheSize: 64
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.maxCachedBufferCapacity: 32768
2015-11-14 00:06:59 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.cacheTrimInterval: 8192
2015-11-14 00:06:59 DEBUG AbstractByteBuf:81 - -Dio.netty.buffer.bytebuf.checkAccessible: true
2015-11-14 00:06:59 DEBUG JavassistTypeParameterMatcherGenerator:76 - Generated: io.netty.util.internal.matchers.io.netty.handler.codec.http.HttpObjectMatcher
2015-11-14 00:06:59 DEBUG NettyConnectListener:66 - Using non-cached Channel [id: 0xc3ae5be3, /127.0.0.1:38385 => localhost/127.0.0.1:9000] for POST '/'
2015-11-14 00:06:59 DEBUG Recycler:76 - -Dio.netty.recycler.maxCapacity.default: 262144
2015-11-14 00:06:59 DEBUG Cleaner0:76 - java.nio.ByteBuffer.cleaner(): available
Still no issue on my side. And my logs are identical. The only difference is that I run on OSX. I might be able to check on Ubuntu on Monday.
Are you sure you didn't change anything in the test, like trying to buffer the chunks? I ask because in your video, you get a LastHttpContent (meaning that the client has finished sending the body) when the count reaches 269,000, which looks weird to me since you say you ran 500,000 iterations.
here is the latest version: https://www.sendspace.com/file/w7odpu
Do you have any updates about this? I have the same problem on Windows too.
No, I still can't reproduce. Does this happen also when running from the command line?
Try adding the maven exec plugin to your pom and running mvn compile exec:java:
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>exec-maven-plugin</artifactId>
      <version>1.4.0</version>
      <configuration>
        <mainClass>test.Main</mainClass>
      </configuration>
    </plugin>
  </plugins>
</build>
Please also provide what java -version gives you (I want to make sure IntelliJ is not picking another JDK).
Wait, I think I finally get it. I don't think that's a memory leak. I think that's a lack of back-pressure.
Basically, I think your BlockingQueue is useless, as Netty will keep popping from the queue regardless of whether previous chunks could actually be flushed to the TCP stack.
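The point about the bounded queue being useless can be demonstrated with a JDK-only analogy: when a fast consumer drains the bounded queue straight into its own unbounded buffer (standing in here for Netty's outbound buffer), the bound never exerts any back-pressure on the producer. This is an illustration of the failure mode, not AHC or Netty code:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Analogy for the missing back-pressure: the producer's bounded queue stays
// nearly empty because the "writer" immediately moves every chunk into an
// unbounded downstream buffer, so memory still grows without limit.
public class NoBackPressureDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<byte[]> bounded = new ArrayBlockingQueue<>(16); // the producer's queue
        Queue<byte[]> outbound = new ArrayDeque<>();                  // stands in for Netty's outbound buffer

        for (int i = 0; i < 1_000; i++) {
            bounded.put(new byte[64]);    // producer never blocks...
            outbound.add(bounded.take()); // ...because the writer drains instantly
        }
        System.out.println(bounded.size());  // 0    -> the bound never kicked in
        System.out.println(outbound.size()); // 1000 -> everything piled up downstream
    }
}
```

Real back-pressure has to come from the writer side (for example, pausing the feed while the channel's outbound buffer is full), not from a bound on the producer's own queue.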
So, what can I do with this problem?
Mmmm, actually, I'm starting to think this is a Netty bug. If I force heap buffers instead of native ones, the issue goes away (meaning I see heap memory being properly GC'ed).
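For context on the heap-vs-native distinction above, here is a minimal JDK-only sketch of the two buffer kinds. Heap buffers are ordinary GC-managed arrays; direct buffers live in native memory, capped by -XX:MaxDirectMemorySize, which is why leaking them produces "OutOfMemoryError: Direct buffer memory" long before the heap fills:

```java
import java.nio.ByteBuffer;

// Heap buffers are ordinary GC-managed objects; direct buffers hold native
// memory that is only reclaimed when the wrapping object is collected (or
// explicitly freed), so a leak of direct buffers exhausts the native cap
// while the heap still looks healthy.
public class BufferKinds {
    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(8192);
        ByteBuffer direct = ByteBuffer.allocateDirect(8192);
        System.out.println(heap.isDirect());   // false
        System.out.println(direct.isDirect()); // true
    }
}
```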
I'm going to switch to Netty's default allocator, instead of using the channel's one, which is the direct one.
Then, I'll open an issue against Netty.
@normanmaurer FYI
Note: I'm comparing the number of chunks written on the client side and the number of chunks read on the server side, and they match, so the issue is not chunks being buffered on the client side.
@mielientiev Could you please try latest master?
Note on the previous commit (2a8e14709152e0a2de47f651a19d1ede82aaeefa): AHC still uses direct memory, but unpooled instead of pooled. The memory leak goes away.
I've tested, and it seems to work fine. Direct memory is being properly GC'ed. I also tested it with the Play application and it works too.
Here is the profiler info for the test project I sent (AsyncTestProject):
Profiler info for Play App:
P.S. Tested on XUbuntu 15.04
Great, thanks for your feedback.
It will take me some time to come up with a pure Netty sample so I can open an issue against Netty. There's either an issue with Netty 4+ pooled buffers recycling, or with my way of using them.
I'll close this issue once I manage to have pooled buffers properly working. It would be great to have almost no GC there :)
@mielientiev I haven't exactly figured out what happens, but it has to do with 2 things:
I also noticed that plugging a monitoring tool such as JMC when the test is running greatly messes up the results.
Closing for now, as the original ByteBuffer leak issue is solved.
When I use a custom FeedableBodyGenerator or SimpleFeedableBodyGenerator I see the following error message:
[error] i.n.u.ResourceLeakDetector - LEAK: ByteBuf.release() was not called before it's garbage-collected. Enable advanced leak reporting to find out where the leak occurred. To enable advanced leak reporting, specify the JVM option '-Dio.netty.leakDetection.level=advanced' or call ResourceLeakDetector.setLevel() See http://netty.io/wiki/reference-counted-objects.html for more information.
I guess the main problem is here: https://github.com/netty/netty/blob/4.0/transport/src/main/java/io/netty/channel/AbstractChannelHandlerContext.java#L990-L992 they just assign null to the message, but should also call release (ReferenceCountUtil.release(msg)). Am I right?
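For background on why a missing release() matters: Netty's ByteBufs are explicitly reference-counted, and the leak detector fires when the last reference is dropped without release() ever reaching zero. The contract can be sketched with a minimal JDK-only analogue; this `RefCountedChunk` class is an illustration, not Netty's actual implementation:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal analogue of Netty's reference-counting contract: every retain()
// must be balanced by a release(), and the release() that brings the count
// to zero frees the resource. Dropping the last reference without calling
// release() is exactly what ResourceLeakDetector reports as a LEAK.
public class RefCountedChunk {
    private final AtomicInteger refCnt = new AtomicInteger(1); // starts owned by the creator
    private boolean freed = false;

    public RefCountedChunk retain() {
        refCnt.incrementAndGet();
        return this;
    }

    // Returns true when this call released the last reference.
    public boolean release() {
        int remaining = refCnt.decrementAndGet();
        if (remaining == 0) {
            freed = true; // where Netty would return the buffer to its pool
            return true;
        }
        if (remaining < 0) {
            throw new IllegalStateException("release() on an already-released buffer");
        }
        return false;
    }

    public boolean isFreed() {
        return freed;
    }

    public static void main(String[] args) {
        RefCountedChunk chunk = new RefCountedChunk();
        chunk.retain();                      // refCnt = 2
        System.out.println(chunk.release()); // false (refCnt = 1, still owned)
        System.out.println(chunk.release()); // true  (refCnt = 0, freed)
        System.out.println(chunk.isFreed()); // true
    }
}
```

Under this contract, a handler that nulls out its reference to a message without calling release() leaves the count stuck above zero, so the pooled memory is never returned.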