redis / lettuce

Advanced Java Redis client for thread-safe sync, async, and reactive usage. Supports Cluster, Sentinel, Pipelining, and codecs.
https://lettuce.io
MIT License

Getting java.lang.OutOfMemoryError: Direct buffer memory #705

Open vleushin opened 6 years ago

vleushin commented 6 years ago

Lettuce version: 5.0.2.RELEASE
Reproducible in: Linux (Kubernetes), Windows (my local machine), likely everywhere

I've started testing Redis Cluster in Kubernetes. So far, not bad -- all failover scenarios worked fine, but there was one big problem: memory leaks. It was not evident to me at first (because it was a direct memory leak and I was looking at heap charts), but I think I have narrowed it down to two cases.

The first one is topology refresh. I have a single-node Redis cluster (redis-cluster) in Docker Compose for local testing, with these options:

ClusterTopologyRefreshOptions.builder()
    .enablePeriodicRefresh(Duration.ofSeconds(2)) // anything will do, but a small value will lead to the exception faster
    .enableAllAdaptiveRefreshTriggers()
    .build()
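
For context, this is roughly how the refresh options are wired into the client (host, port, and variable names are simplified, not my exact code):

import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import io.lettuce.core.cluster.RedisClusterClient;
import java.time.Duration;

ClusterTopologyRefreshOptions refreshOptions = ClusterTopologyRefreshOptions.builder()
    .enablePeriodicRefresh(Duration.ofSeconds(2))
    .enableAllAdaptiveRefreshTriggers()
    .build();

// Topology refresh options are applied through ClusterClientOptions.
RedisClusterClient clusterClient = RedisClusterClient.create("redis://redis-cluster:7000");
clusterClient.setOptions(ClusterClientOptions.builder()
    .topologyRefreshOptions(refreshOptions)
    .build());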

Combined with a small direct memory size, e.g. -XX:MaxDirectMemorySize=100M or 200M, I can get an OOM exception within 1-2 minutes. The exception looks like this:

2018-02-17 13:33:17.243 [WARN] [lettuce-eventExecutorLoop-63-8] [i.l.c.c.t.ClusterTopologyRefresh] - Cannot retrieve partition view from RedisURI [host='redis-cluster', port=7000], error: java.util.concurrent.ExecutionException: java.lang.NullPointerException
2018-02-17 13:33:17.243 [WARN] [lettuce-nioEventLoop-65-3] [i.l.c.p.CommandHandler] - null Unexpected exception during request: java.lang.NullPointerException
java.lang.NullPointerException: null
    at io.lettuce.core.protocol.CommandHandler.channelRead(CommandHandler.java:500) ~[lettuce-core-5.0.2.RELEASE.jar:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1414) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:945) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:141) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886) [netty-common-4.1.21.Final.jar:4.1.21.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.21.Final.jar:4.1.21.Final]
    at java.lang.Thread.run(Thread.java:844) [?:?]

Looks like Netty is out of direct memory.

I'm less sure about the second case; I did not do extensive testing, but I think the two are connected. I have a 7-node Redis cluster in our Kubernetes environment. We killed one master to see if it would fail over. It did, the topology refreshed, and everything seemed OK. But in the background Lettuce kept pinging/trying to connect to the dead node (only visible with Lettuce debug logging enabled), and direct memory quickly dried up and the node died.

Any thoughts?

vleushin commented 6 years ago

To clarify:

mp911de commented 6 years ago

Can you please provide the stack trace showing the OOM? Getting a heap dump with heap buffers would be ideal. You can set -Dio.netty.noPreferDirect=true to prefer heap buffers and adjust the memory size with -Xmx… to limit the heap size through JVM settings.

vleushin commented 6 years ago

I've created a minimal project where it is easy to reproduce: https://github.com/vleushin/lettuce-oom ; it takes less than a minute. It sort of illustrates what I have in production -- tons of microservices (gRPC), many of them with a Redis backend, so I end up with a bunch of Redis clients.

Exception:

febr. 19, 2018 9:16:38 AM io.lettuce.core.protocol.CommandHandler exceptionCaught
WARNING: null Unexpected exception during request: java.lang.NullPointerException
java.lang.NullPointerException

febr. 19, 2018 9:16:38 AM io.lettuce.core.protocol.CommandHandler exceptionCaught
WARNING: null Unexpected exception during request: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 100663296, max: 104857600)
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 100663296, max: 104857600)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:640)
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:594)
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:764)
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:740)
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:244)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:214)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:146)
    at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:324)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
    at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
    at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:125)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:745)

BTW, in some setups I get OutOfDirectMemoryError, and in others a plain OutOfMemoryError.

Heapdump is here: https://github.com/vleushin/lettuce-oom/releases/download/0.1.0/heapdump-1519024984196.zip

mp911de commented 6 years ago

Awesome, thanks a lot. I'll take a deeper look as soon as I can.

vleushin commented 6 years ago

Also, sometimes I get this exception:

febr. 19, 2018 10:31:30 AM io.lettuce.core.cluster.ClusterTopologyRefreshScheduler$ClusterTopologyRefreshTask run
WARNING: Cannot refresh Redis Cluster topology
io.lettuce.core.RedisException: Cannot retrieve initial cluster partitions from initial URIs [RedisURI [host='127.0.0.1', port=7000]]
    at io.lettuce.core.cluster.RedisClusterClient.loadPartitions(RedisClusterClient.java:808)
    at io.lettuce.core.cluster.RedisClusterClient.reloadPartitions(RedisClusterClient.java:732)
    at io.lettuce.core.cluster.ClusterTopologyRefreshScheduler$ClusterTopologyRefreshTask.run(ClusterTopologyRefreshScheduler.java:197)
    at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
    at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:73)
    at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:745)
Caused by: io.lettuce.core.RedisConnectionException: Unable to establish a connection to Redis Cluster
    at io.lettuce.core.cluster.topology.AsyncConnections.get(AsyncConnections.java:84)
    at io.lettuce.core.cluster.topology.ClusterTopologyRefresh.loadViews(ClusterTopologyRefresh.java:68)
    at io.lettuce.core.cluster.RedisClusterClient.loadPartitions(RedisClusterClient.java:787)
    ... 8 more
    Suppressed: io.lettuce.core.RedisConnectionException: Unable to connect to 127.0.0.1:7000
        at io.lettuce.core.cluster.topology.ClusterTopologyRefresh.lambda$getConnections$2(ClusterTopologyRefresh.java:226)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at io.lettuce.core.AbstractRedisClient.lambda$initializeChannelAsync$1(AbstractRedisClient.java:275)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:512)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:486)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:425)
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:987)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.ensureOpen(AbstractChannel.java:970)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:243)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1346)
        at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:545)
        at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:530)
        at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:512)
        at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:1004)
        at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:259)
        at io.netty.bootstrap.Bootstrap$3.run(Bootstrap.java:252)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
        ... 3 more
    Caused by: java.util.concurrent.CompletionException: java.nio.channels.ClosedChannelException
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
        ... 24 more
    Caused by: java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.ensureOpen(...)(Unknown Source)

This is the one I saw initially in the logs, and it confused me a lot -- the RedisURIs were valid and working. It turned out to be a consequence of the OOM.

vleushin commented 6 years ago

@mp911de Hey! Any news? Have you managed to reproduce the problem?

mp911de commented 6 years ago

I'm busy with other things right now. This one will get my attention as one of the next issues.

vleushin commented 6 years ago

Sure, no rush. I turned off periodic refresh for now. There seem to be no memory leaks, at least on a couple-of-days scale.

mp911de commented 6 years ago

I took a look at the heap dump and I have a slight feeling that the memory consumption is a consequence of the runtime environment.

Netty creates tons of threads/event loops – too many if you consider the memory limit of 100MB.

[Screenshot: heap dump showing the Netty event loop threads]

Your heap dump shows the event loops. Looking at your code, you create lots of RedisClusterClient instances which create lots of event loops. Please reuse ClientResources as much as possible, ideally a single instance per JVM.

On the one hand, this will reduce the number of lingering threads. On the other hand, every Netty thread has a thread-local memory buffer pool cache:

io.netty.buffer.PoolThreadCache$SubPageMemoryRegionCache

The more threads you spin up, the more of these caches are allocated, the more buffers they hold, and the more (direct) memory you require.

I'm not fully aware since which JVM version cgroup limits (especially the processor count) are respected. By default, Lettuce creates 2 * Runtime.availableProcessors() threads. You can limit the number either by setting -Dio.netty.eventLoopThreads=… or via the API: DefaultClientResources.builder().ioThreadPoolSize(4).computationThreadPoolSize(4).
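
A minimal sketch of that advice (the pool sizes and URIs are just example values):

import io.lettuce.core.RedisClient;
import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.resource.ClientResources;
import io.lettuce.core.resource.DefaultClientResources;

// One ClientResources instance per JVM, shared by all clients.
ClientResources sharedResources = DefaultClientResources.builder()
    .ioThreadPoolSize(4)
    .computationThreadPoolSize(4)
    .build();

RedisClusterClient clusterClient = RedisClusterClient.create(sharedResources, "redis://redis-cluster:7000");
RedisClient standaloneClient = RedisClient.create(sharedResources, "redis://other-redis:6379");

// On application shutdown: close the clients first, then the shared resources.
clusterClient.shutdown();
standaloneClient.shutdown();
sharedResources.shutdown();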

vleushin commented 6 years ago

@mp911de Point taken. I will be using as few ClientResources instances as possible. For the cluster this will work, but for non-cluster usage I will still need at least one client for each DB in Redis, right?

But I still suspect there is a problem. My production nodes (-Xmx1G) eventually get OutOfDirectMemoryError anyway, so I turned off periodic refresh. Should I increase the memory to check whether I still get OutOfDirectMemoryError?

Also, this: https://github.com/vleushin/lettuce-oom/commit/49cabc59c4e99b75f3c557d71d06ab9d5b5ad41e#r27648655 . I'm getting this error with a single RedisClusterClient and a single-node Redis cluster.

mp911de commented 6 years ago

ClientResources basically encapsulates thread pools. You can share a single instance amongst RedisClient and RedisClusterClient instances within a single JVM. In general, a single RedisClient is enough for multiple servers/databases/connections (a singleton in your JVM is fine, too). For clusters, you need one RedisClusterClient instance per cluster.

If you still run into issues, please use heap buffers and provide a heap dump. If there's a problem, we will find and fix it!
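
As a rough sketch of that layout (host names and the database index are made up; error handling and cleanup omitted):

import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.resource.ClientResources;
import io.lettuce.core.resource.DefaultClientResources;

// A single RedisClient, sharing the JVM-wide ClientResources, can open
// connections to different servers and databases.
ClientResources sharedResources = DefaultClientResources.create();
RedisClient client = RedisClient.create(sharedResources, "redis://redis-a:6379");

StatefulRedisConnection<String, String> db0 = client.connect();
StatefulRedisConnection<String, String> db1 =
        client.connect(RedisURI.Builder.redis("redis-b", 6379).withDatabase(1).build());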

mp911de commented 6 years ago

I think we found the cause. The OOME looks related to a connection leak reported in #721.

vleushin commented 6 years ago

@mp911de Great! You can close this ticket; I will follow up on #721.

mp911de commented 6 years ago

Let's keep that one open until we can be sure #721 will fix the issue.

mp911de commented 6 years ago

A snapshot build of Lettuce 5.0.3 is available now. Care to give it a spin?

vleushin commented 6 years ago

@mp911de I tried my test in https://github.com/vleushin/lettuce-oom . It worked without problems.

I also tried with 5.0.2. Indeed, the connection count to Redis was growing. With 5.0.3 it was stable.

mp911de commented 6 years ago

Thanks a lot for your response. I'm closing this one as a duplicate of #721.

RaviH commented 5 years ago

@mp911de I am getting the same error message, but only when hitting my Redis cluster hard via my Micronaut microservice (Netty-based). I am not sure if it's a Lettuce error, but it seems like it is, based on the stack trace below. Any feedback would be appreciated.

I am creating a StatefulRedisClusterConnection using the code below:

    @Bean(preDestroy = "close")
    @Singleton
    public StatefulRedisClusterConnection<String, String> getRedisClusterConnection() {
        final String clusterUris = getRedisClusterUris();
        logger.info("Redis cluster URIs: {}", clusterUris);
        final List<RedisURI> redisURIS = Arrays.stream(clusterUris.split(","))
                .map(RedisURI::create)
                .collect(Collectors.toList());
        final RedisClusterClient redisClusterClient = RedisClusterClient.create(redisURIS);
        return redisClusterClient.connect();
    }

Stacktrace:

13:06:46.265 [nioEventLoopGroup-1-15] ERROR i.m.h.s.netty.RoutingInBoundHandler - Unexpected error occurred: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 457179136 byte(s) of direct memory (used: 620756999, max: 1073741824)
java.util.concurrent.CompletionException: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 457179136 byte(s) of direct memory (used: 620756999, max: 1073741824)
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at io.lettuce.core.protocol.AsyncCommand.doCompleteExceptionally(AsyncCommand.java:137)
        at io.lettuce.core.protocol.AsyncCommand.completeExceptionally(AsyncCommand.java:130)
        at io.lettuce.core.cluster.ClusterCommand.completeExceptionally(ClusterCommand.java:97)
        at io.lettuce.core.protocol.CommandHandler.exceptionCaught(CommandHandler.java:216)
        at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
        at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:850)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:364)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 457179136 byte(s) of direct memory (used: 620756999, max: 1073741824)
        at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:640)
        at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:594)
        at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:764)
        at io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:754)
        at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:260)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:231)
        at io.netty.buffer.PoolArena.reallocate(PoolArena.java:397)
        at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:118)
        at io.netty.buffer.AbstractByteBuf.ensureWritable0(AbstractByteBuf.java:285)
        at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:265)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1079)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1072)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1062)
        at io.lettuce.core.protocol.CommandHandler.channelRead(CommandHandler.java:506)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        ... 22 common frames omitted

mp911de commented 5 years ago

An OOME can happen due to high memory requirements. Lettuce uses the shared Netty memory allocator, which means that it potentially shares direct memory with other infrastructure components.

Please check your memory requirements, ideally switch to heap buffers, and collect a heap dump. If you can't solve the issue yourself, please file an issue along with some debug context.

RaviH commented 5 years ago

Please check for memory requirements, ideally switch to heap buffers and collect a heap dump.

Can you please advise how to do the above? Is there a best-practices page, a list of common pitfalls, or any other documentation that would help me understand how to tweak the configuration to get the best out of Lettuce?

Thanks in advance @mp911de

asgs commented 5 years ago

@RaviH to switch to heap-based buffers, the JVM option below needs to be provided. Make sure your -Xmx accounts for this additional heap space.

-Dio.netty.noPreferDirect=true
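
For example, a diagnostic run could combine the following JVM options (the heap size and dump path are placeholders, not recommendations):

-Xmx2g
-Dio.netty.noPreferDirect=true
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/tmp/lettuce-heapdump.hprof

With heap buffers preferred, an exhaustion shows up as a regular heap OutOfMemoryError, so the heap dump captured on OOM contains the buffers and their owners.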

gezhiwei8899 commented 5 years ago

Caused by: io.lettuce.core.RedisException: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 1367343104 byte(s) of direct memory (used: 1396703232, max: 2762997760)
    at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:129) ~[lettuce-core-5.1.3.RELEASE.jar!/:na]
    at io.lettuce.core.FutureSyncInvocationHandler.handleInvocation(FutureSyncInvocationHandler.java:69) ~[lettuce-core-5.1.3.RELEASE.jar!/:na]
    at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:80) ~[lettuce-core-5.1.3.RELEASE.jar!/:na]
    at com.sun.proxy.$Proxy175.get(Unknown Source) ~[na:na]
    at org.springframework.data.redis.connection.lettuce.LettuceStringCommands.get(LettuceStringCommands.java:66) ~[spring-data-redis-2.1.3.RELEASE.jar!/:2.1.3.RELEASE]
    ... 90 common frames omitted
Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 1367343104 byte(s) of direct memory (used: 1396703232, max: 2762997760)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:656) ~[netty-common-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:610) ~[netty-common-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:764) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:754) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:260) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:231) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.PoolArena.reallocate(PoolArena.java:397) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:118) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.AbstractByteBuf.ensureWritable0(AbstractByteBuf.java:299) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:278) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1103) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1096) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1087) ~[netty-buffer-4.1.31.Final.jar!/:4.1.31.Final]
    at io.lettuce.core.protocol.CommandHandler.channelRead(CommandHandler.java:554) ~[lettuce-core-5.1.3.RELEASE.jar!/:na]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:648) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:583) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:500) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462) ~[netty-transport-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) ~[netty-common-4.1.31.Final.jar!/:4.1.31.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.31.Final.jar!/:4.1.31.Final]

gezhiwei8899 commented 5 years ago

I have a similar problem.

gezhiwei8899 commented 5 years ago

[Screenshot]

mp911de commented 5 years ago

@gezhiwei8899 Can you try updating to 5.2.0 snapshots to check whether the changes in 5.2 address your issue?

gezhiwei8899 commented 5 years ago

It is not easy for me to update the code and run the test; we have millions of users.

But I see the differences.

[Screenshot: dependency differences]

I see the changes in 5.1.8.RELEASE.

[Screenshot: dependency changes]

So Lettuce updated the Netty version?

gezhiwei8899 commented 5 years ago

[Screenshots]

@mp911de I tested the latest version of Lettuce, a 5.2.0 build. I still get the direct memory OOM.

VINO42 commented 3 years ago

I got the same problem:

failed to allocate 16777216 byte(s) of direct memory (used: 2147483648, max: 2147483648)
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 2147483648, max: 2147483648)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:754) ~[netty-common-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:709) ~[netty-common-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:755) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:731) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:247) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:215) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:147) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:356) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178) ~[netty-buffer-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.channel.unix.PreferredDirectByteBufAllocator.ioBuffer(PreferredDirectByteBufAllocator.java:53) ~[netty-all-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114) ~[netty-transport-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.channel.epoll.EpollRecvByteAllocatorHandle.allocate(EpollRecvByteAllocatorHandle.java:75) ~[netty-all-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:777) [netty-all-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) [netty-all-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-all-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.51.Final.jar!/:4.1.51.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.51.Final.jar!/:4.1.51.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]

Trying to use -Dio.netty.noPreferDirect=true.

Using Lettuce 5.3.2.RELEASE.

NicholasII commented 3 years ago

I had the same problem using 5.2.0. Has the problem been solved?

VINO42 commented 3 years ago

I had the same problem using 5.2.0. Has the problem been solved?

I changed to the Redisson client in my Spring Boot projects.

NicholasII commented 3 years ago

2021-02-23 06:53:38,798 ERROR --- [pool-12-thread-1] org.springframework.scheduling.support.TaskUtils$LoggingErrorHandler : Unexpected error occurred in scheduled task.
org.springframework.data.redis.RedisSystemException: Redis exception; nested exception is io.lettuce.core.RedisException: io.netty.handler.codec.EncoderException: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 1056964612, max: 1073741824)
    at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:74)
    at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41)
    at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44)
    at org.springframework.data.redis.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42)
    at org.springframework.data.redis.connection.lettuce.LettuceConnection.convertLettuceAccessException(LettuceConnection.java:257)
    at org.springframework.data.redis.connection.lettuce.LettuceStringCommands.convertLettuceAccessException(LettuceStringCommands.java:718)
    at org.springframework.data.redis.connection.lettuce.LettuceStringCommands.set(LettuceStringCommands.java:143)
    at org.springframework.data.redis.connection.DefaultedRedisConnection.set(DefaultedRedisConnection.java:231)
    at org.springframework.data.redis.core.DefaultValueOperations$3.inRedis(DefaultValueOperations.java:202)
    at org.springframework.data.redis.core.AbstractOperations$ValueDeserializingRedisCallback.doInRedis(AbstractOperations.java:59)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:224)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:184)
    at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:95)
    at org.springframework.data.redis.core.DefaultValueOperations.set(DefaultValueOperations.java:198)
    at com.qdcares.mobile.service.web.cache.FlightCacheImpl.cacheAllFlight(FlightCacheImpl.java:112)
    at com.qdcares.mobile.service.web.cache.FlightCacheImpl.fullRefreshFlightCache(FlightCacheImpl.java:108)
    at com.qdcares.mobile.service.data.task.MessageHandleSchedule.executeHandleFullFlightMessage(MessageHandleSchedule.java:56)
    at com.qdcares.mobile.service.data.task.MessageHandleSchedule.executeFullFlightTask(MessageHandleSchedule.java:39)
    at sun.reflect.GeneratedMethodAccessor764.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: io.lettuce.core.RedisException: io.netty.handler.codec.EncoderException: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 1056964612, max: 1073741824)
    at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:129)
    at io.lettuce.core.cluster.ClusterFutureSyncInvocationHandler.handleInvocation(ClusterFutureSyncInvocationHandler.java:123)
    at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:80)
    at com.sun.proxy.$Proxy192.set(Unknown Source)
    at org.springframework.data.redis.connection.lettuce.LettuceStringCommands.set(LettuceStringCommands.java:141)
    ... 23 common frames omitted
Caused by: io.netty.handler.codec.EncoderException: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 1056964612, max: 1073741824)
    at io.netty.handler.codec.MessageToByteEncoder.write(MessageToByteEncoder.java:125)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.lettuce.core.protocol.CommandHandler.writeSingleCommand(CommandHandler.java:421)
    at io.lettuce.core.protocol.CommandHandler.write(CommandHandler.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1081)
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1128)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1070)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    ... 1 common frames omitted
Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 1056964612, max: 1073741824)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:640)
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:594)
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:764)
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:740)
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:244)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:226)
    at io.netty.buffer.PoolArena.reallocate(PoolArena.java:397)
    at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:118)
    at io.netty.buffer.AbstractByteBuf.ensureWritable0(AbstractByteBuf.java:285)
    at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:265)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1079)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1072)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1062)
    at io.lettuce.core.protocol.CommandArgs$ByteBufferArgument.writeByteBuf(CommandArgs.java:473)
    at io.lettuce.core.protocol.CommandArgs$ValueArgument.encode(CommandArgs.java:703)
    at io.lettuce.core.protocol.CommandArgs.encode(CommandArgs.java:346)
    at io.lettuce.core.protocol.Command.encode(Command.java:119)
    at io.lettuce.core.protocol.AsyncCommand.encode(AsyncCommand.java:185)
    at io.lettuce.core.cluster.ClusterCommand.encode(ClusterCommand.java:92)
    at io.lettuce.core.protocol.CommandEncoder.encode(CommandEncoder.java:90)
    at io.lettuce.core.protocol.CommandEncoder.encode(CommandEncoder.java:75)
    at io.netty.handler.codec.MessageToByteEncoder.write(MessageToByteEncoder.java:107)
    ... 18 common frames omitted

Using Lettuce 5.2.0.RELEASE.

NicholasII commented 3 years ago

I changed to the Redisson client in my Spring Boot projects.

Is Redisson OK?

VINO42 commented 3 years ago

I changed to the Redisson client in my Spring Boot projects.

Is Redisson OK?

Yep.

mp911de commented 3 years ago

Instead of changing the client, configure -Dio.netty.noPreferDirect=true to use heap buffers if you run out of configured direct memory.

NicholasII commented 3 years ago

Instead of changing the client, configure -D=io.netty.noPreferDirect=true to use heap buffers if you run out of configured direct memory.

thanks

jonsmithers commented 3 years ago

Setting -XX:MaxDirectMemorySize helped us with this issue. We discovered that our platform was capping direct memory (related to https://github.com/cloudfoundry/java-buildpack-memory-calculator/issues/17).

vpavic commented 2 years ago

@mp911de Note that two of your comments have the same typo (an unnecessary = after -D) in the -Dio.netty.noPreferDirect=true system property. Consider updating those, as someone might end up with an invalid configuration by copy-pasting.

vpavic commented 2 years ago

Not sure if anything can be done about it from the Lettuce side, but one scenario that grows direct memory usage linearly (and hits the 10 MB cap set by the CF memory calculator quite easily) can be reproduced using only spring-boot-starter-data-redis-reactive and something like this:

@SpringBootApplication
public class SampleApplication implements CommandLineRunner {

    public static void main(String[] args) {
        SpringApplication.run(SampleApplication.class, args);
    }

    @Autowired
    private ReactiveStringRedisTemplate redisTemplate;

    @Override
    public void run(String... args) throws Exception {
        for (int i = 0; i < 200; ++i) {
            Thread.sleep(100L);
            this.redisTemplate.listenToChannel("test").log().subscribe();
        }
    }

}

[Screenshot: direct memory usage growing linearly]

Starting the JVM with -Dio.netty.noPreferDirect=true naturally addresses the issue.

mp911de commented 2 years ago

redisTemplate.listenToChannel("test") creates a new ReactiveRedisMessageListenerContainer that allocates a new connection. Each connection allocates two buffers for response processing (64 KB + 1 KB), which maps roughly to 13 MB of memory for your code example.
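
If the goal is to have many consumers of the same channel, one option (a sketch, assuming a single shared subscription is acceptable for all consumers) is to call listenToChannel once and share the resulting Flux instead of subscribing per iteration:

import org.springframework.data.redis.connection.ReactiveSubscription.Message;
import org.springframework.data.redis.core.ReactiveStringRedisTemplate;
import reactor.core.publisher.Flux;

// One listenToChannel call -> one listener container and one connection;
// share() multicasts that single subscription to all downstream subscribers.
Flux<? extends Message<String, String>> messages =
        redisTemplate.listenToChannel("test").share();

for (int i = 0; i < 200; ++i) {
    messages.subscribe(msg -> { /* handle message */ });
}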

vpavic commented 2 years ago

Yep, that's why I opened spring-projects/spring-data-redis#2229 to consider documentation improvements for #listenTo variants as they somewhat set up users for failure.

yyyhhx commented 1 year ago

@gezhiwei8899 Hello, have you solved this problem?

DourFeng commented 1 year ago

@mp911de Hello, have you solved this problem? Using heap memory, the performance will drop.

mshahmoradi87 commented 2 weeks ago

@mp911de we have the same issue here,

<netty.version>4.1.109.Final</netty.version>
<lettuce.version>6.3.2.RELEASE</lettuce.version>

and the configs below:

    useSsl: true
    maxTotal: 400
    subscriptionsMaxTotal: 400
    timeout: 5s
    connectTimeout: 500ms
    nettyThreads: 64

We use AWS Redis serverless. We have 12 tasks running, and for some reason one task becomes faulty and we start to get continuous "Redis command timed out" errors on this single task.

Caused by: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 2 second(s)
    at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:59)
    at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:246)
    at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:74)

When this goes on long enough for "CachePut", it exhausts direct memory and we get an OOM.

org.springframework.data.redis.RedisSystemException: Redis exception; nested exception is io.lettuce.core.RedisException: java.lang.OutOfMemoryError: Cannot reserve 4194304 bytes of direct buffer memory (allocated: 786254691, limit: 786432000)
....
Caused by: io.lettuce.core.RedisException: java.lang.OutOfMemoryError: Cannot reserve 4194304 bytes of direct buffer memory (allocated: 786254691, limit: 786432000)
    at io.lettuce.core.internal.Exceptions.bubble(Exceptions.java:83)
    at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:250)
    at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:74)

Any idea how this can be fixed?

Also, this is a side issue and client-side. I will open another issue for the Lettuce <---> Redis serverless problem; I guess this is an edge case, as it does not happen continuously.

tishun commented 2 weeks ago

@mp911de we have the same issue here,

Hey @mshahmoradi87 ,

The issue is that the shared (direct) memory could be exhausted for a number of (outside) reasons. The suggestion @mp911de made tries to address that by constraining the memory of the JVM that is running the Lettuce driver, and thus showing whether the leak is in the driver or in some other component.

If the problem remains with heap memory, then we can take a heap dump, see which objects (and their parents) are the offenders, and try to address the issue.

Without it we can't trace the issue. Do you think you can do that?

tishun commented 2 weeks ago

@mp911de Hello, have you solved this problem? Using heap memory, the performance will drop.

Hey @DourFeng,

assuming we have an issue to begin with. Could you help us trace this with a heap dump?

It might not be clear, but the suggestion to use the heap is not meant to "fix" the issue, but to trace it.

tishun commented 2 weeks ago

@gezhiwei8899 Hello, have you solved this problem?

Same here. We need to trace the problem to see whether there is a problem in the driver to begin with. Direct memory is shared, and any other component allocating direct memory could be causing the OOM.

mshahmoradi87 commented 2 weeks ago

@mp911de we have the same issue here,

Hey @mshahmoradi87 ,

The issue is that the shared (direct) memory could be exhausted for a number of (outside) reasons. The suggestion @mp911de made tries to address that by constraining the memory of the JVM that is running the Lettuce driver, and thus showing whether the leak is in the driver or in some other component.

If the problem remains with heap memory, then we can take a heap dump, see which objects (and their parents) are the offenders, and try to address the issue.

Without it we can't trace the issue. Do you think you can do that?

Hi, I have tried to simulate this in a test environment, but no luck yet! What I could do was enable direct memory profiling in the production environment; here is a method trace from when I get the timeout with the CachePut error. I hope this helps.

[Screenshot: method trace captured with direct memory profiling enabled]

mshahmoradi87 commented 1 week ago

@tishun does the above help? Would you reopen the issue, or should we create a new one?

tishun commented 6 hours ago

@tishun does the above help? Would you reopen the issue, or should we create a new one?

Let's reopen the issue and I will try to take another look when I have the time.