tarantool / cartridge-java

Tarantool Cartridge Java driver for Tarantool versions 1.10+ based on Netty framework
https://tarantool.io

Cannot reserve 104857600 bytes of direct buffer memory #448

Open seet61 opened 9 months ago

seet61 commented 9 months ago

If you work on a high-load project and start your app with -XX:ReservedCodeCacheSize=512m -XX:MaxMetaspaceSize=512m -XX:MaxDirectMemorySize=256m, then after a short period (1-2 hours) you get the exception below and high CPU usage on the VM.

Without these keys everything is OK.
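The 268435456-byte limit in the trace below is exactly 256 * 1024 * 1024, i.e. the -XX:MaxDirectMemorySize=256m setting. A quick way to verify the limit the JVM actually applied (a diagnostic sketch, not code from the app) is Netty's PlatformDependent helper:

    import io.netty.util.internal.PlatformDependent;

    public class DirectLimitCheck {
        public static void main(String[] args) {
            // With -XX:MaxDirectMemorySize=256m this prints 268435456.
            System.out.println("max direct memory: " + PlatformDependent.maxDirectMemory());
        }
    }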

14-12-2023 11:47:53.847 [nioEventLoopGroup-2-3] [] WARN  i.n.channel.DefaultChannelPipeline.onUnhandledInboundException - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.OutOfMemoryError: Cannot reserve 104857600 bytes of direct buffer memory (allocated: 266246492, limit: 268435456)
        at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
        at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:119)
        at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:320)
        at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:645)
        at io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:635)
        at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:215)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:143)
        at io.netty.buffer.PoolArena.reallocate(PoolArena.java:288)
        at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:118)
        at io.netty.buffer.AbstractByteBuf.ensureWritable0(AbstractByteBuf.java:307)
        at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:282)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1105)
        at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:99)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:274)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:830)
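The trace shows where the memory goes: Netty's ByteToMessageDecoder cumulates each partially read response into a pooled direct buffer, and ensureWritable() reallocates that buffer as the response grows; buffers larger than the arena's chunk size go through allocateHuge(). Here the arena tries to reserve another 104857600 bytes (100 MiB) while 266246492 of the 268435456-byte limit are already allocated, so the reservation fails. A small watchdog sketch (the class and its names are assumptions, not part of the driver) can log the growth before the error fires:

    import io.netty.buffer.PooledByteBufAllocator;
    import io.netty.util.internal.PlatformDependent;

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class DirectMemoryWatchdog {

        // Logs Netty's direct-memory usage every 30 seconds so the growth
        // toward -XX:MaxDirectMemorySize can be observed before the OOM.
        public static ScheduledExecutorService start() {
            ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
            ses.scheduleAtFixedRate(() -> System.out.printf(
                    "direct memory: used=%d, pooled=%d, limit=%d%n",
                    PlatformDependent.usedDirectMemory(),
                    PooledByteBufAllocator.DEFAULT.metric().usedDirectMemory(),
                    PlatformDependent.maxDirectMemory()),
                0, 30, TimeUnit.SECONDS);
            return ses;
        }
    }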

Dependencies:

<artifactId>spring-boot-starter-parent</artifactId>
<version>2.3.5.RELEASE</version>

and 

<!-- tarantool -->
<dependency>
    <groupId>io.tarantool</groupId>
    <artifactId>cartridge-driver</artifactId>
    <version>0.13.0</version>
</dependency>

CPU of the VM at the problem time: [screenshot attached]

ArtDu commented 9 months ago

This may be fixed after merging PR https://github.com/tarantool/cartridge-java/pull/438, but it needs to be tested.

@seet61 can you give more details about the requests? What types of requests? How many of them? What is the approximate packet size? Do you use ProxyTarantoolClient? If so, how many connections are there between the client and the cluster?

That would help to reproduce this behavior.
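A hypothetical reproduction sketch, assuming a space named "cache" that holds multi-megabyte tuples (both assumptions, not details from this thread): firing many selects without waiting makes every connection decode large responses at the same time, which is the pattern that grows the per-connection cumulation buffers.

    import io.tarantool.driver.api.TarantoolClient;
    import io.tarantool.driver.api.TarantoolResult;
    import io.tarantool.driver.api.conditions.Conditions;
    import io.tarantool.driver.api.tuple.TarantoolTuple;

    import java.util.concurrent.CompletableFuture;

    public class LargeResponseLoad {

        // Issues 1000 concurrent selects over the assumed "cache" space,
        // then blocks until all of them complete.
        public static void run(TarantoolClient<TarantoolTuple, TarantoolResult<TarantoolTuple>> client) {
            CompletableFuture<?>[] inFlight = new CompletableFuture<?>[1000];
            for (int i = 0; i < inFlight.length; i++) {
                inFlight[i] = client.space("cache").select(Conditions.any());
            }
            CompletableFuture.allOf(inFlight).join();
        }
    }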

seet61 commented 9 months ago

@ArtDu thanks, I will wait for it.

My project uses a Tarantool Cartridge cluster for caching huge responses from a master system: SOAP responses from 10 kB to 5 MB. The current cluster consists of 12 routers on 12 VMs, plus 2 VMs hosting 6 replica sets of storages.

I'm using a TarantoolClient configured as in the docs example:

    private List<TarantoolServerAddress> getShuffledTarantoolServerAddresses() {
        List<TarantoolServerAddress> addresses = tarantoolConfiguration.getRouters().stream()
                .map(router -> new TarantoolServerAddress(router.split(":")[0], Integer.parseInt(router.split(":")[1])))
                .collect(Collectors.toList());
        log.debug("addresses: " + addresses);
        Collections.shuffle(addresses);
        return addresses;
    }

    @Bean
    public TarantoolClient<TarantoolTuple, TarantoolResult<TarantoolTuple>> tarantoolClient() {
        return TarantoolClientFactory.createClient()
                // You can connect to multiple routers
                // Do not forget to shuffle your addresses if you are using multiple clients
                .withAddresses(getShuffledTarantoolServerAddresses())
                // For connecting to a Cartridge application,
                // use the value of cluster_cookie parameter in the init.lua file
                .withCredentials(tarantoolConfiguration.getUserName(), tarantoolConfiguration.getUserPassword())
                // Number of connections per Tarantool instance
                .withConnections(tarantoolConfiguration.getConnectCount())
                // Specify using the default CRUD proxy operations mapping configuration
                .withProxyMethodMapping()
                .withConnectionSelectionStrategy(PARALLEL_ROUND_ROBIN)
                .withRetryingByNumberOfAttempts(3)
                /*.withRetryingByNumberOfAttempts(5, throwable -> throwable.getMessage().equals("Some error"),
                        policy -> policy.withDelay(500))*/
                .withConnectTimeout(tarantoolConfiguration.getConnectTimeout())
                .withReadTimeout(tarantoolConfiguration.getReadTimeout())
                .withRequestTimeout(tarantoolConfiguration.getRequestTimeout())
                .build();
    }

Each client initialises 10 connections to each router. If you need additional information, you can message me on Telegram (same nickname).
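For completeness, a minimal write-path sketch against the bean above; the space name "cache" and its (id, payload) schema are assumptions, and the mapper-factory package matches recent driver versions:

    import io.tarantool.driver.api.TarantoolClient;
    import io.tarantool.driver.api.TarantoolResult;
    import io.tarantool.driver.api.tuple.DefaultTarantoolTupleFactory;
    import io.tarantool.driver.api.tuple.TarantoolTuple;
    import io.tarantool.driver.api.tuple.TarantoolTupleFactory;
    import io.tarantool.driver.mappers.factories.DefaultMessagePackMapperFactory;

    public class CacheWriter {

        // Tuple factory built from the driver's default complex-types mapper.
        private static final TarantoolTupleFactory TUPLES = new DefaultTarantoolTupleFactory(
                DefaultMessagePackMapperFactory.getInstance().defaultComplexTypesMapper());

        // Inserts one (id, payload) tuple into the assumed "cache" space and
        // blocks until the proxy CRUD operation completes.
        public static void put(TarantoolClient<TarantoolTuple, TarantoolResult<TarantoolTuple>> client,
                               long id, String payload) {
            client.space("cache").insert(TUPLES.create(id, payload)).join();
        }
    }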