redisson / redisson

Redisson - Valkey & Redis Java client. Complete Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Redis or Valkey based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Bloom filter, Spring, Tomcat, Scheduler, JCache API, Hibernate, RPC, local cache..
https://redisson.pro
Apache License 2.0

IllegalMonitorStateException on unlock() Method After Upgrading Redisson Version #6248

Open fyakghooa opened 1 week ago

fyakghooa commented 1 week ago

Hi Redisson team,

Our team recently upgraded Redisson in our service from version 3.15.1 to 3.35.0. Since the upgrade, we have observed a significant increase in occurrences of the following error when calling unlock():

Stack trace:

```java
java.lang.IllegalMonitorStateException: attempt to unlock lock, not locked by current thread by node id: 0a712830-70bf-4bdc-848c-5dc79e75fe79 thread-id: 122
    at org.redisson.RedissonBaseLock.lambda$unlockAsync0$2(RedissonBaseLock.java:309) ~[redisson-3.35.0.jar:3.35.0]
    at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) ~[?:1.8.0_422]
    at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811) ~[?:1.8.0_422]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_422]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_422]
    at org.redisson.command.RedisExecutor.handleSuccess(RedisExecutor.java:648) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.command.RedisExecutor.handleResult(RedisExecutor.java:624) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.command.RedisExecutor.checkAttemptPromise(RedisExecutor.java:601) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.command.RedisExecutor.lambda$execute$5(RedisExecutor.java:201) ~[redisson-3.35.0.jar:3.35.0]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_422]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_422]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_422]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_422]
    at org.redisson.client.handler.CommandDecoder.completeResponse(CommandDecoder.java:530) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.client.handler.CommandDecoder.handleResult(CommandDecoder.java:524) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.client.handler.CommandDecoder.decode(CommandDecoder.java:444) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.client.handler.CommandDecoder.decodeCommand(CommandDecoder.java:218) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.client.handler.CommandDecoder.decode(CommandDecoder.java:146) ~[redisson-3.35.0.jar:3.35.0]
    at org.redisson.client.handler.CommandDecoder.decode(CommandDecoder.java:122) ~[redisson-3.35.0.jar:3.35.0]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) ~[netty-codec-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:366) ~[netty-codec-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) ~[netty-codec-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[netty-common-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.51.Final.jar:4.1.51.Final]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_422]
```

Context and Testing Steps:

Unit Test Code

```java
@Test
@SneakyThrows
void testLock() {
    AtomicInteger unLockFailedTimes = new AtomicInteger(0);
    ForkJoinPool forkJoinPool = new ForkJoinPool(20);
    forkJoinPool.submit(() -> IntStream.range(0, 10000)
            .parallel()
            .forEach(i -> {
                RLock lock = redissonClient.getLock(UUID.randomUUID().toString());
                lock.lock(100, TimeUnit.MILLISECONDS);
                // do the task ...
                if (lock.isLocked() && lock.isHeldByCurrentThread()) {
                    try {
                        lock.unlock();
                    } catch (IllegalMonitorStateException ex) {
                        unLockFailedTimes.incrementAndGet();
                    }
                }
            })).get();
    Version.logVersion();
    log.info("unlock failed times : {}", unLockFailedTimes.get());
}
```

Observed Results by Version:

Questions:

Redisson config

Environment

Project to reproduce this issue (minimal reproducible code example)

Here is the project, which includes the Maven dependencies, Redisson config, and other configuration; I hope this helps in reproducing the issue.

https://github.com/fyakghooa/redisson-lock-demo/blob/main/src/test/java/com/example/redisson/lock/demo/RedissonLockTest.java

Thank you very much for your time and assistance with this issue!

mrniko commented 1 week ago

Unable to reproduce.

    @Test
    public void test1() throws ExecutionException, InterruptedException {
        testInCluster(client -> {
            AtomicInteger times = new AtomicInteger(0);

            AtomicInteger unLockFailedTimes = new AtomicInteger(0);
            ForkJoinPool forkJoinPool = new ForkJoinPool(20);
            try {
                forkJoinPool.submit(() -> IntStream.range(0, 10000)
                        .parallel()
                        .forEach(i -> {
                            times.incrementAndGet();
                            RLock lock = redisson.getLock(UUID.randomUUID().toString());
                            lock.lock(100, TimeUnit.MILLISECONDS);
                            // do the task ...
                            if (lock.isLocked() && lock.isHeldByCurrentThread()) {
                                try {
                                    lock.unlock();
                                } catch (IllegalMonitorStateException ex) {
                                    unLockFailedTimes.incrementAndGet();
                                }
                            }
                        })).get();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            } catch (ExecutionException e) {
                throw new RuntimeException(e);
            }
            Version.logVersion();
            System.out.println("total times " + times.get() + " unlock failed times : " + unLockFailedTimes.get());
        });
    }

output: total times 10000 unlock failed times : 0

I added this test to the RedissonLockTest class and ran it against the latest Redisson version.

JSterling8 commented 1 week ago

@mrniko - Thanks for taking a look. What JDK version did you run this on? We ran this test, and several variants of it, on many JDK versions and the results were as follows:

[screenshot: unlock failure counts by JDK version]

mrniko commented 1 week ago

```
java version "21.0.4" 2024-07-16 LTS
Java(TM) SE Runtime Environment (build 21.0.4+8-LTS-274)
Java HotSpot(TM) 64-Bit Server VM (build 21.0.4+8-LTS-274, mixed mode, sharing)
```

It's Oracle JDK. Can you try it?

lyrric commented 6 days ago

The smaller the leaseTime parameter, the more failures there are. Maybe the timer doesn't have enough time to renew the expiration (influenced by the JDK, system, and environment)? If you set the leaseTime parameter to 1 minute, it won't happen.
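To make the mechanism concrete: when lock() is called with an explicit leaseTime, Redisson does not extend the lock, so the key simply expires after that time whether or not unlock() has run yet. Below is a minimal sketch of that race (not from the reproduction project), assuming a local Redis/Valkey instance at redis://127.0.0.1:6379; the address, lock name, and sleep duration are illustrative only.

```java
import java.util.concurrent.TimeUnit;

import org.redisson.Redisson;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class LeaseExpiryDemo {
    public static void main(String[] args) throws InterruptedException {
        Config config = new Config();
        config.useSingleServer().setAddress("redis://127.0.0.1:6379"); // illustrative address
        RedissonClient redisson = Redisson.create(config);
        try {
            RLock lock = redisson.getLock("lease-expiry-demo");
            // Explicit leaseTime: the lock key gets a 100 ms TTL and is not renewed by the watchdog.
            lock.lock(100, TimeUnit.MILLISECONDS);

            // Simulate work that outlives the lease; the key expires in Redis while we still "hold" the lock.
            Thread.sleep(200);

            try {
                // The lock entry for this thread no longer exists, so Redisson throws.
                lock.unlock();
            } catch (IllegalMonitorStateException e) {
                System.out.println("unlock failed, lease already expired: " + e.getMessage());
            }
        } finally {
            redisson.shutdown();
        }
    }
}
```

With a very short lease on a loaded machine, the same expiry can happen in the small window between isHeldByCurrentThread() and unlock(), which matches the failures seen in the test above.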

mrniko commented 5 days ago

I think this is the cause. Under high CPU load some threads can execute very slowly, introducing such latency.

fyakghooa commented 1 day ago

Hi @mrniko @lyrric, thank you for the information you provided.

I ran tests on Redisson version 3.37.0 with OpenJDK 1.8, adjusting the leaseTime to 1 second, which significantly reduced the number of unlock failures.
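As a related note (an illustrative sketch, not part of the reproduction project): if the task duration is unpredictable, another option the discussion points toward is omitting leaseTime entirely, so the lock watchdog keeps extending the lock while it is held, and releasing it in a finally block. The class and method names below are hypothetical; redissonClient is assumed to be configured as in the test above.

```java
import java.util.UUID;

import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;

class WatchdogLockSketch {

    static void runTask(RedissonClient redissonClient) {
        RLock lock = redissonClient.getLock(UUID.randomUUID().toString());
        // No leaseTime: the watchdog keeps renewing the lock (default lockWatchdogTimeout is 30 s)
        // until unlock() is called, so it cannot expire mid-task under normal operation.
        lock.lock();
        try {
            // do the task ...
        } finally {
            // Guard the unlock, e.g. against losing the lock during a failover.
            if (lock.isHeldByCurrentThread()) {
                lock.unlock();
            }
        }
    }
}
```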

Additionally, may I ask what system specifications you used during your testing? For my tests, I used an Apple M2 with an 8-core CPU.

Once again, thank you for your help and response. Much appreciated!