discord-jda / JDA

Java wrapper for the popular chat & VOIP service: Discord https://discord.com
Apache License 2.0

RejectedExecutionException in SequentialRestRateLimiter #2696

Closed: freyacodes closed this issue 1 week ago

freyacodes commented 3 weeks ago

General Troubleshooting

Version of JDA

5.0.0-beta.24

Expected Behaviour

I expected SequentialRestRateLimiter to process requests.

Instead, it seems that any RestAction that is queued produces this exception. In production we run Docker containers, each containing a JVM that uses JDA. We have successfully released beta 24 on FredBoat, but we are encountering this exception on just one of our smaller servers/bots, which uses the same container.

Code Example for Reproduction Steps

N/A

Code for JDABuilder or DefaultShardManagerBuilder used

import net.dv8tion.jda.api.requests.GatewayIntent.*
import net.dv8tion.jda.api.sharding.DefaultShardManagerBuilder
import net.dv8tion.jda.api.utils.cache.CacheFlag

val intents = listOf(
    GUILD_VOICE_STATES,
    GUILD_MESSAGES,
    GUILD_MODERATION,
    DIRECT_MESSAGES
)

DefaultShardManagerBuilder.create(sentinelProperties.discordToken, intents)
    .disableCache(CacheFlag.ACTIVITY, CacheFlag.EMOJI, CacheFlag.CLIENT_STATUS)
    .setBulkDeleteSplittingEnabled(false)
    .setEnableShutdownHook(false)
    .setAutoReconnect(true)
    .setShardsTotal(sentinelProperties.shardCount)
    .setShards(sentinelProperties.shardStart, sentinelProperties.shardEnd)
    .setSessionController(sessionController)
    .setVoiceDispatchInterceptor(voiceInterceptor)
    .setRawEventsEnabled(true)
    .addEventListeners(eventListener)
    .build()

Exception or Error

2024-07-01T09:27:34.356Z  WARN 1 --- [ultShardManager] n.d.j.a.utils.SessionControllerAdapter   : Using outdated implementation of global rate-limit handling. It is recommended to use GlobalRateLimit interface instead!
2024-07-01T09:27:34.358Z ERROR 1 --- [eduler-Worker 1] n.d.jda.api.requests.RestRateLimiter     : Failed to execute bucket worker

java.util.concurrent.RejectedExecutionException: Task uninit+GET/gateway/bot:n/a rejected from java.util.concurrent.ThreadPoolExecutor@75a9cc24[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 8]
    at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2081) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:841) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1376) ~[na:na]
    at net.dv8tion.jda.api.requests.SequentialRestRateLimiter.scheduleElastic(SequentialRestRateLimiter.java:239) ~[JDA-5.0.0-beta.24.jar!/:5.0.0-beta.24]
    at net.dv8tion.jda.api.requests.SequentialRestRateLimiter.lambda$runBucket$8(SequentialRestRateLimiter.java:262) ~[JDA-5.0.0-beta.24.jar!/:5.0.0-beta.24]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:1623) ~[na:na]

2024-07-01T09:27:34.367Z ERROR 1 --- [eduler-Worker 2] n.d.jda.api.requests.RestRateLimiter     : Failed to execute bucket worker

java.util.concurrent.RejectedExecutionException: Task 3df15bae86f6647dd4dfcbd5c6949480:channel_id=1067689190243979457 rejected from java.util.concurrent.ThreadPoolExecutor@75a9cc24[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 8]
    at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2081) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:841) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1376) ~[na:na]
    at net.dv8tion.jda.api.requests.SequentialRestRateLimiter.scheduleElastic(SequentialRestRateLimiter.java:239) ~[JDA-5.0.0-beta.24.jar!/:5.0.0-beta.24]
    at net.dv8tion.jda.api.requests.SequentialRestRateLimiter.lambda$runBucket$8(SequentialRestRateLimiter.java:262) ~[JDA-5.0.0-beta.24.jar!/:5.0.0-beta.24]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:1623) ~[na:na]
freyacodes commented 3 weeks ago

Given that the scheduler's toString() indicates that it was terminated, what could make the elastic scheduler terminate this early in the life of the DefaultShardManager? The GET/gateway/bot mention implies that the termination happened before DefaultShardManager.java:573 (JDAImpl buildInstance(shardId)). There is another call to this route, but that one is made with a disposable single-threaded scheduler.

Unfortunately the full logs are lost to time, and I would like to avoid deploying a build that I know to be broken.

MinnDevelopment commented 3 weeks ago

Is this issue happening consistently for you?

freyacodes commented 3 weeks ago

After it first occurred, it happened consistently on every JVM restart. However, I deployed today and the problem did not occur, so the issue seems to have simply gone away without explanation.

You are free to close this issue if you wish. No one else seems to have experienced this issue.

MinnDevelopment commented 3 weeks ago

I don't consider the issue fixed, so I'll keep it open hoping to get more information to track down the underlying problem.

freyacodes commented 1 week ago

I found a way to reproduce. It seems deceptively simple:

  1. Start a DefaultShardManager with default threading (as shown above).
  2. Invoke DefaultShardManager#restart(shardId)

And ker-bloom. Every shard stops being able to process RestActions. The expected behavior is to make the shard reconnect.
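A minimal sketch of the reproduction, assuming a shardManager built with the configuration above (the shard id and the follow-up RestAction are only illustrative):

import net.dv8tion.jda.api.sharding.ShardManager

fun reproduce(shardManager: ShardManager) {
    // Restart a single shard; with the default threading config this
    // shuts down executors that are shared across all JDA instances.
    shardManager.restart(0)

    // Any RestAction queued afterwards, on any shard, now fails with the
    // RejectedExecutionException shown in the report above.
    shardManager.shards.forEach { jda ->
        jda.retrieveApplicationInfo().queue()
    }
}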

When a single shard is restarted, it shuts down its executors. This is quite destructive given that executors are shared between JDAImpls when sharding.
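To illustrate the failure mode in isolation (plain JDK, no JDA involved): once any holder of a shared pool shuts it down, every other holder gets a RejectedExecutionException, matching the Terminated pool in the stack traces above.

import java.util.concurrent.Executors
import java.util.concurrent.RejectedExecutionException

fun main() {
    // One pool shared by several consumers, mirroring how the shard manager
    // shares its rate-limit executors between JDAImpl instances.
    val shared = Executors.newCachedThreadPool()
    val consumers = List(4) { shared }

    // "Restarting" one consumer shuts the shared pool down...
    consumers[0].shutdown()

    // ...so every other consumer is broken as well.
    try {
        consumers[1].execute { println("never runs") }
    } catch (e: RejectedExecutionException) {
        println("Rejected: $e")
    }
}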

Here is one terminated executor being referenced from 4 shards: [screenshot]

MinnDevelopment commented 1 week ago

Found the problem: it looks like the threading config incorrectly applied the automatic shutdown flag. I'll publish a hotfix release later today.
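Until the hotfix lands, one possible stop-gap is to hand the shard manager externally managed rate-limit pools whose provider opts out of automatic shutdown. This is a rough sketch under two assumptions: that ThreadPoolProvider#shouldShutdownAutomatically governs the shutdown flag in question, and that this beta exposes setRateLimitSchedulerProvider / setRateLimitElasticProvider on DefaultShardManagerBuilder (verify both against your JDA version); whether it actually dodges the bug depends on where the flag is mis-applied.

import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledExecutorService
import net.dv8tion.jda.api.sharding.DefaultShardManagerBuilder
import net.dv8tion.jda.api.sharding.ThreadPoolProvider

// Provider that shares one pool across shards and tells JDA not to shut it down.
class SharedPoolProvider<T : ExecutorService>(private val pool: T) : ThreadPoolProvider<T> {
    override fun provide(shardId: Int): T = pool
    override fun shouldShutdownAutomatically(shardId: Int): Boolean = false
}

fun configure(builder: DefaultShardManagerBuilder): DefaultShardManagerBuilder {
    val scheduler: ScheduledExecutorService = Executors.newScheduledThreadPool(2)
    val elastic: ExecutorService = Executors.newCachedThreadPool()
    return builder
        .setRateLimitSchedulerProvider(SharedPoolProvider(scheduler)) // assumed setter name
        .setRateLimitElasticProvider(SharedPoolProvider(elastic))     // assumed setter name
}

Note that the application then owns these pools and has to shut them down itself on exit.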

MinnDevelopment commented 1 week ago

Hopefully this is fixed in 5.0.1.