GoogleCloudPlatform / spring-cloud-gcp

New home for Spring Cloud GCP development starting with version 2.0.
Apache License 2.0

RejectedExecutionException during Graceful Shutdown when publishing with PubSubPublisherTemplate #2721

Closed ernestaskardzys closed 7 months ago

ernestaskardzys commented 7 months ago

Describe the bug

I use the following versions:

I have a Kotlin application that accepts POST requests from clients and publishes them to a GCP Pub/Sub topic. Graceful shutdown works well under low load. However, when the application is shut down gracefully under high load, I get lots of errors:

2024-03-19T11:27:17.672+02:00  WARN 15266 --- [thread-error] [lt-executor-871] c.g.c.s.p.c.p.PubSubPublisherTemplate    : Publishing to event-topic topic failed.

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@1f236410[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@50d99a40[Wrapped task = TrustedListenableFutureTask@5e1e6e2b[status=PENDING, info=[task=[running=[NOT STARTED YET], com.google.api.gax.rpc.AttemptCallable@5a21f130]]]]] rejected from org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler$1@5887be8e[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1655]
        at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2065) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:833) ~[na:na]
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340) ~[na:na]

It appears that the ThreadPoolTaskScheduler from GcpPubSubAutoConfiguration has already been shut down, so in-flight requests cannot be completed, which results in errors.

Could someone please advise on what to do in this case?
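
The failure mode can be reproduced without Pub/Sub or Spring. The following is a minimal sketch using only java.util.concurrent; the class and method names are made up for illustration. It submits a task to a ScheduledThreadPoolExecutor that has already been shut down, which follows the same delayedExecute, reject, AbortPolicy path seen in the stack trace.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;

// Hypothetical demo class; not part of spring-cloud-gcp or the sample project.
class RejectedAfterShutdownDemo {

    /** Returns true if a task submitted after shutdown() is rejected. */
    static boolean isRejectedAfterShutdown() {
        // The default rejection handler is AbortPolicy.
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        executor.execute(() -> { });   // accepted while the executor is running
        executor.shutdown();           // stands in for the scheduler shutting down early
        try {
            executor.execute(() -> { }); // stands in for a late, in-flight publish
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints "rejected after shutdown: true"
        System.out.println("rejected after shutdown: " + isRejectedAfterShutdown());
    }
}
```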

Workaround (?)

As a workaround, the code snippet suggested in https://github.com/spring-attic/spring-cloud-gcp/issues/1503 appears to help.

Sample

I have created a sample project (please see attached ZIP file) with instructions on how to start it and how to reproduce the issue in the README.md file.

thread-error.zip

meltsufin commented 7 months ago

I suppose that any publishers that have been created need to be shut down first. Are you using PublisherFactory directly or through PubSubTemplate?

ernestaskardzys commented 7 months ago

Through PubSubTemplate.

meltsufin commented 7 months ago

CachingPublisherFactory, which should be used implicitly, does shut down publishers automatically. Would you be able to provide a reproducer project to help debug this further?

ernestaskardzys commented 7 months ago

Thanks a lot for the reply.

Please see the ZIP file attached to my first message. It contains a small project with steps to reproduce the issue.

meltsufin commented 7 months ago

Sorry, I missed that! Thank you for providing the sample project. I also found this related issue: https://github.com/GoogleCloudPlatform/spring-cloud-gcp/issues/956.

ernestaskardzys commented 7 months ago

Thanks for the tip. I tried it, but it did not help.

jayakumarc commented 7 months ago

As of the Spring 6.1.x release, ThreadPoolTaskScheduler shuts down immediately on ContextClosedEvent and rejects any further task submissions. We could work around the issue by creating the pubsubPublisherThreadPool bean with one of the late-shutdown options of ThreadPoolTaskScheduler.

  @Bean(name = "pubsubPublisherThreadPool")
  public ThreadPoolTaskScheduler pubsubPublisherThreadPool(
      GcpPubSubProperties properties) {
    ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
    scheduler.setPoolSize(properties.getPublisher().getExecutorThreads());
    scheduler.setThreadNamePrefix("gcp-pubsub-publisher");
    scheduler.setDaemon(true);
    // Keep accepting tasks after ContextClosedEvent so in-flight publishes can finish.
    scheduler.setAcceptTasksAfterContextClose(true);
    return scheduler;
  }
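
To illustrate why a late shutdown helps, here is a rough plain-JDK analogy, not Spring's actual implementation. The class name, method name, and request count are made up for the sketch. It compares shutting the executor down before the remaining work is submitted (early) with shutting it down afterwards (late) and counts the rejected submissions in each case.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;

// Hypothetical demo class; a JDK-only analogy for early vs. late scheduler shutdown.
class LateShutdownDemo {

    /**
     * Simulates `requests` in-flight publishes that each submit a task while the
     * application is closing. Returns how many submissions were rejected.
     */
    static int countRejections(boolean shutDownEarly, int requests) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(2);
        if (shutDownEarly) {
            executor.shutdown(); // executor stops before the producers are done
        }
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                executor.execute(() -> { });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        if (!shutDownEarly) {
            executor.shutdown(); // already-queued tasks still run to completion
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("early shutdown rejections: " + countRejections(true, 100)); // 100
        System.out.println("late shutdown rejections: " + countRejections(false, 100)); // 0
    }
}
```

The trade-off of the late variant is a longer shutdown phase, since the executor keeps working until it is shut down on its own.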

I have proposed a PR (#2738) that adds late-shutdown configuration options, such as:

spring.cloud.gcp.pubsub.publisher.executor-accept-tasks-after-context-close=true

meltsufin commented 7 months ago

Thanks for the suggestion and the PR @jayakumarc! @ernestaskardzys Would you mind trying this suggestion and letting us know if it helped?

ernestaskardzys commented 7 months ago

@meltsufin / @jayakumarc, thanks for the comment and the code example.

I can confirm that the code above resolves the issue, both in my small example application and in my production project.

In my production project, I had about 4000 errors during deploy before the fix; after the fix, none of them appear.

Thanks again for the suggestion and the fix.

meltsufin commented 7 months ago

Thanks for confirming @ernestaskardzys. Re-opening to track a permanent fix.

andrewpolemeni commented 6 months ago

@meltsufin We are also getting the same issue with PubSubTemplate:

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@48413d51[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@4897d424[Wrapped task = com.google.cloud.pubsub.v1.Publisher$1@1109ae03]] rejected from org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler$1@1330d728[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 755149]
    at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(Unknown Source) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.schedule(Unknown Source) ~[na:na]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.execute(Unknown Source) ~[na:na]
    at com.google.cloud.pubsub.v1.Publisher.publish(Publisher.java:316) ~[google-cloud-pubsub-1.125.13.jar:1.125.13]
    at com.google.cloud.spring.pubsub.core.publisher.PubSubPublisherTemplate.publish(PubSubPublisherTemplate.java:94) ~[spring-cloud-gcp-pubsub-4.9.0.jar:4.9.0]
    at com.google.cloud.spring.pubsub.core.publisher.PubSubPublisherTemplate.publish(PubSubPublisherTemplate.java:80) ~[spring-cloud-gcp-pubsub-4.9.0.jar:4.9.0]
    at com.google.cloud.spring.pubsub.core.PubSubTemplate.publish(PubSubTemplate.java:115) ~[spring-cloud-gcp-pubsub-4.9.0.jar:4.9.0]
    at java.base/java.util.Optional.ifPresent(Unknown Source) ~[na:na]

set('springBootVersion', "3.2.4")
set('springCloudVersion', "2023.0.0")
set('springCloudGcpVersion', "4.9.0")

I tried upgrading springCloudGcpVersion to 5.1.0 and to 5.2.0 and adding the property spring.cloud.gcp.pubsub.publisher.executor-accept-tasks-after-context-close=true, but that caused our application to crash with this error:

Error creating bean with name 'entityManagerFactory' defined in class path resource [org/springframework/boot/autoconfigure/orm/jpa/HibernateJpaConfiguration.class]: 'void com.google.cloud.sql.core.CoreSocketFactory.addArtifactId(java.lang.String)'

Any help would be much appreciated.

meltsufin commented 6 months ago

@andrewpolemeni Can you please file this as a separate issue? If you can, also provide a sample that reproduces the issue. Thanks!