Bootstrap discovery (locating config server via discovery) leads to harmless RejectedExecutionException in logs

codependent commented 8 years ago

My microservices' logs show periodically, can't tell when, just at a certain moment the following exception:

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@7a98c42 rejected from java.util.concurrent.ThreadPoolExecutor@96e72c7[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source) ~[na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source) ~[na:1.8.0_74]
    at com.netflix.discovery.TimedSupervisorTask.run(TimedSupervisorTask.java:62) ~[eureka-client-1.4.6.jar:1.4.6]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_74]
    at java.lang.Thread.run(Unknown Source) [na:1.8.0_74]

My infraestructure has two Eureka peer aware servers, with the following configuration:


---
spring:
  profiles: archimedes1

server:
  port: ${PORT:8761}

eureka:
  instance:
    hostname: archimedes1
  client:
    serviceUrl:
      defaultZone: http://archimedes2:8762/eureka/

---
spring:
  profiles: archimedes2

server:
  port: ${PORT:8762}

eureka:
  instance:
    hostname: archimedes2
  client:
    serviceUrl:
      defaultZone: http://archimedes1:8761/eureka/

All my services, at some moment throw that exception. This is and example of one of them:


---
spring:
  application:
    name: user-api
  profiles: user-api1
  cloud:
    config:
      discovery:
        enabled: true
      failFast: true
      retry:
        maxAttempts: 100
        initialInterval: 2000
        maxInterval: 3000
        multiplier: 1.5

server:
  port: 9977

eureka:
  client:
    serviceUrl:
      defaultZone: http://archimedes1:8761/eureka/,http://archimedes2:8762/eureka/
    healthcheck:
      enabled: true

Despite the exception everything keeps running OK, but it's kind of annoying to get it. If you think this should be a Netflix's Eureka issue let me know and I'll close this one.

daniellavoie commented 8 years ago

I built a setup with 2 peer aware eureka servers and one eureka client with the configuration you provided. I am not reproducing your problem. Can you upload to GH a code snippet reproducing this problem ?

daniellavoie commented 8 years ago

I am also concerned by your stacktrace.

ThreadPoolExecutor@96e72c7[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]

It feels to me that you have overwritten eureka thread pool configuration. With this setup, any task will fail since your pool size are defined to 0 according to your stack trace.

codependent commented 8 years ago

@daniellavoie I'll prepare a sample project along the day.

codependent commented 8 years ago

I have finally prepared the sample projects: https://github.com/codependent/scn-brixton-release.git

Actually it has nothing to do with using peer Eureka servers, the miminum scenario needed is one Eureka Server, one config server and one microservice (Eureka client that uses the config server).

When I remove the config server the error doesn't show up.

To reproduce just start the Eureka server (archimedes), when it's up, start the Config server, and when everything is up just launch the user-api server. You'll see the usual:

2016-05-31 09:14:36.680   INFO 12192 --- [           main] c.codependent.sample.UserApiApplication  : Started UserApiApplication in 8.118 seconds (JVM running for 8.732)

Then after 5 minutes the error shows up in the log:

2016-05-31 09:19:31.085 ERROR 12192 --- [ver-bootstrap-0] c.netflix.discovery.TimedSupervisorTask  : task supervisor rejected the task

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@19e8a47a rejected from java.util.concurrent.ThreadPoolExecutor@172f5937[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source) ~[na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source) ~[na:1.8.0_74]
    at com.netflix.discovery.TimedSupervisorTask.run(TimedSupervisorTask.java:62) ~[eureka-client-1.4.6.jar:1.4.6]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_74]
    at java.lang.Thread.run(Unknown Source) [na:1.8.0_74]

2016-05-31 09:19:36.365  INFO 12192 --- [trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver      : Resolving eureka endpoints via configuration

royremi commented 8 years ago

Maybe i'm wrong, but i've seen this error when you have the devtools on the classpath.

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <optional>true</optional>
        </dependency>

when I recompile in intellij a lot of my apps show this error and I have to restart them manually. Idk if it's can be you're case!

codependent commented 8 years ago

@royremi I just checked the dependencies. I don't have it, not even as a transitive dependency.

daniellavoie commented 8 years ago

Great, got to reproduce it. I'll debug to see what is wrong with that executor.

daniellavoie commented 8 years ago

I won't have too much time for the rest of the week but here is what I found for now.

There is two discovery client running. Spring Cloud AutoConfiguration sends a termination signals to the EurekaClient located in the parent spring context. This signal terminates the executor service who receives a new task from TimedSupervisorTask. Since the EurekaClient was previously shutdown, the tasks fails and generate the stack trace. I haven't figured out why the TimedSupervisorTask is generated for that terminated client. I'm not sure if it is normal to have 2 discovery clients running. One comes from the bootstrap mechanism.

asarkar commented 8 years ago

I was on Brixton.M3 and there were exceptions reported as in https://github.com/Netflix/eureka/issues/638. Now I upgraded but have a different exception reported in this issue. Given the simple reproduction of the latest exception, I'm wondering how did any test app manage to not throw this exception during the GA release of Brixton. What does the Spring Cloud team do differently (for integration testing) that we may be missing?

cs94njw commented 8 years ago

I accidentally had this in my Microservice's .yml file:

eureka:
  instance:
     prefer-ip-address: true
     leaseRenewalIntervalInSeconds: 30   # DO NOT DO THIS IN PRODUCTION

After reading this bug report, I wondered if perhaps "eureka->instance" variables shouldn't exist in a Non-Eureka project. So I moved that configuration to my Eureka project, and it seems to be happy.

EDIT: I take that back. The error still occurs for me.

trendsoa commented 8 years ago

same error when I change my service to use spring cloud config discovery in bootstrap.yml

spring:
  application:
    name: store-service
  cloud:
    config:
      discovery:
        enabled: true
        service-id: CONFIG-SERVER

in my service configuration, I also use EnableDiscoveryClient annotation

looks like some of the logs are duplicate

2016-08-01 10:50:25.578  INFO 10392 --- [           main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8086 (http)
2016-08-01 10:50:25.578  INFO 10392 --- [           main] c.n.e.EurekaDiscoveryClientConfiguration : Updating port to 8086
2016-08-01 10:50:25.594  INFO 10392 --- [           main] c.n.e.EurekaDiscoveryClientConfiguration : Updating port to 8086

tjuchniewicz commented 8 years ago

My "solution" for this error:

logging:
  level:
    com.netflix.discovery.TimedSupervisorTask: "OFF"

asarkar commented 8 years ago

You call that a solution? That's like doctor telling a patient not to worry because the MRI machine isn't working.

spencergibb commented 8 years ago

@abhijitsarkar please be aware of your tone. That last comment was not necessary. So far, besides the error showing in the log, no one has demonstrated that it is negatively affecting their applications.

asarkar commented 8 years ago

My tone was humorous with a hint of sarcasm. Besides, this is an open forum where people are discussing issues about a software. If you expect everyone to always speak in a manner that's pleasing to you, consider moving to a closed forum like Spring JIRA with only developer access.

spencergibb commented 8 years ago

Tone and sarcasm don't always come across really well in short text. I wasn't the only one who reacted like I did. We have a Code of Conduct we expect folks to abide by.

spencergibb commented 8 years ago

Pleasing to me vs. respectful to all is very different.

asarkar commented 8 years ago

My analogy to the doctor in response to a suggestion of suppressing error logs was disrespectful? And violates the code of conduct? Sorry, I just don't get it.

The last relevant comment from a committer was 3 months ago saying that they won't have time to look into this issue. And from your comment above, it seems like you're waiting for the users to prove that they are negatively impacted before you do anything about it. Of course, this is a community driven open source but don't you think ludicrous log suppression isn't the answer?

lferna commented 7 years ago

How to reproduce the error:

With this Eureka configuration:

eureka.instance.hostname=localhost
eureka.instance.port=8761

spring.application.name=eureka
spring.cloud.config.enabled=false

server.port=${eureka.instance.port}

eureka.server.enableSelfPreservation=false

eureka.client.registerWithEureka=false
eureka.client.fetchRegistry=false
eureka.client.serviceUrl.defaultZone=http://${eureka.instance.hostname}:${eureka.instance.port}/eureka/

This POM configuration;

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-parent</artifactId>
        **<version>Camden.SR4</version>**
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>

   <dependencies>
    <dependency>
      <groupId>org.springframework.cloud</groupId>
      <artifactId>spring-cloud-starter-eureka-server</artifactId>
    </dependency>
  </dependencies>

This configuration client:

spring.cloud.config.enabled=true
spring.application.name=<name>
#Uncomment service discovery or URI
spring.cloud.config.discovery.enabled=true
spring.cloud.config.uri=<URI>

The error:

2017-06-19 13:40:57.841 ERROR [mscustomer,,,] 15516 --- [ver-bootstrap-0] c.netflix.discovery.TimedSupervisorTask  : task supervisor rejected the task

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@652f1860 rejected from java.util.concurrent.ThreadPoolExecutor@6be8eab5[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047) ~[na:1.8.0_101]

Versions:

Spring Boot version: 1.4.0
Spring Cloud Netflix: Camdem.SR4
Spring Cloud Config: 1.2.3.RELEASE (**the error only reproduces with Spring Cloud Config**. If I use local configuration, the error doesn't show)

Upgrade the Netflix version, shows the log in warn level, is that the solution for this problem?

Thanks.

dsyer commented 7 years ago

Thanks. Please create a sample project and link to it. And please format your code snippets. How did you upgrade the netflix version? What was the outcome?

dsyer commented 7 years ago

FWIW I updated the app in the @codependent sample above to use Dalston.SR1 and I don't see any exceptions logged at all (after running for about 10min). 5min seems to be the pattern with the original sample (Brixton.RELEASE). I propose we close this as fixed.

lferna commented 7 years ago

Hi, thanks for your reply. I've tested the @codependent and with the original dependencies (Brixton) the error shows after 5 minutes. With Camdem.SR4 the error shows again. And with Dalston.SR1 the error shows again.

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@6be81451 rejected from java.util.concurrent.ThreadPoolExecutor@652fe75c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]

I've updated this dependencies:

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
          <version>1.4.0.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.cloud</groupId>
                <artifactId>spring-cloud-dependencies</artifactId>
                <version>Dalston.SR1</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

spencergibb commented 7 years ago

Dalston is not compatible with boot 1.4.x, btw.

lferna commented 7 years ago

You're right, but with the 1.5.x the error continues :(

dsyer commented 7 years ago

Not for me. What log level is it coming out at? What did you do to see that log? You didn't even paste the whole stack trace BTW.

codependent commented 7 years ago

I just updated the sample proyect to work with Spring Boot 1.5.4.RELEASE and Spring Cloud Dalston.SR1. With this setup the exception no longer shows up.

Thank you for having a look at it.

lferna commented 7 years ago

Hi @codependent , with Spring Boot 1.5.4.RELEASE the exception dissapear, thanks.

mradulpandey commented 7 years ago

I observed if i set eureka.client.enabled=true in my config server and hit http://localhost:8888/monitor?path=application i get the following response

{
    "timestamp": 1503601813361,
    "status": 500,
    "error": "Internal Server Error",
    "exception": "java.util.concurrent.RejectedExecutionException",
    "message": "Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7f865c88 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@20ac99dd[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]",
    "path": "/monitor"
}

Versions:

   spring.boot.version --> 1.5.6.RELEASE
   spring.cloud.version --> Dalston.SR1, Dalston.SR2 and Dalston.SR3

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-config-server</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-config-monitor</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-stream-rabbit</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-eureka</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-actuator-docs</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.security</groupId>
    <artifactId>spring-security-rsa</artifactId>
</dependency>

<dependency>
    <groupId>org.jolokia</groupId>
    <artifactId>jolokia-core</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
</dependency>

dsyer commented 7 years ago

@pandeymradul this issue is closed and yours is unrelated. Please open another ticket.

mradulpandey commented 7 years ago

@dsyer i created issue https://github.com/spring-cloud/spring-cloud-netflix/issues/2228

spring-cloud / spring-cloud-netflix

Bootstrap discovery (locating config server via discovery) leads to harmless RejectedExecutionException in logs #1055