awspring / spring-cloud-aws

The New Home for Spring Cloud AWS
http://awspring.io
Apache License 2.0

Unable to create native thread: possibly out of memory or process/resource limits reached #470

Closed shresthaujjwal closed 2 years ago

shresthaujjwal commented 2 years ago

Type: Bug

Component: SQS

I am using Spring Cloud AWS version 2.4.0 + Spring Boot version 2.7.0.

Java version: 11.0.15, vendor: Amazon.com Inc., runtime: /usr/lib/jvm/java-11-amazon-corretto
Default locale: en_US, platform encoding: ANSI_X3.4-1968
OS name: "linux", version: "5.10.126-117.518.amzn2.x86_64", arch: "amd64", family: "unix"

Describe the bug

I started a Q&A thread about how to control the threads, but I believe this is a bug. As stated in the discussion thread Limit SQS Concurrency and/or control thread exhaustion, I have a simple microservice that consumes messages. Here is my setup.

I thought I had too many containers, so I reduced to just 1 container with 4 GB (same instance size), but even after doing that I ran into the thread issue.

Caused by: org.springframework.messaging.MessageHandlingException: Unexpected handler method invocation error; nested exception is java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    at org.springframework.messaging.handler.invocation.AbstractMethodMessageHandler.handleMatch(AbstractMethodMessageHandler.java:588)
    ... 8 common frames omitted
Caused by: java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached

Followed by this error

[1952.853s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
[1952.937s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.

I ran ps -elfT | wc -l and noticed the thread count keeps growing and doesn't come down even after messages are processed.
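To see the same growth from inside the JVM (and not just via ps), I am also logging live thread counts with the standard ThreadMXBean. A small diagnostic sketch; the LoggingThreadCounter class and the 30-second interval are just illustrative:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: periodically logs JVM thread counts to correlate with `ps -elfT | wc -l`.
public class LoggingThreadCounter {

  private final ThreadMXBean threads = ManagementFactory.getThreadMXBean();
  private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

  public void start() {
    scheduler.scheduleAtFixedRate(
        () -> System.out.printf("live threads=%d, peak=%d, total started=%d%n",
            threads.getThreadCount(),
            threads.getPeakThreadCount(),
            threads.getTotalStartedThreadCount()),
        0, 30, TimeUnit.SECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}

If total started keeps climbing while the live count stays flat, threads are being created and discarded; if both climb together, threads are being created and never released, which matches what I see.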

Sample Code

@VisibleForTesting
  @SqsListener(value = "${app.cloud.aws.sqs.queueName}", deletionPolicy = SqsMessageDeletionPolicy.NEVER)
  void listenToRequestResponse(@Header(name = "MessageId") String messageId,
                               RequestResponse requestResponse,
                               Acknowledgment acknowledgment) {
    // Listens for a message from the queue and processes the payload
    try {
      // Process message .....
      acknowledgment.acknowledge().get();
      log.info("End messageId: {}", messageId);
    } catch (Exception e) {
      log.error("Failed to process messageId: {}", messageId, e);
    }
  }

I also have this converter registered as a bean to handle the payload, and a bean to configure the proxy.

@Configuration
public class AWSSQSConfiguration {

  //Handles the SQS datetime conversion
  @Bean
  public ObjectMapper objectMapper() {
    return JsonMapper.builder()
      .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES,
        false)
      .addModule(new JavaTimeModule())
      .build();
  }

  //Handles SNS Message in SQS
  @Bean
  public QueueMessageHandlerFactory queueMessageHandlerFactory(Optional<AmazonSQSAsync> amazonSQSAsync, BeanFactory beanFactory) {
    if (amazonSQSAsync.isPresent()) {
      ObjectMapper objectMapper = new ObjectMapper();
      MappingJackson2MessageConverter jacksonMessageConverter = new MappingJackson2MessageConverter();
      jacksonMessageConverter.setSerializedPayloadClass(String.class);
      jacksonMessageConverter.setObjectMapper(objectMapper);
      jacksonMessageConverter.setStrictContentTypeMatch(false);

      List<MessageConverter> payloadArgumentConverters = new ArrayList<>();
      payloadArgumentConverters.add(jacksonMessageConverter);

      // Converter that is invoked on SNS messages on SQS listener
      NotificationRequestConverter notificationRequestConverter = new NotificationRequestConverter(jacksonMessageConverter);

      payloadArgumentConverters.add(notificationRequestConverter);
      payloadArgumentConverters.add(new SimpleMessageConverter());
      CompositeMessageConverter compositeMessageConverter = new CompositeMessageConverter(payloadArgumentConverters);

      Assert.notNull(amazonSQSAsync.get(), "AmazonSQSAsync cannot be null");
      Assert.notNull(beanFactory, "BeanFactory cannot be null");
      QueueMessageHandlerFactory factory = new QueueMessageHandlerFactory();
      factory.setAmazonSqs(amazonSQSAsync.get());
      factory.setBeanFactory(beanFactory);
      factory.setArgumentResolvers(Collections.singletonList(new NotificationMessageArgumentResolver(compositeMessageConverter)));
      return factory;
    } else {
      return null;
    }
  }

  //Proxy settings; protocol, host and port are supplied from application configuration (not shown)
  @Bean(name = GLOBAL_CLIENT_CONFIGURATION_BEAN_NAME)
  public ClientConfiguration clientConfiguration() {
    ClientConfiguration clientConfiguration = new ClientConfiguration();
    clientConfiguration.setProxyProtocol(protocol);
    clientConfiguration.setProxyHost(host);
    clientConfiguration.setProxyPort(port);
    return clientConfiguration;
  }
}
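Since the error is about native thread creation, I also wondered whether the async SQS client's own executor contributes. A sketch of building the client with an explicitly bounded executor, assuming the AWS SDK v1 builder API (withExecutorFactory); the SqsClientConfiguration class name and the pool size of 10 are just illustrative:

import java.util.concurrent.Executors;

import com.amazonaws.ClientConfiguration;
import com.amazonaws.services.sqs.AmazonSQSAsync;
import com.amazonaws.services.sqs.AmazonSQSAsyncClientBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SqsClientConfiguration {

  // Builds the async client with a fixed-size executor so async callbacks
  // cannot create an unbounded number of threads.
  @Bean
  public AmazonSQSAsync amazonSQSAsync(ClientConfiguration clientConfiguration) {
    return AmazonSQSAsyncClientBuilder.standard()
        .withClientConfiguration(clientConfiguration)
        .withExecutorFactory(() -> Executors.newFixedThreadPool(10)) // arbitrary pool size
        .build();
  }
}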

Please advise.

shresthaujjwal commented 2 years ago

Another update: I limited MaxNumberOfMessages to 5, but even that didn't help. It ran a little longer, but it still ran out of memory in the end.

@Bean
  public SimpleMessageListenerContainerFactory simpleMessageListenerContainerFactory(Optional<AmazonSQSAsync> amazonSQSAsync) {
    if (amazonSQSAsync.isPresent()) {
      SimpleMessageListenerContainerFactory factory = new SimpleMessageListenerContainerFactory();
      factory.setAmazonSqs(amazonSQSAsync.get());
      factory.setAutoStartup(true);
      factory.setMaxNumberOfMessages(5);
      factory.setWaitTimeOut(10);
      factory.setBackOffTime(Long.valueOf(60000));
      return factory;
    } else {
      return null;
    }
  }
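The next thing I plan to try is giving the container an explicitly bounded task executor so it cannot spawn threads without limit. A sketch on top of the factory above, assuming SimpleMessageListenerContainerFactory.setTaskExecutor is available in 2.4.x; the class name and pool sizes are just illustrative:

import com.amazonaws.services.sqs.AmazonSQSAsync;
import io.awspring.cloud.messaging.config.SimpleMessageListenerContainerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class BoundedSqsListenerConfiguration {

  // Same factory as above, but with an explicitly bounded task executor
  // so the listener container cannot create threads without limit.
  @Bean
  public SimpleMessageListenerContainerFactory simpleMessageListenerContainerFactory(AmazonSQSAsync amazonSQSAsync) {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setThreadNamePrefix("sqs-listener-");
    executor.setCorePoolSize(4);   // arbitrary sizes, for illustration only
    executor.setMaxPoolSize(8);
    executor.setQueueCapacity(10);
    executor.initialize();         // required: the executor is not a managed bean here

    SimpleMessageListenerContainerFactory factory = new SimpleMessageListenerContainerFactory();
    factory.setAmazonSqs(amazonSQSAsync);
    factory.setMaxNumberOfMessages(5);
    factory.setWaitTimeOut(10);
    factory.setBackOffTime(60000L);
    factory.setTaskExecutor(executor); // container worker threads come from this bounded pool
    return factory;
  }
}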
shresthaujjwal commented 2 years ago

Closing this issue; it turns out my application had a memory leak. Apologies for the inconvenience.

maciejwalkowiak commented 2 years ago

Thanks for the update @shresthaujjwal