Metadata
Version: 2.3.x
Platforms: RHEL5, Amazon Linux
Issue:
When forking on Linux-based systems, race conditions can arise between a parent process, its child processes, and long-lived connections in Seahorse::Client::NetHttp::ConnectionPool. If a connection exists in the pool before the parent forks, the child inherits the same file descriptor and may send a request on that connection while another process is waiting for a response on it.
SQS Example:
1. Parent process long polls Aws::SQS::Client#receive_messages and receives a batch of messages
2. Parent forks a child process for each message
3a. Parent process long polls Aws::SQS::Client#receive_messages
3b. Child process finishes processing and calls Aws::SQS::Message.delete
It is possible for 3a and 3b to use the same open file descriptor held in the Seahorse::Client::NetHttp::ConnectionPool, which can cause an apparently unrelated error to be raised.
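The descriptor sharing behind 3a and 3b can be reproduced without the SDK. The sketch below is only an illustration: a UNIXSocket pair stands in for a pooled keep-alive connection, and the method name shared_descriptor_demo and the request strings are hypothetical. It shows that writes from the parent and the child arrive on one and the same connection.

```ruby
require 'socket'

# Hypothetical demo: a socket pair stands in for a pooled HTTP connection.
def shared_descriptor_demo
  # Opened BEFORE the fork, so the child inherits the same descriptor.
  client_end, server_end = UNIXSocket.pair

  pid = fork do
    # Child: writes on the inherited descriptor (think: the delete call in 3b).
    client_end.write("child-request\n")
    exit!(0)
  end

  # Parent: writes on the very same descriptor (think: the long poll in 3a).
  client_end.write("parent-request\n")
  Process.wait(pid)

  # The "server" side sees both requests on a single connection. Arrival
  # order depends on scheduling, so the result is sorted for stability.
  lines = [server_end.gets, server_end.gets].map(&:chomp)
  client_end.close
  server_end.close
  lines.sort
end
```

With a real HTTP connection the responses come back on that shared descriptor as well, so either process may read bytes that were meant for the other.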
Can we move this issue to the aws-sdk-ruby repo? This repo is deprecated. When we move it there, can you let me know which version of the AWS SDK you're using? Thanks!
Creating new instances of a client in the child process will not clear the ConnectionPool, since all of the pools are stored in a class variable.
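To illustrate why a fresh client does not help, here is a minimal sketch of a class-level pool registry. DemoPool, DemoClient, pool_for, and the endpoint URL are hypothetical stand-ins, not the SDK's actual classes; the sketch only assumes that pools are keyed by endpoint and stored on the class, as described above.

```ruby
# Hypothetical stand-in for a connection pool with class-level storage.
class DemoPool
  @pools = {} # lives on the class, so every client in the process shares it

  class << self
    # Returns the one pool for this endpoint, creating it on first use.
    def pool_for(endpoint)
      @pools[endpoint] ||= new
    end

    def pools
      @pools.values
    end
  end

  attr_reader :sessions

  def initialize
    @sessions = []
  end
end

# Hypothetical client: each instance looks its pool up in the registry.
class DemoClient
  attr_reader :pool

  def initialize(endpoint)
    @pool = DemoPool.pool_for(endpoint)
  end
end

first = DemoClient.new("https://sqs.example.invalid")
first.pool.sessions << :open_connection

second = DemoClient.new("https://sqs.example.invalid")
puts second.pool.equal?(first.pool) # true: same pool object
puts second.pool.sessions.inspect   # [:open_connection]
```

Because fork copies the whole process image, the child starts with this registry already populated, whichever client object it goes on to create.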
I created an example script that showcases this issue. It is based on a similar application that my team uses in production to process async jobs.
I was able to fix this by adding the following code after any fork:
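A minimal sketch of what such a post-fork reset could look like (not necessarily the exact code referred to above). It assumes the class-level pools reader and the instance method empty! that aws-sdk-core 2.x defines on Seahorse::Client::NetHttp::ConnectionPool. The helper names empty_sdk_connection_pools! and fork_with_fresh_pools are hypothetical.

```ruby
# Empties every SDK connection pool in the current process.
# Returns the number of pools emptied (0 when the SDK is not loaded).
# Assumption: ConnectionPool.pools and ConnectionPool#empty! exist as in
# aws-sdk-core 2.x; these are Seahorse internals and may change.
def empty_sdk_connection_pools!
  return 0 unless defined?(Seahorse::Client::NetHttp::ConnectionPool)

  pools = Seahorse::Client::NetHttp::ConnectionPool.pools
  pools.each(&:empty!)
  pools.size
end

# Hypothetical helper: forks, then drops inherited connections in the child
# before running the caller's block. Returns the child's pid.
def fork_with_fresh_pools
  fork do
    empty_sdk_connection_pools!
    yield
  end
end
```

A worker loop would then call fork_with_fresh_pools { process(message) } in place of a bare fork, where process is whatever the job does, so the child opens its own connections on first use.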
As far as I can tell from the public documentation, consumers are discouraged from relying on implementation details inside of Seahorse.
Feature Request: Add a public API for clearing connection pools (Aws.empty_connection_pools!?)
Documentation Request: Add documentation around fork-ing best practices with respect to the AWS SDK