Clearing ConnectionPool after fork

Metadata:

Version: 2.3.x
Platforms: RHEL5, Amazon Linux

Issue: When forking on Linux-based systems, race conditions can arise between a parent process, its child processes, and long-lived connections in Seahorse::Client::NetHttp::ConnectionPool. If a connection exists in the pool before the parent forks, the child inherits the same file descriptor and may send a request on that connection while another process is waiting for a response on it.
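The descriptor sharing described above can be reproduced with the standard library alone. This is a minimal sketch, not SDK code: a `UNIXSocket.pair` stands in for a pooled keep-alive connection, and the names `client` and `server` are illustrative assumptions.

```ruby
require "socket"

# Stand-in for a pooled keep-alive connection: `client` plays the socket held
# in the pool, `server` plays the remote end. Neither is an SDK object.
client, server = UNIXSocket.pair

pid = fork do
  # The child inherited `client`; this write goes out on the same underlying
  # connection that the parent still holds open.
  client.write("from child\n")
  exit!(0)
end

# The parent keeps using the connection as if nothing had happened.
client.write("from parent\n")
Process.wait(pid)

# Both writes arrive on the single server-side socket, in no guaranteed order.
received = [server.gets, server.gets]
puts received
```

With a real HTTP connection the two processes would interleave requests and responses on one socket, which is the race described in this issue.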

SQS Example:

1. Parent process long polls Aws::SQS::Client#receive_message and receives a batch of messages
2. Parent forks a child process for each message
3a. Parent process long polls Aws::SQS::Client#receive_message
3b. Child process finishes processing and calls Aws::SQS::Message#delete

It is possible for 3a and 3b to use the same open file descriptor held in the Seahorse::Client::NetHttp::ConnectionPool, which causes seemingly unrelated errors to be raised.

Creating a new client instance in the child process does not clear the ConnectionPool, because all of the pools are stored in a class variable.
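To illustrate why a fresh client does not help, here is a simplified sketch of class-level pool storage. `FakePool` and `FakeClient` are hypothetical stand-ins written for this example; they are not the SDK's classes and do not mirror its internals beyond the fact that pools live on the class.

```ruby
# Hypothetical stand-in for a connection pool class (NOT the SDK's code).
class FakePool
  @pools = {} # class-level state, shared by every client in the process

  def self.for(endpoint)
    @pools[endpoint] ||= new
  end

  def self.pools
    @pools.values
  end
end

# Hypothetical client that looks up its pool on construction.
class FakeClient
  attr_reader :pool

  def initialize(endpoint)
    @pool = FakePool.for(endpoint)
  end
end

client_a = FakeClient.new("sqs.example")
client_b = FakeClient.new("sqs.example")
same_pool = client_a.pool.equal?(client_b.pool)

pid = fork do
  # A brand-new client in the child still resolves to the pool object that
  # was created in the parent before the fork.
  reused = FakeClient.new("sqs.example").pool.equal?(FakePool.pools.first)
  exit!(reused ? 0 : 1)
end
Process.wait(pid)
child_reused_pool = $?.success?

puts same_pool         # true
puts child_reused_pool # true
```

Because the registry is copied into the child along with the rest of the process memory, only explicitly emptying the pools removes the inherited connections.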

I created an example script that showcases this issue. It is based on a similar application that my team uses in production to process async jobs.

I was able to fix this by adding the following code after any fork:

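# Empty every connection pool right after forking, so this process does not
# reuse sockets that were opened before the fork.
Seahorse::Client::NetHttp::ConnectionPool.pools.each do |pool|
  pool.empty!
end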
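One way to avoid repeating that snippet at every call site is to wrap `fork` in a small helper. The sketch below is an assumption about how this could be packaged: `fork_with_empty_pools` is a hypothetical name, not an SDK method, and the `defined?` guard exists only so the example loads when the SDK has not been required.

```ruby
# Hypothetical helper (not part of the SDK): fork, empty any inherited
# connection pools in the child, then run the caller's block there.
def fork_with_empty_pools
  fork do
    # Guarded so this sketch also runs when the SDK is not loaded.
    if defined?(Seahorse::Client::NetHttp::ConnectionPool)
      Seahorse::Client::NetHttp::ConnectionPool.pools.each(&:empty!)
    end
    yield
  end
end

# Usage: the block runs in the child after the pools have been emptied.
pid = fork_with_empty_pools do
  # ... process one message here ...
  exit!(0)
end
Process.wait(pid)
```

This still depends on Seahorse internals, so it carries the same caveat about relying on implementation details.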

As far as I can tell from the public documentation, consumers are discouraged from relying on implementation details inside Seahorse.

Feature Request: Add a public API for clearing connection pools (Aws.empty_connection_pools!?)

Documentation Request: Add documentation on forking best practices with respect to the AWS SDK
