aws / aws-sdk-cpp

AWS SDK for C++
Apache License 2.0

fix potential deadlock on shutdown #2879

Closed sbiscigl closed 8 months ago

sbiscigl commented 8 months ago

Description of changes:

A customer was seeing an issue where ExclusiveOwnershipResourceManager would deadlock on shutdown, specifically in ShutdownAndWait, which currently looks like:

Aws::Vector<RESOURCE_TYPE> ShutdownAndWait(size_t resourceCount) {
  std::unique_lock<std::mutex> locker(m_queueLock);
  m_shutdown = true;
  // ... drain all resources
}

An issue could arise in Acquire:

RESOURCE_TYPE Acquire() {
  std::unique_lock<std::mutex> locker(m_queueLock);
  while(!m_shutdown.load() && m_resources.size() == 0) {
    m_semaphore.wait(locker, [&](){ return m_shutdown.load() || m_resources.size() > 0; });
  }
  ...
}

where m_resources never dips to zero, so the queue lock is never released and ShutdownAndWait is never allowed to run. By swapping ShutdownAndWait to be

Aws::Vector<RESOURCE_TYPE> ShutdownAndWait(size_t resourceCount) {
  m_shutdown = true;
  std::unique_lock<std::mutex> locker(m_queueLock);
  // ... drain all resources
}

we short-circuit that while loop, as intended. m_shutdown is also atomic, so it can be set outside of the lock.
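
The reordering described above can be illustrated with a minimal sketch, assuming a simplified pool of int resources. SketchPool, WaiterWokenByShutdown, and the bool-returning Acquire are hypothetical names made up for this example; they are not the SDK's ExclusiveOwnershipResourceManager. The notify_all issued after taking the lock is this sketch's own choice to avoid a missed wakeup and is not something the PR states.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified stand-in for a resource manager (not the SDK class).
class SketchPool {
public:
    // Returns a resource to the pool and wakes any waiters.
    void Release(int resource) {
        {
            std::lock_guard<std::mutex> locker(m_queueLock);
            m_resources.push_back(resource);
        }
        m_semaphore.notify_all();
    }

    // Blocks until a resource is available or shutdown has been requested.
    // Returns false when shutdown was requested; `out` is then left untouched.
    bool Acquire(int& out) {
        std::unique_lock<std::mutex> locker(m_queueLock);
        m_semaphore.wait(locker, [&]() { return m_shutdown.load() || !m_resources.empty(); });
        if (m_shutdown.load()) {
            return false;
        }
        out = m_resources.back();
        m_resources.pop_back();
        return true;
    }

    // Sets the atomic flag before taking the lock, as in the change above,
    // then waits until resourceCount resources are back and drains them.
    std::vector<int> ShutdownAndWait(std::size_t resourceCount) {
        m_shutdown = true;  // atomic, so it does not need the lock
        std::unique_lock<std::mutex> locker(m_queueLock);
        // Assumption of this sketch: notify while holding the lock, so a waiter
        // that checked the flag just before it flipped is already blocked in wait().
        m_semaphore.notify_all();
        m_semaphore.wait(locker, [&]() { return m_resources.size() >= resourceCount; });
        std::vector<int> drained;
        drained.swap(m_resources);
        return drained;
    }

private:
    std::mutex m_queueLock;
    std::condition_variable m_semaphore;
    std::atomic<bool> m_shutdown{false};
    std::vector<int> m_resources;
};

// Blocks a thread in Acquire on an empty pool, then shuts the pool down.
// Returns true if Acquire came back empty-handed and nothing was drained.
bool WaiterWokenByShutdown() {
    SketchPool pool;
    bool acquired = true;
    std::thread waiter([&]() {
        int resource = 0;
        acquired = pool.Acquire(resource);
    });
    std::vector<int> drained = pool.ShutdownAndWait(0);
    waiter.join();
    return !acquired && drained.empty();
}
```

Calling WaiterWokenByShutdown() from a test or from main exercises both orderings: whether the waiter reaches Acquire before or after the flag is set, it returns without a resource instead of blocking forever.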

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.