Closed hunjmes closed 1 month ago
Also hangs in release 1.11.328.
It looks like the function CrtClientShutdownCallback() is not being called to Release() the S3CrtClient's m_clientShutdownSem semaphore field.
The problem seems to be that if the caller creates and destroys an S3CrtClient, using the RAII pattern, then the constructor creates a semaphore field:
m_clientShutdownSem = Aws::MakeShared<Threading::Semaphore>(ALLOCATION_TAG, 0, 1);
-- and passes that semaphore to the wrapped Crt Client:
m_wrappedData.clientShutdownSem = m_clientShutdownSem;
s3CrtConfig.shutdown_callback = CrtClientShutdownCallback;
s3CrtConfig.shutdown_callback_user_data = static_cast<void*>(&m_wrappedData);
m_s3CrtClient = aws_s3_client_new(Aws::get_aws_allocator(), &s3CrtConfig);
Then, inside the destructor:
aws_s3_client_release(m_s3CrtClient);
if(m_clientShutdownSem)
{
    m_clientShutdownSem->WaitOne(); // Wait for the aws_s3_client shutdown callback
}
However, the call to aws_s3_client_release(...) doesn't seem to be releasing the Crt client.
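To make the suspected failure mode concrete, here is a minimal single-threaded model of the handshake described above. It is only a sketch: ModelSemaphore, FakeCrtClient, FakeRelease, OnShutdown and DestructorWouldReturn are hypothetical stand-ins, not SDK or CRT types, and the real callback is delivered asynchronously rather than inline. TryWait() reports whether a blocking WaitOne() would have returned.

```cpp
// Hypothetical stand-in for the SDK's semaphore (illustrative only).
// TryWait() answers: "would a blocking WaitOne() return right now?"
class ModelSemaphore {
public:
    ModelSemaphore(int initial, int max) : count_(initial), max_(max) {}
    void Release() { if (count_ < max_) ++count_; }
    bool TryWait() {
        if (count_ == 0) return false;  // the real WaitOne() would block here
        --count_;
        return true;
    }
private:
    int count_;
    int max_;
};

// Hypothetical stand-in for the wrapped CRT client: on release it either
// invokes the registered shutdown callback or (the suspected bug) never does.
struct FakeCrtClient {
    void (*shutdown_callback)(void*);
    void* shutdown_callback_user_data;
    bool callback_fires;
};

// Models aws_s3_client_release(); the real callback arrives asynchronously.
void FakeRelease(FakeCrtClient& client) {
    if (client.callback_fires) {
        client.shutdown_callback(client.shutdown_callback_user_data);
    }
}

// Plays the role of CrtClientShutdownCallback(): release the semaphore.
void OnShutdown(void* user_data) {
    static_cast<ModelSemaphore*>(user_data)->Release();
}

// Mirrors the destructor sequence: release the client, then wait on the
// semaphore. Returns true if the wait would complete, false if it would hang.
bool DestructorWouldReturn(bool callback_fires) {
    ModelSemaphore sem(0, 1);  // same initial/max counts as in the constructor
    FakeCrtClient client{OnShutdown, &sem, callback_fires};
    FakeRelease(client);
    return sem.TryWait();
}
```

In this model, DestructorWouldReturn(false) corresponds to the observed hang: nothing ever releases the semaphore, so an untimed wait never returns.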
I wonder if there's an implicit assumption, at the Crt layer, that only one S3 (crt) client is ever alive at a time? What's supposed to happen if the caller does:
{
ClientConfiguration config;
S3CrtClient client1(config);
S3CrtClient client2(config);
...
}
?
My hanging/failing test is a Google-test "crash test," and the hanging process is the process fork()ed by the Google-test framework. My other unit tests, which don't use the "crash test" framework, do not appear to hang.
I wonder if some "Crt" state that used to be tied to a given S3CrtClient object has now been made static?
A quick note: forking can currently lead to somewhat unexpected results when used in conjunction with CRT. CRT relies on some global state that can end up being mangled during fork. This is on the team's backlog to improve, but we don't have a concrete ETA for that. As a workaround, you could try to avoid forking while the CPP SDK is in an initialized state, i.e. by explicitly calling ShutdownAPI before forking the process.
Confirmed that the proposed workaround, to call ShutdownAPI before forking, works. Thanks!
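For anyone landing here later, the workaround can be sketched roughly as below. ForkWithSdkShutDown is a hypothetical helper, not part of the SDK; it takes the shutdown and init steps as callables so the ordering is explicit. It assumes a POSIX system, that every S3CrtClient has already been destroyed before the shutdown step runs, and that the SDK tolerates being initialized again after a shutdown (the thread does not confirm that last point).

```cpp
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks only while the SDK is shut down, so no SDK/CRT
// global state is live across fork(). Returns the child's exit code, or -1 if
// fork/waitpid failed or the child did not exit normally.
int ForkWithSdkShutDown(const std::function<void()>& shutdownSdk,
                        const std::function<void()>& initSdk,
                        const std::function<int()>& childWork) {
    shutdownSdk();                 // e.g. Aws::ShutdownAPI(options)

    pid_t pid = fork();
    if (pid < 0) {
        initSdk();                 // fork failed; restore the parent's SDK state
        return -1;
    }
    if (pid == 0) {
        initSdk();                 // child initializes its own fresh SDK state
        int rc = childWork();
        shutdownSdk();
        _exit(rc);                 // skip atexit handlers inherited from the parent
    }

    initSdk();                     // parent resumes SDK use, e.g. Aws::InitAPI(options)
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

With the real SDK, the callables would presumably wrap Aws::ShutdownAPI(options) and Aws::InitAPI(options) on a shared Aws::SDKOptions object. Test frameworks that fork internally do not expose a hook like this, so there the shutdown has to happen before the framework forks.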
@hunjmes Did you have any other questions about this sdk?
Describe the bug
Symptoms are superficially similar to https://github.com/aws/aws-sdk-cpp/issues/2769 , except that issue involved calling Aws::ShutdownAPI() while an S3CrtClient was still alive. This bug involves the S3CrtClient::~S3CrtClient() destructor itself. We see a hang/deadlock inside:
Expected Behavior
The S3CrtClient's destructor should not hang. (This was the behavior as of release 1.11.175.)
Current Behavior
The S3CrtClient's destructor hangs as of release 1.11.195.
Reproduction Steps
We don't have a formal repro yet. I am trying to nail down the root cause of the hang -- it might be unexpected/incorrect usage of the AWS API, by our client application -- concurrently with filing this bug.
Possible Solution
No response
Additional Information/Context
No response
AWS CPP SDK version used
1.11.195
Compiler and Version used
gcc 11
Operating System and version
Linux 5.10.216-182.855.amzn2int.x86_64