aws / aws-sdk-cpp

AWS SDK for C++
Apache License 2.0

TSAN thread leak in aws_thread_launch #3130

Closed jweinst1 closed 1 month ago

jweinst1 commented 1 month ago

Describe the bug

Our sanitizers are picking up a leaked thread when simply initializing the SDK / API:

747: ==================
747: WARNING: ThreadSanitizer: thread leak (pid=52100)
747:   Thread T29 'EvntLoopCleanup' (tid=52340, finished) created by main thread at:
747:     #0 pthread_create <null> (framework-tester+0x1df764b) (BuildId: 0111f2a0878289b0)
747:     #1 aws_thread_launch /builds/splcore/main/contrib/aws-sdk-cpp-1.11.266/crt/aws-crt-cpp/crt/aws-c-common/source/posix/thread.c:333:19 (framework-tester+0x3e1bc53) (BuildId: 0111f2a0878289b0)
747:     #2 __static_initialization_and_destruction_0 /builds/splcore/main/contrib/aws-sdk-cpp-1.11.266/src/aws-cpp-sdk-core/source/Globals.cpp:19:59 (framework-tester+0x3cb332d) (BuildId: 0111f2a0878289b0)
747:     #3 _GLOBAL__sub_I_Globals.cpp /builds/splcore/main/contrib/aws-sdk-cpp-1.11.266/src/aws-cpp-sdk-core/source/Globals.cpp:77:1 (framework-tester+0x3cb332d)
747:     #4 main /builds/splcore/main/src/framework/tests/FrameworkTester.cpp:88:5 (framework-tester+0x257197b) (BuildId: 0111f2a0878289b0)
747: 
747: SUMMARY: ThreadSanitizer: thread leak (/builds/home/splunk/bin/framework-tester+0x1df764b) (BuildId: 0111f2a0878289b0) in pthread_create
747: ==================
747: ThreadSanitizer: reported 1 warnings

Expected Behavior

No thread leak should be reported.

Current Behavior

TSAN reports a thread leak.

Reproduction Steps

Build the SDK, statically link it to another binary, build that binary with tsan option, and initialize the SDK

Possible Solution

No response

Additional Information/Context

No response

AWS CPP SDK version used

1.11.266

Compiler and Version used

clang 13

Operating System and version

Linux x86_64

sbiscigl commented 1 month ago

Hey, thanks for reaching out. We took a look into it.

"Reproduction Steps: Build the SDK, statically link it to another binary, build that binary with tsan option, and initialize the SDK"

Without a minimal reproducible example there's not a lot we can do, as there are too many variables in your environment and usage. For instance, we do not know how you are creating and using the SDK in /builds/splcore/main/src/framework/tests/FrameworkTester.cpp:88:5. I tried replicating your example and could not.

Project Structure:

~/sdk-usage-workspace ❯❯❯ tree -L 1
.
├── CMakeLists.txt
├── Dockerfile
├── main.cpp
└── replicate.sh

Dockerfile

# Using official Amazon Linux 2023 image from public ECR
FROM public.ecr.aws/amazonlinux/amazonlinux:2023

# Install compiler et al.
RUN yum groupinstall "Development Tools" -y

# Install required dependencies
RUN yum install -y curl-devel openssl-devel ninja-build cmake3 libtsan

# Install sdk
RUN git clone --depth 1 --recurse-submodules https://github.com/aws/aws-sdk-cpp && \
    cd aws-sdk-cpp && \
    mkdir build && \
    cd build && \
    cmake -G Ninja -DBUILD_ONLY="core" \
      -DCMAKE_CXX_FLAGS="-fsanitize=thread" \
      -DBUILD_SHARED_LIBS=OFF \
      -DENABLE_ZLIB_REQUEST_COMPRESSION=OFF \
      -DAUTORUN_UNIT_TESTS=OFF .. && \
    cmake --build . && \
    cmake --install .

# Copy over and build
RUN mkdir sdk-example
COPY CMakeLists.txt /sdk-example/CMakeLists.txt
COPY main.cpp /sdk-example/main.cpp
RUN cd sdk-example &&\
    mkdir build &&\
    cd build &&\
    cmake -G Ninja -DBUILD_SHARED_LIBS=OFF \
      -DCMAKE_CXX_FLAGS="-fsanitize=thread" .. && \
    cmake --build .

CMakeLists.txt

cmake_minimum_required(VERSION 3.13)
project(sdk_usage_workspace)
set(CMAKE_CXX_STANDARD 20)
find_package(AWSSDK REQUIRED COMPONENTS core)
add_executable(${PROJECT_NAME} "main.cpp")
target_link_libraries(${PROJECT_NAME} PRIVATE ${AWSSDK_LINK_LIBRARIES})

main.cpp

#include <aws/core/Aws.h>

using namespace Aws;

auto main() -> int
{
    SDKOptions options;
    Aws::InitAPI(options);
    Aws::ShutdownAPI(options);
}

replicate.sh

#!/bin/zsh

set -u

# build image
docker build -t test-image .

# run example
docker run \
    -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
    -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
    -e AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN} \
    --name test-image test-image /sdk-example/build/sdk_usage_workspace

When we run ./replicate.sh, we see the sample application exit successfully:

~/sdk-usage-workspace ❯❯❯ docker ps -a
CONTAINER ID   IMAGE        COMMAND                  CREATED              STATUS                      PORTS     NAMES
c799a4a427de   test-image   "/sdk-example/build/…"   About a minute ago   Exited (0) 59 seconds ago             test-image

So it looks to be working as intended in a standalone environment. If you can update this example to replicate your issue we'd be happy to take a look at it.

One possible issue with your code (not sure if you're doing this): you should not wrap the SDK init/shutdown calls in any static objects. From our basic usage page:

The SDK for C++ and its dependencies use C++ static objects, and the order of static object destruction is not determined by the C++ standard. To avoid memory issues caused by the nondeterministic order of static variable destruction, do not wrap the calls to Aws::InitAPI and Aws::ShutdownAPI into another static object.
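
To make the quoted guidance concrete, here is a minimal sketch of the pattern it points toward: keep init and shutdown tied to a local scope inside main rather than to a static object. The `fake_sdk` namespace, `SdkScope`, and `RunApplication` are hypothetical names invented for this illustration so it compiles without the SDK; in real code the two calls would be Aws::InitAPI(options) and Aws::ShutdownAPI(options), as in the main.cpp above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for Aws::InitAPI / Aws::ShutdownAPI so this sketch
// builds without the SDK. They only record the order in which they are called.
namespace fake_sdk {
inline std::vector<std::string>& Events() {
    static std::vector<std::string> events;
    return events;
}
inline void InitAPI() { Events().push_back("init"); }
inline void ShutdownAPI() { Events().push_back("shutdown"); }
}  // namespace fake_sdk

// Hypothetical RAII guard (not part of the SDK). It is meant to be used only
// as a local variable; declaring it `static` or at namespace scope would
// recreate the problem the usage page warns about.
class SdkScope {
public:
    SdkScope() { fake_sdk::InitAPI(); }
    ~SdkScope() { fake_sdk::ShutdownAPI(); }
    SdkScope(const SdkScope&) = delete;
    SdkScope& operator=(const SdkScope&) = delete;
};

// Models the body of main(): everything that uses the SDK lives inside the
// braces, so shutdown runs before main returns and before static destructors.
std::vector<std::string> RunApplication() {
    fake_sdk::Events().clear();
    {
        SdkScope sdk;                           // local object, not static
        fake_sdk::Events().push_back("work");   // SDK clients would live here
    }                                           // shutdown happens here
    return fake_sdk::Events();
}
```

Calling RunApplication() from main records init, then the work, then shutdown, in that order. This only illustrates the ordering concern; it does not claim to explain the specific TSAN report in this issue.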

github-actions[bot] commented 1 month ago

Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.