aws / aws-sdk-cpp

AWS SDK for C++
Apache License 2.0

[aws-cpp-sdk-core] shut down CRT logging after shutting down CRT #1995

Closed grrtrr closed 1 year ago

grrtrr commented 2 years ago

Describe the bug

The ShutdownAPI call in source/Aws.cpp first calls Aws::Utils::Logging::ShutdownCRTLogging() (if logging was enabled), and then Aws::CleanupCrt().

This causes issues on shutdown of the CRT subsystem - threads that are in the process of being shut down may still emit logging messages to the (already shut down) CRTLogSystemInterface, leading to undefined behaviour and a race condition (please see below).

Expected Behavior

ShutdownAPI cleanly shuts down all involved components.

Current Behavior

When running tests under clang's ThreadSanitizer (TSAN), one can observe the race conditions described above:

WARNING: ThreadSanitizer: data race (pid=9800)
  Write of size 8 at 0x7b1000000e20 by main thread:
    #0 free ??:? (evolved_log_reader_integration_test+0x5c518)
    #1 std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>::_M_destroy() ??:? (libcloud_Saws_Slibaws_Uinit.so+0x896e)
    #2 Aws::Utils::Logging::ShutdownCRTLogging() ??:? (libexternal_Scom_Ugithub_Uaws-sdk-cpp_Slibaws-core.so+0x1675b2)
    #3 Aws::ShutdownAPI(Aws::SDKOptions const&) ??:? (libexternal_Scom_Ugithub_Uaws-sdk-cpp_Slibaws-core.so+0x7dbe8)
    #4 av::cloud::aws::AWSInit::~AWSInit() ??:? (libcloud_Saws_Slibaws_Uinit.so+0x6f62)
    #5 cxa_at_exit_wrapper(void*) tsan_interceptors_posix.cpp:? (our_integration_test+0x9a46f)

  Previous read of size 8 at 0x7b1000000e20 by thread T2:
    #0 s_main_loop epoll_event_loop.c:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x1d84a)
    #1 thread_fn thread.c:? (libexternal_Saws-c-common_Slibaws-c-common.so+0x2b680)

  Thread T2 (tid=9812, running) created by main thread at:
    #0 pthread_create ??:? (our_integration_test+0x5d72b)
    #1 aws_thread_launch ??:? (libexternal_Saws-c-common_Slibaws-c-common.so+0x2b3f9)
    #2 s_run epoll_event_loop.c:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x1cf82)
    #3 s_event_loop_group_new event_loop.c:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x1406e)
    #4 aws_event_loop_group_new_default ??:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x143d7)
    #5 Aws::Crt::Io::EventLoopGroup::EventLoopGroup(unsigned short, aws_allocator*) ??:? (libexternal_Saws-crt_Slibaws-crt.so+0x4cf8b)
    #6 std::_Function_handler<std::shared_ptr<Aws::Crt::Io::ClientBootstrap> (), av::cloud::aws::AWSInit::AWSInit()::$_2>::_M_invoke(std::_Any_data const&) aws_init.cc:? (libcloud_Saws_Slibaws_Uinit.so+0x74d5)
    #7 Aws::InitAPI(Aws::SDKOptions const&) ??:? (libexternal_Scom_Ugithub_Uaws-sdk-cpp_Slibaws-core.so+0x7c4a8)
    #8 av::cloud::aws::AWSInit::AWSInit() ??:? (libcloud_Saws_Slibaws_Uinit.so+0x646e)

...
WARNING: ThreadSanitizer: data race (pid=9800)
  Write of size 8 at 0x7b1000000e30 by main thread:
    #0 free ??:? (our_integration_test+0x5c518)
    #1 std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>::_M_destroy() ??:? (libcloud_Saws_Slibaws_Uinit.so+0x896e)
    #2 Aws::Utils::Logging::ShutdownCRTLogging() ??:? (libexternal_Scom_Ugithub_Uaws-sdk-cpp_Slibaws-core.so+0x1675b2)
    #3 Aws::ShutdownAPI(Aws::SDKOptions const&) ??:? (libexternal_Scom_Ugithub_Uaws-sdk-cpp_Slibaws-core.so+0x7dbe8)
    #4 av::cloud::aws::AWSInit::~AWSInit() ??:? (libcloud_Saws_Slibaws_Uinit.so+0x6f62)
    #5 cxa_at_exit_wrapper(void*) tsan_interceptors_posix.cpp:? (evolved_log_reader_integration_test+0x9a46f)

  Previous read of size 8 at 0x7b1000000e30 by thread T9:
    #0 Aws::Utils::Logging::s_aws_logger_redirect_get_log_level(aws_logger*, unsigned int) CRTLogSystem.cpp:? (libexternal_Scom_Ugithub_Uaws-sdk-cpp_Slibaws-core.so+0x16726b)
    #1 s_main_loop epoll_event_loop.c:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x1d862)
    #2 thread_fn thread.c:? (libexternal_Saws-c-common_Slibaws-c-common.so+0x2b680)

  Thread T9 (tid=9819, running) created by main thread at:
    #0 pthread_create ??:? (evolved_log_reader_integration_test+0x5d72b)
    #1 aws_thread_launch ??:? (libexternal_Saws-c-common_Slibaws-c-common.so+0x2b3f9)
    #2 s_run epoll_event_loop.c:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x1cf82)
    #3 s_event_loop_group_new event_loop.c:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x1406e)
    #4 aws_event_loop_group_new_default ??:? (libexternal_Saws-c-io_Slibaws-c-io.so+0x143d7)
    #5 Aws::Crt::Io::EventLoopGroup::EventLoopGroup(unsigned short, aws_allocator*) ??:? (libexternal_Saws-crt_Slibaws-crt.so+0x4cf8b)
    #6 std::_Function_handler<std::shared_ptr<Aws::Crt::Io::ClientBootstrap> (), av::cloud::aws::AWSInit::AWSInit()::$_2>::_M_invoke(std::_Any_data const&) aws_init.cc:? (libcloud_Saws_Slibaws_Uinit.so+0x74d5)
    #7 Aws::InitAPI(Aws::SDKOptions const&) ??:? (libexternal_Scom_Ugithub_Uaws-sdk-cpp_Slibaws-core.so+0x7c4a8)

Reproduction Steps

Run programs or integration tests under the clang TSAN analyzer.

Possible Solution

Shut down the CRT subsystem first, before shutting down its logging system.
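
The ordering proposed above can be illustrated with a minimal, self-contained sketch. It does not use the SDK: `MockLogSystem`, `MockEventLoop`, and `ShutdownInSafeOrder` are hypothetical stand-ins invented for this example, and the worker holding a non-owning pointer to the logger is an assumption about how the event-loop threads reach the log system.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the CRT log system (not the SDK's real class).
class MockLogSystem {
public:
    void Log(const std::string& message) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_messages.push_back(message);
    }

    // Returns the most recent message, or an empty string if none was logged.
    std::string Last() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_messages.empty() ? std::string() : m_messages.back();
    }

private:
    mutable std::mutex m_mutex;
    std::vector<std::string> m_messages;
};

// Hypothetical stand-in for an event-loop thread that logs while running
// and once more while it is shutting down.
class MockEventLoop {
public:
    explicit MockEventLoop(MockLogSystem* logger)
        : m_logger(logger), m_running(true), m_thread([this] { Run(); }) {}

    ~MockEventLoop() { Stop(); }

    // Signals the worker to exit and waits until it has finished logging.
    void Stop() {
        m_running = false;
        if (m_thread.joinable()) {
            m_thread.join();
        }
    }

private:
    void Run() {
        while (m_running) {
            m_logger->Log("tick");
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        m_logger->Log("event loop exiting");  // emitted during shutdown
    }

    MockLogSystem* m_logger;      // non-owning: must outlive the thread
    std::atomic<bool> m_running;
    std::thread m_thread;         // declared last so it starts last
};

// Tears down the logging user first and the logger second, then returns
// the last message the logger received.
std::string ShutdownInSafeOrder() {
    std::shared_ptr<MockLogSystem> logger = std::make_shared<MockLogSystem>();
    std::unique_ptr<MockEventLoop> loop(new MockEventLoop(logger.get()));

    loop.reset();                      // 1. stop and join the worker thread
    std::string last = logger->Last();
    assert(logger.use_count() == 1);   // nothing else owns the logger now
    logger.reset();                    // 2. only now destroy the logger
    return last;
}
```

Calling `ShutdownInSafeOrder()` returns the worker's shutdown-time message, which shows that the logger was still alive when the thread exited. Swapping steps 1 and 2 would leave the worker with a dangling pointer, the same pattern TSAN flags in the traces above.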

AWS CPP SDK version used

1.9.x (1.9.170, but problem also on master).

Compiler and Version used

clang11

Operating System and version

Linux, ubuntu 18.04

jmklix commented 2 years ago

Thanks for finding this and taking the time to make a PR. I will try to get it reviewed and merged.

jmklix commented 2 years ago

This doesn't pass some of our tests required before merging. Does this affect your program in any way, or do you only see it when running the TSAN analyzer?

grrtrr commented 2 years ago

> This doesn't pass some of our tests required before merging. Does this affect your program in any way, or do you only see it when running the TSAN analyzer?

Do you mean #1996, and what kind of tests are failing?

The logging system needs to be shut down after the components that use it; otherwise segmentation faults will result, whether or not TSAN is in use. We have had these in production.
