aws / aws-sdk-cpp

AWS SDK for C++
Apache License 2.0

Aws::Http::CurlHttpClient::MakeRequest cause libcrypto crash #1534

Closed bzhou-sw closed 3 years ago

bzhou-sw commented 3 years ago


Platform/OS/Hardware/Device: VM with Linux Debian 4.19.0-8-amd64
SDK release: 1.8.63

Describe the question: I have an application that calls the S3 client's ListObjectsV2() periodically, and calls GetObject() if any objects fall within a certain time frame. With an interval of 1 minute, it works fine. With an interval of 2 minutes or longer, the first iteration succeeds for both the list and get calls, but the second iteration crashes. See the backtrace below.

We use the default AWS client configuration except for region, proxyHost, proxyPort, and the AWS credentials.

Do you have any idea why this happens? Any advice or suggestions would be highly appreciated.

Logs/output link information:

 ldd /config/flowlogs-ingestor
        linux-vdso.so.1 (0x00007ffd5d1d9000)
        libaws-cpp-sdk-s3.so => /usr/local/lib/libaws-cpp-sdk-s3.so (0x00007f3f7d5fa000)
        libaws-cpp-sdk-core.so => /usr/local/lib/libaws-cpp-sdk-core.so (0x00007f3f7d2e2000)
        libmemif.so => /usr/local/lib/libmemif.so (0x00007f3f7d0d6000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f3f7ceb9000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f3f7cb30000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f3f7c918000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3f7c6f9000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3f7c308000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3f7dc93000)
        libcurl.so.4 => /usr/lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f3f7c089000)
        libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f3f7bbbe000)
        libaws-c-event-stream.so.0unstable => not found
        libaws-c-common.so.0unstable => not found
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3f7b820000)
        libnghttp2.so.14 => /usr/lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f3f7b5fb000)
        libidn2.so.0 => /usr/lib/x86_64-linux-gnu/libidn2.so.0 (0x00007f3f7b3de000)
        librtmp.so.1 => /usr/lib/x86_64-linux-gnu/librtmp.so.1 (0x00007f3f7b1c2000)
        libpsl.so.5 => /usr/lib/x86_64-linux-gnu/libpsl.so.5 (0x00007f3f7afb4000)
        libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f3f7ad27000)
        libgssapi_krb5.so.2 => /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f3f7aadc000)
        libldap_r-2.4.so.2 => /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007f3f7a88a000)
        liblber-2.4.so.2 => /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007f3f7a67c000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3f7a478000)
        libunistring.so.2 => /usr/lib/x86_64-linux-gnu/libunistring.so.2 (0x00007f3f7a0fa000)
        libgnutls.so.30 => /usr/lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f3f79d95000)
        libhogweed.so.4 => /usr/lib/x86_64-linux-gnu/libhogweed.so.4 (0x00007f3f79b61000)
        libnettle.so.6 => /usr/lib/x86_64-linux-gnu/libnettle.so.6 (0x00007f3f7992b000)
        libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f3f796aa000)
        libkrb5.so.3 => /usr/lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f3f793d4000)
        libk5crypto.so.3 => /usr/lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f3f791a2000)
        libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f3f78f9e000)
        libkrb5support.so.0 => /usr/lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f3f78d93000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f3f78b78000)
        libsasl2.so.2 => /usr/lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f3f7895d000)
        libgssapi.so.3 => /usr/lib/x86_64-linux-gnu/libgssapi.so.3 (0x00007f3f7871c000)
        libp11-kit.so.0 => /usr/lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f3f783ed000)
        libtasn1.so.6 => /usr/lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007f3f781da000)
        libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f3f77fd6000)
        libheimntlm.so.0 => /usr/lib/x86_64-linux-gnu/libheimntlm.so.0 (0x00007f3f77dcd000)
        libkrb5.so.26 => /usr/lib/x86_64-linux-gnu/libkrb5.so.26 (0x00007f3f77b40000)
        libasn1.so.8 => /usr/lib/x86_64-linux-gnu/libasn1.so.8 (0x00007f3f7789e000)
        libhcrypto.so.4 => /usr/lib/x86_64-linux-gnu/libhcrypto.so.4 (0x00007f3f77668000)
        libroken.so.18 => /usr/lib/x86_64-linux-gnu/libroken.so.18 (0x00007f3f77452000)
        libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007f3f7724a000)
        libwind.so.0 => /usr/lib/x86_64-linux-gnu/libwind.so.0 (0x00007f3f77021000)
        libheimbase.so.1 => /usr/lib/x86_64-linux-gnu/libheimbase.so.1 (0x00007f3f76e12000)
        libhx509.so.5 => /usr/lib/x86_64-linux-gnu/libhx509.so.5 (0x00007f3f76bc8000)
        libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f3f768bf000)
        libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f3f76687000)

gdb crash backtrace:

0x00007ffff6a792c7 in __libc_write (fd=14, buf=0x5555558c0123, nbytes=31) at ../sysdeps/unix/sysv/linux/write.c:27
27      ../sysdeps/unix/sysv/linux/write.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6a792c7 in __libc_write (fd=14, buf=0x5555558c0123, nbytes=31) at ../sysdeps/unix/sysv/linux/write.c:27
#1  0x00007ffff5fdebe5 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#2  0x00007ffff5fd9f7a in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#3  0x00007ffff5fd8fd5 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#4  0x00007ffff5fd9473 in BIO_write () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#5  0x00007ffff4c7dcb7 in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#6  0x00007ffff4c7ebd5 in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#7  0x00007ffff4c886cc in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#8  0x00007ffff4c867a5 in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#9  0x00007ffff4c916bf in SSL_shutdown () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#10 0x00007ffff64543f5 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#11 0x00007ffff6454461 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#12 0x00007ffff64172d9 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#13 0x00007ffff641b4a8 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#14 0x00007ffff642d29b in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#15 0x00007ffff642e3c4 in curl_multi_perform () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#16 0x00007ffff6424234 in curl_easy_perform () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#17 0x00007ffff770d97d in Aws::Http::CurlHttpClient::MakeRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::Utils::RateLimits::RateLimiterInterface*, Aws::Utils::RateLimits::RateLimiterInterface*) const ()
   from /usr/local/lib/libaws-cpp-sdk-core.so
#18 0x00007ffff76ce1e2 in Aws::Client::AWSClient::AttemptOneRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const ()
   from /usr/local/lib/libaws-cpp-sdk-core.so
#19 0x00007ffff76e5efe in Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
   from /usr/local/lib/libaws-cpp-sdk-core.so
#20 0x00007ffff76e724d in Aws::Client::AWSXMLClient::MakeRequest(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
   from /usr/local/lib/libaws-cpp-sdk-core.so
#21 0x00007ffff7af04cc in Aws::S3::S3Client::ListObjectsV2(Aws::S3::Model::ListObjectsV2Request const&) const () from /usr/local/lib/libaws-cpp-sdk-s3.so
#22 0x000055555555e910 in ingestorS3GetCommonPrefix(std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::S3::S3Client const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> >, Aws::Allocator<std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > > >&, long) ()
#23 0x0000555555560bc4 in ingestorS3ListObjects(std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::S3::S3Client const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, long) [clone .constprop.240]
    ()
#24 0x0000555555562be0 in ingestor_main_loop() ()
#25 0x000055555555baa3 in main ()


options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Trace;
Aws::InitAPI(options);

bzhou-sw commented 3 years ago

If I set the S3 client's enableTcpKeepAlive to false, the crash no longer occurs, regardless of the iteration interval. So is this an issue inside the SDK?

KaibaLopez commented 3 years ago

Hi @bzhou-sw, that's odd, I would expect the opposite to happen with tcpkeepalive... Can you enable the logs for both true and false and post them here? (Remember to censor private info first, though.)

bzhou-sw commented 3 years ago

@KaibaLopez The SDK didn't log anything after the first iteration finished, even with the log level set to TRACE. It just crashed silently on the first request of the second iteration. Or maybe something was logged but not flushed to the file? Is there any way to make the SDK flush every message to the log file immediately (we are using the SDK's default logging system)? And what log level do you want?

bzhou-sw commented 3 years ago

The code logic is:

int main(int argc, char *argv[]) {
  Aws::SDKOptions options;
  Aws::Utils::Logging::InitializeAWSLogging(
      Aws::MakeShared<Aws::Utils::Logging::DefaultLogSystem>(
          "flowlogs-test", convert_to_aws_log_level(log_level),
          input_config.cfg_log_file_prefix));
  Aws::InitAPI(options);
  // We are not ec2 instance, so disable EC2MetadataClient
  setenv("AWS_EC2_METADATA_DISABLED", "true", 1);

  // Client config
  Aws::Client::ClientConfiguration config;

  if (input_config.cfg_disable_keepalive) {
    config.enableTcpKeepAlive = 0;
  }
  config.region = input_config.cfg_region;
  config.useDualStack = 1;
  config.proxyHost = input_config.cfg_proxy;
  config.proxyPort = atoi(input_config.cfg_proxy_port);
  Aws::Auth::AWSCredentials credentials;
  credentials.SetAWSAccessKeyId(input_config.cfg_accecc_key_id);
  credentials.SetAWSSecretKey(input_config.cfg_secret_access_key);

  bool useVirtualAddressing = true;
  if (input_config.cfg_aws_force_path_style_access &&
      strcmp(input_config.cfg_aws_force_path_style_access, "true") == 0) {
    AWS_LOG_INFO(__func__, "Setting useVirtualAddressing to false");
    useVirtualAddressing = false;
  }

  Aws::S3::S3Client s3Client(
      credentials, config,
      Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::Never,
      useVirtualAddressing);

  for (every 5 minutes) {
    Aws::S3::Model::ListObjectsV2Request requestV2;
    requestV2.WithBucket(input_config.cfg_bucket_name);
    requestV2.WithPrefix(someprefix);
    requestV2.SetDelimiter("/");

    AWS_LOG_INFO(__func__, "Calling s3Client.ListObjectsV2 ...\n");
    auto acctOutcome = s3Client.ListObjectsV2(requestV2);
    if (acctOutcome.IsSuccess()) {
      AWS_LOG_INFO(__func__, "s3Client.ListObjectsV2 IsSuccess\n");
    } else {
      AWS_LOG_ERROR(__func__, "s3Client.ListObjectsV2 failed\n");
    }

  }

  Aws::ShutdownAPI(options);
  Aws::Utils::Logging::ShutdownAWSLogging();

  // Exit only something is wrong
  return -1;
}

Here is logs.zip (containing keepalive_enabled.log and keepalive_disabled.log).

KaibaLopez commented 3 years ago

Hi @bzhou-sw, yes, I'm able to reproduce this and we'll take a closer look. It all looks weird to me, though: keepalive should not even be related to this, since it should only affect delays while a request is in flight, not in between requests.

For now I'd say just disable it, since that works for you.

bzhou-sw commented 3 years ago

@KaibaLopez Thanks for looking into this.

ebinans commented 3 years ago

This is most likely caused by SIGPIPE.

The SDK sets the following option to stay compatible with multi-threaded applications:

curl_easy_setopt(handle, CURLOPT_NOSIGNAL, 1L);

https://curl.se/libcurl/c/CURLOPT_NOSIGNAL.html

That in turn means you need to handle the signal yourself, e.g.:

signal(SIGPIPE, SIG_IGN);
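
To illustrate what ignoring the signal changes, here is a minimal sketch, assuming a POSIX system. The helper names ignore_sigpipe and write_to_closed_pipe are hypothetical and are not part of libcurl or the SDK. A pipe with its read end closed stands in for a connection that the peer (or a proxy) has already dropped. With the default disposition, the write would terminate the process with SIGPIPE; once the signal is ignored, the same write fails with EPIPE and the caller can handle it as an ordinary error.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE process-wide so that writes to a
// closed peer fail with EPIPE instead of killing the process.
// Returns true if the disposition was changed successfully.
bool ignore_sigpipe() {
    return std::signal(SIGPIPE, SIG_IGN) != SIG_ERR;
}

// Hypothetical helper: write one byte into a pipe whose read end is
// already closed. Returns the errno value of the failed write,
// 0 if the write unexpectedly succeeded, or -1 if pipe() failed.
int write_to_closed_pipe() {
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    close(fds[0]);  // no reader left: the next write raises SIGPIPE

    errno = 0;
    const ssize_t written = write(fds[1], "x", 1);
    const int err = (written < 0) ? errno : 0;

    close(fds[1]);
    return err;
}
```

Calling ignore_sigpipe() once at startup, before the first request is made, is enough; write_to_closed_pipe() then returns EPIPE rather than the process being terminated.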

bzhou-sw commented 3 years ago

@ebinans Thanks for your comments. Ignoring SIGPIPE in our application solved the issue!

wps132230 commented 3 years ago

Or you may try this in your options when initializing your application:

options.httpOptions.installSigPipeHandler = true;

bzhou-sw commented 3 years ago

@wps132230 Thanks!


adamcalhoon commented 2 years ago

@KaibaLopez -- I am running into this issue with version 1.9.176 of the SDK library. In particular it happens occasionally when calling GetObject.

I first tried the options.httpOptions.installSigPipeHandler = true; suggestion, which did not solve the problem. Next I tried setting enableTcpKeepAlive = false in the S3 client configuration, which does seem to have worked.

Should I open a new issue? Are there any other things I should try?

jmklix commented 2 years ago

Yes, please open a new issue and include all relevant information and logs.

shrkamat commented 2 years ago

@adamcalhoon is the issue resolved for you? I am facing a similar error, but with OpenSSL 1.0. With OpenSSL 1.1 it works fine.