Azure / azure-storage-cpplite

Lite version of C++ Client Library for Microsoft Azure Storage
MIT License

Segmentation fault during blob_client::upload_block_blob_from_stream #87

Open drorgy opened 4 years ago

drorgy commented 4 years ago

This issue looks similar to one from the azure-storage-cpp project: https://github.com/Azure/azure-storage-cpp/issues/265 (that issue mentions that azure-storage-cpp version 3.0 did have segmentation faults, which were fixed later on).

Valgrind output of the error I'm getting:

==8950== Process terminating with default action of signal 11 (SIGSEGV)
==8950==  Access not within mapped region at address 0x4008
==8950==    at 0xAF4500E: std::istream::read(char, long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22)
==8950==    by 0x60C7F09: azure::storage_lite::CurlEasyRequest::read(char, unsigned long, unsigned long, void*) (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libazure-storage-cpplite.so)
==8950==    by 0x599F5A2: Curl_fillreadbuffer (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libcurl.so)
==8950==    by 0x59A0783: Curl_readwrite (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libcurl.so)
==8950==    by 0x59A9885: multi_runsingle (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libcurl.so)
==8950==    by 0x59AA260: curl_multi_perform (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libcurl.so)
==8950==    by 0x59A1CE9: curl_easy_perform (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libcurl.so)
==8950==    by 0x60C66F7: azure::storage_lite::CurlEasyRequest::perform() (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libazure-storage-cpplite.so)
==8950==    by 0x60C9021: azure::storage_lite::CurlEasyRequest::submit(std::function<void (int, azure::storage_lite::storage_istream, CURLcode)>, std::chrono::duration<long, std::ratio<1l, 1l> >) (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libazure-storage-cpplite.so)
==8950==    by 0x60DDFF1: azure::storage_lite::async_executor::submit_helper(std::shared_ptr<std::promise<azure::storage_lite::storage_outcome > >, std::shared_ptr<azure::storage_lite::storage_outcome >, std::shared_ptr, std::shared_ptr, std::shared_ptr, std::shared_ptr, std::shared_ptr) (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libazure-storage-cpplite.so)
==8950==    by 0x60E4CFB: azure::storage_lite::async_executor::submit(std::shared_ptr, std::shared_ptr, std::shared_ptr, std::shared_ptr) (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libazure-storage-cpplite.so)
==8950==    by 0x60D3100: azure::storage_lite::blob_client::upload_block_blob_from_stream(std::cxx11::basic_string<char, std::char_traits, std::allocator > const&, std::cxx11::basic_string<char, std::char_traits, std::allocator > const&, std::istream&, std::vector<std::pair<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::cxx11::basic_string<char, std::char_traits, std::allocator > > > > const&, unsigned long) (in /mnt/public/zdevfs2/yoav/src/dev/unittest/CppLinuxUnitTests/builds/libazure-storage-cpplite.so)

Thanks, Dror

Jinming-Hu commented 4 years ago

@drorgy Can you share a code snippet that can reproduce this issue?

drorgy commented 4 years ago

@JinmingHu-MSFT Thanks. We are working on a code snippet that reproduces this; it is taking a while because the code base in which the crash occurs is large.

drorgy commented 4 years ago

We don't have a code snippet yet, but I came across these links:
https://stackoverflow.com/questions/21887264/why-libcurl-needs-curlopt-nosignal-option-and-what-are-side-effects-when-it-is
https://curl.haxx.se/mail/lib-2010-01/0115.html

Could the crash be happening because CURLOPT_NOSIGNAL is not set to 1, i.e. curl_easy_setopt(handle, CURLOPT_NOSIGNAL, 1L); is never called? I checked the azure-storage-cpplite code and didn't find CURLOPT_NOSIGNAL anywhere.

Thanks, Dror

drorgy commented 4 years ago

I just rebuilt with CURLOPT_NOSIGNAL set. The segmentation faults no longer happen, but I now get SIGPIPE and my requests to azure-storage get stuck. The SIGPIPEs should be OK according to https://curl.haxx.se/libcurl/c/CURLOPT_NOSIGNAL.html, so the remaining issue is that the requests are stuck. The change I made is in this PR: https://github.com/Azure/azure-storage-cpplite/pull/89

Jinming-Hu commented 4 years ago

Hi @drorgy, can you confirm that this crash (SIGSEGV) is caused by a DNS failure? That is, can you double-check that you're using neither c-ares nor threaded DNS?

The change in your PR alters the behavior of this SDK, which will affect all of our cpplite customers, so I want to be careful about it.

Also, regarding the SIGPIPE issue: if we don't enable CURLOPT_NOSIGNAL, the program will receive SIGPIPE and crash when there's something wrong with the TLS connection. If we enable CURLOPT_NOSIGNAL, what would happen instead?

drorgy commented 4 years ago

Hi Jinming, I saw that another, similar SDK turns CURLOPT_NOSIGNAL on. See also this link: https://curl.haxx.se/libcurl/c/threadsafe.html The code I sent in the PR is not complete yet, because my program gets stuck (I configured SIGPIPE to be ignored in my program anyway). I think additional settings are probably needed besides CURLOPT_NOSIGNAL. Dror
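For reference, here is a minimal sketch of what "configuring SIGPIPE to be ignored" can look like on a POSIX system, and what a write to a closed peer then does instead of killing the process. It is not taken from the PR or from azure-storage-cpplite; the helper names `ignore_sigpipe` and `errno_of_write_to_closed_pipe` are hypothetical, and a plain pipe stands in for the socket that libcurl would write to.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process.
// Returns true if the disposition was installed successfully.
bool ignore_sigpipe() {
    struct sigaction sa {};
    sa.sa_handler = SIG_IGN;      // discard the signal instead of terminating
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, nullptr) == 0;
}

// Hypothetical helper: write one byte to a pipe whose read end is closed.
// Returns the errno reported by write(), 0 if the write unexpectedly
// succeeded, or -1 if the pipe could not be created.
int errno_of_write_to_closed_pipe() {
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    close(fds[0]);                // no reader is left on the pipe
    errno = 0;
    const char byte = 'x';
    const ssize_t written = write(fds[1], &byte, 1);
    const int err = (written == -1) ? errno : 0;
    close(fds[1]);
    return err;
}
```

Calling `ignore_sigpipe()` once at startup and then `errno_of_write_to_closed_pipe()` should report `EPIPE` rather than terminate the program. Under that assumption a broken connection surfaces as an ordinary I/O error to the caller, which would not by itself explain requests getting stuck.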