Azure / azure-sdk-for-cpp

This repository is for active development of the Azure SDK for C++. For consumers of the SDK we recommend visiting our versioned developer docs at https://azure.github.io/azure-sdk-for-cpp.

Some libcurl tests fail intermittently with a segfault on Linux after connectionPoolTest fails to resolve the host name #1868

Closed ahsonkhan closed 3 years ago

ahsonkhan commented 3 years ago

From https://github.com/Azure/azure-sdk-for-cpp/commit/f1773636c29c1acbd2464c2bd50ca7c35c84bf3a

https://dev.azure.com/azure-sdk/public/_build/results?buildId=777495&view=logs&j=fda9a6e2-14f0-5864-d894-5f56622f20fa&t=4d31d3ea-4e03-5255-0249-577ed0f8d090&l=2936

Does the failure of the connection pool test leave the connection pool in a broken state, causing these segfaults in the other tests? Can we harden this?

2021-03-11T01:17:15.3430038Z 23: [----------] 1 test from CurlConnectionPool
2021-03-11T01:17:15.3430561Z 23: [ RUN      ] CurlConnectionPool.connectionPoolTest
2021-03-11T01:17:15.3431031Z 23: unknown file: Failure
2021-03-11T01:17:15.3431865Z 23: C++ exception with description "Fail to get a new connection for: httpbin.org. Couldn't resolve host name" thrown in the test body.
2021-03-11T01:17:15.3432583Z 23: [  FAILED  ] CurlConnectionPool.connectionPoolTest (1 ms)
2021-03-11T01:17:15.3433265Z 23: [----------] 1 test from CurlConnectionPool (1 ms total)
2021-03-11T01:17:15.3433738Z 23: 
2021-03-11T01:17:15.3434306Z 23: [----------] Global test environment tear-down
2021-03-11T01:17:15.3434840Z 23: [==========] 1 test from 1 test suite ran. (1 ms total)
2021-03-11T01:17:15.3435303Z 23: [  PASSED  ] 0 tests.
2021-03-11T01:17:15.3435830Z 23: [  FAILED  ] 1 test, listed below:
2021-03-11T01:17:15.3436327Z 23: [  FAILED  ] CurlConnectionPool.connectionPoolTest
2021-03-11T01:17:15.3446169Z 25: [ RUN      ] CurlTransportOptions.noRevoke
2021-03-11T01:17:15.3446889Z 25: /mnt/vss/_work/1/s/sdk/core/azure-core/test/ut/curl_options.cpp:78: Failure
2021-03-11T01:17:15.3448000Z 25: Expected: response = pipeline.Send(request, Azure::Core::Context::GetApplicationContext()) doesn't throw an exception.
2021-03-11T01:17:15.3448634Z 25:   Actual: it throws.
2021-03-11T01:17:15.5775241Z  25/162 Test  #25: azure-core.CurlTransportOptions.noRevoke .................................***Exception: SegFault  0.32 sec
The following tests FAILED:
     23 - azure-core.CurlConnectionPool.connectionPoolTest (Failed)
     25 - azure-core.CurlTransportOptions.noRevoke (SEGFAULT)
     26 - azure-core.CurlTransportOptions.sslVerifyOff (SEGFAULT)
     27 - azure-core.CurlTransportOptions.httpsDefault (SEGFAULT)
     28 - azure-core.CurlTransportOptions.disableKeepAlive (SEGFAULT)

cc @vhvb1989

vhvb1989 commented 3 years ago

I haven't seen it happen again. Please re-open if you see it again.

ahsonkhan commented 3 years ago

Can you try re-running that pipeline a few times (maybe 10) and see how often it fails? The intermittent resolve host name failure is expected and can be addressed separately, but the segfault when such a failure does occur could be problematic and might benefit from a closer look.
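One possible reading of the log above, which is an assumption and not something confirmed in this thread: the noRevoke failure is reported by a non-fatal "doesn't throw" expectation (GoogleTest's EXPECT_NO_THROW lets the test body keep running after it fails), so if the send throws, `response` may still be null when later lines dereference it. A minimal sketch of the guard that would avoid that, using hypothetical stand-ins (`FakeResponse`, `Send`, `CheckedStatusCode`) rather than the SDK's real types or the real test code:

```cpp
#include <iostream>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for an HTTP response; NOT the SDK's response type.
struct FakeResponse
{
  int StatusCode;
};

// Hypothetical stand-in for pipeline.Send(): throws when the host name
// cannot be resolved, otherwise returns a response with status 200.
std::unique_ptr<FakeResponse> Send(bool hostResolves)
{
  if (!hostResolves)
  {
    throw std::runtime_error("Couldn't resolve host name");
  }
  return std::make_unique<FakeResponse>(FakeResponse{200});
}

// Returns the status code, or -1 when the send failed.
// The null check plays the role of a fatal assertion: the function stops
// before touching a response that was never assigned.
int CheckedStatusCode(bool hostResolves)
{
  std::unique_ptr<FakeResponse> response;
  try
  {
    response = Send(hostResolves);
  }
  catch (std::exception const& e)
  {
    std::cerr << "send failed: " << e.what() << '\n';
  }

  if (!response)
  {
    return -1; // bail out instead of dereferencing a null pointer
  }
  return response->StatusCode;
}
```

Calling `CheckedStatusCode(true)` yields the stand-in's status code, while `CheckedStatusCode(false)` reports the failure and returns -1 instead of crashing. In a GoogleTest body, the equivalent would be a fatal assertion (ASSERT_NO_THROW, or ASSERT_NE against nullptr) before the first use of the response, so a transient DNS failure shows up as an ordinary test failure.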