Azure / azure-sdk-for-cpp

This repository is for active development of the Azure SDK for C++. For consumers of the SDK we recommend visiting our versioned developer docs at https://azure.github.io/azure-sdk-for-cpp.
MIT License

TransportAdapterOptions.DisableCrlValidation test is failing intermittently with a timeout error, on Win2022_Win32Api_debug_tests_winhttp_x86 and x64 #5151

Open ahsonkhan opened 1 year ago

ahsonkhan commented 1 year ago

The test is failing with a timeout on an unrelated PR, on both x86 and x64:

Request error. Error Code: 12002: The operation timed out

Win2022_Win32Api_debug_tests_winhttp_x86 https://dev.azure.com/azure-sdk/public/_build/results?buildId=3246625&view=logs&j=85bfc3e6-dd05-5f88-257f-550486204084&t=744927fc-7997-531b-7e25-698f166ee20f

2023-11-09T01:28:00.1647384Z test 246
2023-11-09T01:28:00.1648188Z         Start 246: azure-core.TransportAdapterOptions.DisableCrlValidation
2023-11-09T01:28:00.1648764Z 
2023-11-09T01:28:00.1650499Z 246: Test command: D:\a\_work\1\s\build\sdk\core\azure-core\test\ut\Debug\azure-core-test.exe "--gtest_filter=TransportAdapterOptions.DisableCrlValidation" "--gtest_also_run_disabled_tests"
2023-11-09T01:28:00.1652609Z 246: Working Directory: D:/a/_work/1/s/build/sdk/core/azure-core/test/ut
2023-11-09T01:28:00.1653518Z 246: Test timeout computed to be: 10000000
2023-11-09T01:28:00.1929033Z 246: Note: Google Test filter = TransportAdapterOptions.DisableCrlValidation
2023-11-09T01:28:00.1930283Z 246: [==========] Running 1 test from 1 test suite.
2023-11-09T01:28:00.1931201Z 246: [----------] Global test environment set-up.
2023-11-09T01:28:00.1932036Z 246: [----------] 1 test from TransportAdapterOptions
2023-11-09T01:28:00.1932837Z 246: [ RUN      ] TransportAdapterOptions.DisableCrlValidation
2023-11-09T01:28:00.1995256Z 246: [2023-11-09T01:28:00.1992455Z T: 1d34] INFO  : Status operation: 1(WINHTTP_CALLBACK_STATUS_RESOLVING_NAME )
2023-11-09T01:28:00.2197095Z 246: [2023-11-09T01:28:00.2193735Z T: 6c0] INFO  : Status operation: 2(WINHTTP_CALLBACK_STATUS_NAME_RESOLVED )
2023-11-09T01:28:00.2207879Z 246: [2023-11-09T01:28:00.2205372Z T: 6c0] INFO  : Status operation: 4(WINHTTP_CALLBACK_STATUS_CONNECTING_TO_SERVER )
2023-11-09T01:28:00.2818608Z 246: [2023-11-09T01:28:00.2815735Z T: 6c0] INFO  : Status operation: 8(WINHTTP_CALLBACK_STATUS_CONNECTED_TO_SERVER )
2023-11-09T01:28:00.3918768Z 246: [2023-11-09T01:28:00.3915185Z T: 6c0] INFO  : Status operation: 16(WINHTTP_CALLBACK_STATUS_SENDING_REQUEST )
2023-11-09T01:28:00.3920464Z 246: [2023-11-09T01:28:00.3917548Z T: 6c0] INFO  : Status operation: 32(WINHTTP_CALLBACK_STATUS_REQUEST_SENT )
2023-11-09T01:28:32.1906221Z 246: [2023-11-09T01:28:32.1901595Z T: 6c0] INFO  : Status operation: 4194304(WINHTTP_CALLBACK_STATUS_SENDREQUEST_COMPLETE )
2023-11-09T01:28:32.1908666Z 246: [2023-11-09T01:28:32.1904200Z T: 18ec] INFO  : Status operation: 64(WINHTTP_CALLBACK_STATUS_RECEIVING_RESPONSE )
2023-11-09T01:28:32.1911234Z 246: [2023-11-09T01:28:32.1908096Z T: 18ec] INFO  : Status operation: 2097152(WINHTTP_CALLBACK_STATUS_REQUEST_ERROR )
2023-11-09T01:28:32.1914214Z 246: [2023-11-09T01:28:32.1911535Z T: 18ec] ERROR : Request error.  Error Code: 12002: The operation timed out
2023-11-09T01:28:32.1915275Z 246: . 1
2023-11-09T01:28:32.1916841Z 246: [2023-11-09T01:28:32.1914127Z T: 18ec] DEBUG : WinHttpRequest::~WinHttpRequest. Closing handle synchronously.
2023-11-09T01:28:32.1919017Z 246: [2023-11-09T01:28:32.1916117Z T: 18ec] INFO  : Status operation: 2048(WINHTTP_CALLBACK_STATUS_HANDLE_CLOSING )
2023-11-09T01:28:32.1921295Z 246: [2023-11-09T01:28:32.1917395Z T: 18ec] DEBUG : Closing handle; completing outstanding Close request
2023-11-09T01:28:32.1924650Z 246: unknown file: error: C++ exception with description "Error while receiving a response. Error Code: 12002: The operation timed out
2023-11-09T01:28:32.1926060Z 246: ." thrown in the test body.
2023-11-09T01:28:32.1927155Z 246: [  FAILED  ] TransportAdapterOptions.DisableCrlValidation (31999 ms)
2023-11-09T01:28:32.1928051Z 246: [----------] 1 test from TransportAdapterOptions (31999 ms total)
2023-11-09T01:28:32.1928736Z 246: 
2023-11-09T01:28:32.1929322Z 246: [----------] Global test environment tear-down
2023-11-09T01:28:32.1932408Z 246: [==========] 1 test from 1 test suite ran. (32000 ms total)
2023-11-09T01:28:32.1933331Z 246: [  PASSED  ] 0 tests.
2023-11-09T01:28:32.1933946Z 246: [  FAILED  ] 1 test, listed below:
2023-11-09T01:28:32.1934731Z 246: [  FAILED  ] TransportAdapterOptions.DisableCrlValidation
2023-11-09T01:28:32.1935504Z 246: 
2023-11-09T01:28:32.1935982Z 246:  1 FAILED TEST
2023-11-09T01:28:32.1982657Z 246/465 Test #246: azure-core.TransportAdapterOptions.DisableCrlValidation ...........................................***Failed   32.03 sec

Also here: Win2022_Win32Api_debug_tests_winhttp_x64 https://dev.azure.com/azure-sdk/public/_build/results?buildId=3246625&view=logs&j=fe290b03-58c1-5285-793c-3227e2272b55&t=271ddb24-2791-55d7-7060-2a0f97b70f60

Related issue on the libcurl side: https://github.com/Azure/azure-sdk-for-cpp/issues/4133

cc @LarryOsterman

LarryOsterman commented 1 year ago

These are live tests; connectivity issues can cause them to fail.

ahsonkhan commented 1 year ago

Should we consider not running them in CI (and only on nightly), if we expect them to be brittle? Alternatively, we could add some retries to improve reliability.
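The retry idea could be sketched roughly as follows. This is a minimal illustration and not code from the repository: `RetryOnException`, `FlakySend`, and `ExampleUsage` are hypothetical names, the attempt count and delay are arbitrary, and a real test would wrap the actual transport call instead of the stand-in used here.

```cpp
#include <cassert>
#include <chrono>
#include <exception>
#include <stdexcept>
#include <thread>

// Hypothetical helper (not part of azure-core): runs `operation` up to
// `maxAttempts` times, sleeping `delay` between attempts. The last failure is
// rethrown, so a persistent outage still fails the test.
template <typename Operation>
auto RetryOnException(Operation&& operation, int maxAttempts, std::chrono::milliseconds delay)
    -> decltype(operation())
{
  if (maxAttempts < 1)
  {
    throw std::invalid_argument("maxAttempts must be at least 1");
  }
  for (int attempt = 1;; ++attempt)
  {
    try
    {
      return operation();
    }
    catch (std::exception const&)
    {
      if (attempt >= maxAttempts)
      {
        throw; // Out of attempts: surface the original exception.
      }
      std::this_thread::sleep_for(delay);
    }
  }
}

// Stand-in for a live HTTP call (an assumption for illustration): throws a
// timeout on the first two calls, then reports success.
inline int FlakySend(int& calls)
{
  if (++calls < 3)
  {
    throw std::runtime_error("Error Code: 12002: The operation timed out");
  }
  return 200;
}

// Usage: two simulated timeouts are absorbed and the third attempt succeeds.
inline int ExampleUsage()
{
  int calls = 0;
  int const status
      = RetryOnException([&calls] { return FlakySend(calls); }, 3, std::chrono::milliseconds(10));
  assert(status == 200 && calls == 3);
  return status;
}
```

Rethrowing after the final attempt keeps genuine regressions visible. The trade-off is that each retry can add a full transport timeout to the test's run time.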

LarryOsterman commented 1 year ago

> Should we consider not running them in CI (and only on nightly), if we expect them to be brittle? Alternatively, we could add some retries to improve reliability.

These tests aren't any more brittle than all of the other HTTP transport tests.

All tests that interact with network services will have some level of failure associated with them. We absolutely can consider removing all HTTP transport tests from our CI pipeline, but that means we will have no test coverage of our HTTP transports in CI.

Over the past 30 days (the extent of errors recorded in Azure DevOps), this test has had 99.93% reliability. There have been 2 failures recorded over that interval.

IMHO that failure rate is not high enough to justify disabling the test.
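For a sense of scale, the two figures quoted above (2 failures, 99.93% reliability) imply roughly 2,857 recorded runs of the test over those 30 days. The sketch below shows the arithmetic. It assumes reliability is defined as 1 - failures / runs, which the thread does not state, and `ImpliedRunCount` is a hypothetical name. Because the percentage is rounded, the result is only approximate.

```cpp
// Estimates how many runs are implied by a failure count and a reported
// reliability (fraction of runs that passed).
// Assumption: reliability = 1 - failures / runs.
inline double ImpliedRunCount(int failures, double reliability)
{
  return failures / (1.0 - reliability);
}
```

Calling `ImpliedRunCount(2, 0.9993)` gives the estimate for the figures in the comment above.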