inspec / inspec-azure

InSpec Azure Resource Pack
https://www.inspec.io/

Resource is failed due to Failed to open TCP connection to management.azure.com:443 #724

Open jakaxd opened 3 months ago

jakaxd commented 3 months ago

Describe the problem

When running integration tests, we intermittently see the error message below. Re-running the failed test usually passes without hitting the same connectivity issue.

For context, we run around 10,000 individual tests each day, and typically one or two of them fail with a similar message. It appears to happen mainly in controls that use the azure_generic_resource, although we have also seen it with the azure_webapp resource.

Resource is failed due to Failed to open TCP connection to management.azure.com:443 (Connection timed out - user specified timeout). Error backtrace:
/opt/inspec/embedded/lib/ruby/3.1.0/net/http.rb:1001:in `rescue in connect'
/opt/inspec/embedded/lib/ruby/3.1.0/net/http.rb:997:in `connect'
/opt/inspec/embedded/lib/ruby/3.1.0/net/http.rb:976:in `do_start'
/opt/inspec/embedded/lib/ruby/3.1.0/net/http.rb:965:in `start'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:138:in `request_via_get_method'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:129:in `request_with_wrapped_block'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:122:in `perform_request'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:66:in `block in call'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-1.10.3/lib/faraday/adapter.rb:50:in `connection'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:64:in `call'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday_middleware-1.2.0/lib/faraday_middleware/response_middleware.rb:36:in `call'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-retry-1.0.3/lib/faraday/retry/middleware.rb:140:in `call'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-1.10.3/lib/faraday/rack_builder.rb:154:in `build_response'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-1.10.3/lib/faraday/connection.rb:516:in `run_request'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/faraday-1.10.3/lib/faraday/connection.rb:202:in `get'
libraries/backend/azure_connection.rb:249:in `send_request'
libraries/backend/azure_connection.rb:127:in `rest_api_call'
libraries/azure_backend.rb:156:in `rescue_wrong_api_call'
libraries/azure_backend.rb:276:in `get_resource'
libraries/azure_generic_resource.rb:55:in `block in initialize'
libraries/azure_backend.rb:460:in `catch_failed_resource_queries'
libraries/azure_generic_resource.rb:54:in `initialize'
libraries/azure_webapp.rb:23:in `initialize'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/resource.rb:127:in `block in initialize'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/resource.rb:60:in `supersuper_initialize'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/resource.rb:125:in `initialize'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile_context.rb:256:in `new'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile_context.rb:256:in `block (2 levels) in add_registry_methods'
./controls/function_app.rb:23:in `block (2 levels) in load_with_context'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/rule.rb:49:in `instance_eval'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/rule.rb:49:in `initialize'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/control_eval_context.rb:58:in `new'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/control_eval_context.rb:58:in `control'
./controls/function_app.rb:22:in `block in load_with_context'
./controls/function_app.rb:21:in `each'
./controls/function_app.rb:21:in `load_with_context'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile_context.rb:171:in `instance_eval'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile_context.rb:171:in `load_with_context'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile_context.rb:155:in `load_control_file'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile.rb:232:in `block in collect_tests'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile.rb:227:in `each'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/profile.rb:227:in `collect_tests'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/runner.rb:123:in `block in load'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/runner.rb:104:in `each'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/runner.rb:104:in `load'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/runner.rb:163:in `run'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/cli.rb:381:in `exec'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/thor-1.2.2/lib/thor/command.rb:27:in `run'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/thor-1.2.2/lib/thor/invocation.rb:127:in `invoke_command'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/thor-1.2.2/lib/thor.rb:392:in `dispatch'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/thor-1.2.2/lib/thor/base.rb:485:in `start'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-core-5.22.40/lib/inspec/base_cli.rb:35:in `start'
/opt/inspec/embedded/lib/ruby/gems/3.1.0/gems/inspec-bin-5.22.40/bin/inspec:11:in `<top (required)>'
/usr/bin/inspec:284:in `load'
/usr/bin/inspec:284:in `<main>'

Possible Solution

In an effort to resolve the issue, I've experimented with implementing retry logic using the HTTP client parameters specified in the documentation: https://github.com/inspec/inspec-azure?tab=readme-ov-file#http_client-parameters.

I've tried two methods to include these parameters: adding them directly into the describe block, like this:

describe azure_network_security_group(resource_group: value[:resource_group_name], name: key, azure_retry_limit: 5) do

I also tried setting them as environment variables inside the Ruby file, like this:

ENV['AZURE_RETRY_LIMIT'] = '10'
ENV['AZURE_RETRY_BACKOFF'] = '5'

or:

ENV['azure_retry_limit'] = '10'
ENV['azure_retry_backoff'] = '5'

Failing this, I even tried setting them as environment variables on the pipeline job, like this:

export AZURE_RETRY_LIMIT=10
export AZURE_RETRY_BACKOFF=10

or:

export azure_retry_limit=10
export azure_retry_backoff=10

However, none of these approaches seems to have any effect. Tests that pass take no noticeably longer to run than tests that fail, which suggests no retries are taking place. To validate this, I set the azure_retry_backoff option to 30, expecting a significant increase in runtime, but observed no change.
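One thing that may be worth ruling out is whether the variables are actually visible to the InSpec process at the time the profile loads. The snippet below is a small diagnostic sketch in plain Ruby. The helper name `retry_env_report` is hypothetical, and the candidate names are simply the four spellings tried above; which of them (if any) the resource pack reads is not confirmed here.

```ruby
# Diagnostic sketch: report which of the candidate retry variables are
# visible to the current process. The names below are the spellings tried
# in this issue, not a confirmed list of what inspec-azure reads.
CANDIDATE_VARS = %w[
  AZURE_RETRY_LIMIT
  AZURE_RETRY_BACKOFF
  azure_retry_limit
  azure_retry_backoff
].freeze

# Hypothetical helper: returns a Hash of name => value (nil when unset).
# `env` defaults to the real environment but accepts any Hash-like object.
def retry_env_report(env = ENV)
  CANDIDATE_VARS.to_h { |name| [name, env[name]] }
end

# Print the report to stderr so it does not mix with reporter output.
retry_env_report.each do |name, value|
  warn format('%-22s %s', name, value.nil? ? '(not set)' : value.inspect)
end
```

Placed at the top of a control file, this would show whether the pipeline exports reach the Ruby process. It does not show whether the values are read before the HTTP connection is built, so an all-set report narrows the problem down rather than resolving it.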

I am looking for advice or tips on how we can resolve this permanently.
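For reference, the behaviour I was hoping to get from the retry settings is roughly what the sketch below does: retry a call a limited number of times on connection timeouts, waiting longer between attempts. This is plain Ruby and not part of inspec-azure; `with_retries`, `TRANSIENT_ERRORS` and the `sleeper` parameter are hypothetical names for illustration. Note that the backtrace passes through faraday-retry and then `catch_failed_resource_queries`, which suggests the resource pack handles the error itself, so a wrapper like this would only help where the exception actually propagates to the caller.

```ruby
require 'net/http' # provides Net::OpenTimeout and Net::ReadTimeout

# Errors treated as transient in this sketch (an assumption, adjust as needed).
TRANSIENT_ERRORS = [
  Net::OpenTimeout,
  Net::ReadTimeout,
  Errno::ETIMEDOUT,
  Errno::ECONNRESET
].freeze

# Hypothetical helper: runs the block, retrying up to `limit` times on a
# transient error. The wait doubles after each failure, starting at
# `backoff` seconds. `sleeper` is injectable so the waits can be observed
# without really sleeping.
def with_retries(limit: 3, backoff: 2.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue *TRANSIENT_ERRORS
    raise if attempt > limit # retries exhausted, re-raise the last error
    sleeper.call(backoff * (2**(attempt - 1)))
    retry
  end
end

# Usage: simulate a call that times out twice and then succeeds.
waits = []
outcome = with_retries(limit: 5, backoff: 1.0, sleeper: ->(s) { waits << s }) do |n|
  raise Net::OpenTimeout, 'simulated timeout' if n < 3
  "succeeded on attempt #{n}"
end
warn "#{outcome}, waits recorded: #{waits.inspect}"
```

With real sleeping and a backoff of 30, two failed attempts would add well over a minute to a test, which is the kind of runtime difference I expected to see and did not.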