buildkite / test-collector-ruby

Buildkite Test Analytics collector for Ruby test frameworks
http://buildkite.com/test-analytics
MIT License

Connection timeouts connecting to new endpoint causing step failures #190

Closed jebentier closed 1 year ago

jebentier commented 1 year ago

I'm upgrading our test collector gem because the old websocket path is about to be deprecated and shut off. With the latest version, 2.2.0, my Buildkite tests get connection timeouts when trying to reach the API endpoint. The timeouts themselves come from some internal mocking that I need to resolve, but they expose a concerning issue: connection failures make the test thread fail. (The gem originally had a similar issue.)

Example Failure:

#<Thread:0x0000556393104720 [REDACTED]/ruby/2.7.0/gems/buildkite-test_collector-2.2.0/lib/buildkite/test_collector/uploader.rb:39 run> terminated with exception (report_on_exception is true):
/usr/local/lib/ruby/2.7.0/resolv-replace.rb:25:in `initialize': Failed to open TCP connection to analytics-api.buildkite.com:443 (Connection timed out - connect(2) for "168.254.1.1" port 443) (Errno::ETIMEDOUT)
    from /usr/local/lib/ruby/2.7.0/resolv-replace.rb:25:in `initialize'
    from [REDACTED]/ruby/2.7.0/gems/socksify-1.7.1/lib/socksify.rb:178:in `initialize'
    from [REDACTED]/ruby/2.7.0/gems/http_failover-2.1.1/lib/http_failover/tcp_socket_extension.rb:19:in `initialize_with_name_forcing'
    from /usr/local/lib/ruby/2.7.0/net/http.rb:960:in `open'
    from /usr/local/lib/ruby/2.7.0/net/http.rb:960:in `block in connect'
    from [REDACTED]/ruby/2.7.0/gems/timeout-0.3.2/lib/timeout.rb:189:in `block in timeout'
    from [REDACTED]/ruby/2.7.0/gems/timeout-0.3.2/lib/timeout.rb:196:in `timeout'
    from /usr/local/lib/ruby/2.7.0/net/http.rb:958:in `connect'
    from /usr/local/lib/ruby/2.7.0/net/http.rb:943:in `do_start'
    from /usr/local/lib/ruby/2.7.0/net/http.rb:932:in `start'
    from [REDACTED]/ruby/2.7.0/gems/webmock-3.14.0/lib/webmock/http_lib_adapters/net_http.rb:109:in `request'
    from [REDACTED]/ruby/2.7.0/gems/buildkite-test_collector-2.2.0/lib/buildkite/test_collector/network.rb:17:in `request'
    from [REDACTED]/ruby/2.7.0/gems/buildkite-test_collector-2.2.0/lib/buildkite/test_collector/http_client.rb:41:in `post_json'
    from [REDACTED]/ruby/2.7.0/gems/buildkite-test_collector-2.2.0/lib/buildkite/test_collector/uploader.rb:42:in `block in upload'

Example of the failure causing the whole job to fail: image
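
For illustration, here is a minimal sketch of the behaviour I'd expect: the background upload thread rescues network-level errors and logs them instead of terminating with an exception. This is not the gem's actual code. `safe_background_upload`, `NETWORK_ERRORS`, and the `logger:` parameter are hypothetical names made up for this example, and the list of rescued errors is an assumption about what counts as a network failure.

```ruby
require "net/http"
require "logger"

# Hypothetical list of errors treated as "the analytics endpoint is
# unreachable". An assumption for this sketch, not the gem's own list.
NETWORK_ERRORS = [
  Errno::ETIMEDOUT,
  Errno::ECONNREFUSED,
  Errno::ECONNRESET,
  SocketError,
  Net::OpenTimeout,
  Net::ReadTimeout
].freeze

# Hypothetical helper: runs the given upload block in a background thread.
# Network failures are logged and swallowed, so the thread finishes
# normally with a nil value instead of dying with an exception.
def safe_background_upload(logger: Logger.new($stderr), &upload)
  Thread.new do
    begin
      upload.call
    rescue *NETWORK_ERRORS => e
      logger.warn("Test analytics upload failed: #{e.class}: #{e.message}")
      nil
    end
  end
end

# Usage: the block stands in for the real HTTP request.
thread = safe_background_upload { raise Errno::ETIMEDOUT, "connect(2)" }
thread.join # returns without re-raising; the failure is only logged
```

Errors outside the list still propagate, so genuine bugs in the upload code stay visible while an unreachable endpoint only produces a warning.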

niceking commented 1 year ago

Looking into this! PR up shortly :)

jebentier commented 1 year ago

@niceking thanks for the fast turnaround on this. The fix you’ve merged looks great. Appreciate the urgency on this.

niceking commented 1 year ago

No worries @jebentier, v2.3.0, which contains this fix, has been released. Hopefully it hasn't come up for you again!