Closed mssalvatore closed 1 year ago
The problem here is the refactoring of `HTTPClient`. When we moved from tunneling to relays, we decided that we shouldn't retry the initial connection to the Island API server. If you look at the `HTTPClient.connect()` method in v2.0.0, you'll see it uses `requests.get()`, but in the new code we removed `connect()`, so now it's using `session.get()`, which has retries. This means each connection to an Island server is retried 4 times, and our `keep_tunnel_open` times are too low.
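To illustrate the difference between a retrying session and a non-retrying one, here is a minimal sketch using `requests` and `urllib3`. The helper `make_session`, the two session names, and the example URL are hypothetical and are not taken from the actual `HTTPClient` code; the retry count of 4 simply mirrors the number mentioned above.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(retries: int) -> requests.Session:
    """Build a session whose HTTP(S) adapters retry `retries` times.

    Hypothetical helper for illustration only.
    """
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=Retry(total=retries, backoff_factor=0.5))
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


# Session for regular API traffic: failed connections are retried.
api_session = make_session(retries=4)

# Session for the initial connection probe: fail fast, no retries.
connect_session = make_session(retries=0)
```

A bare `requests.get()` call creates a throwaway session with the library's default adapters, which do not retry failed connections, so `connect_session` behaves much like the old `connect()` path as described in this comment.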
Increasing `keep_tunnel_open` times is not a good solution. We should find a way to skip retries on the initial connection. Connections are tried in parallel. For example, if we try 5 servers and 4 fail but 1 succeeds, we return the one successful connection. If retries are enabled, the 4 that failed will retry (a few times), even though we already know we have a successful connection to the Island. This delays the agent's startup because the worker pool won't return until all workers have completed.
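One possible way to avoid waiting on the failing workers is to return as soon as any connection succeeds. The sketch below is not the agent's implementation: `first_successful`, `fake_connect`, and the server names are hypothetical, and it assumes Python 3.9+ for `cancel_futures`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional, Sequence


def first_successful(
    servers: Sequence[str], try_connect: Callable[[str], bool]
) -> Optional[str]:
    """Return the first server that connects, without waiting for the rest."""
    executor = ThreadPoolExecutor(max_workers=max(1, len(servers)))
    futures = {executor.submit(try_connect, s): s for s in servers}
    try:
        for future in as_completed(futures):
            # A worker counts as successful if it neither raised nor returned False.
            if future.exception() is None and future.result():
                return futures[future]
        return None
    finally:
        # Do not block on slow or retrying workers; drop any that never started.
        executor.shutdown(wait=False, cancel_futures=True)


def fake_connect(server: str) -> bool:
    """Stand-in for a real connection attempt (hypothetical)."""
    if server == "island-ok:5000":
        time.sleep(0.05)
        return True
    time.sleep(0.5)  # simulates a slow, retried failure
    raise ConnectionError(f"could not reach {server}")
```

Threads that are already running are not interrupted; they finish in the background, so this only removes the wait from the caller's path. Disabling retries on the initial connection would still be needed to keep those background attempts short.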
Describe the bug
- Docker ETE tests fail (authentication issue)
- AppImage fails to exploit some hosts
Tasks