Closed willcl-ark closed 4 months ago
This obviously puts slightly more workload on the bitcoind process (handling RPC requests), so perhaps I should also decrease the frequency? The thing I want to avoid is excessively slow startup times, but perhaps guaranteed correct startup is worth it...
Code change makes sense -- who checks Tank health though? Nothing `depends_on` the tanks, right? Although `wait_for_all_tanks_status()` could probably use it in the tests, since right now it just waits for `status:running` on the container.
My thinking was that the `addpeeraddress` and things could wait for proper status, then we could get rid of the (dodgy as ****) `exponential_backoff()` decorator.
Not sure it's worth another monitoring RPC call though...
Personally I'd still like to rip out the exponential backoff decorator, or at least seriously reduce its usage in the codebase.
yeah actually in the tests we use `wait_for...` which isn't exponential but just retries up to a timeout. The tests also have two rpc call methods, one with retry and one without. I agree with you: for adding network edges we could just use a 2-second retry on something like `getblockcount` (like bitcoin functional tests) and once that succeeds, proceed with whatever commands we expect to work
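The fixed-interval retry described above could look roughly like the sketch below. It is not warnet's or Bitcoin Core's actual code: `wait_for_rpc`, its parameters, and the injected `rpc_call` are hypothetical names, and the 2-second interval is simply the value suggested in the comment.

```python
import time


def wait_for_rpc(rpc_call, timeout=60.0, interval=2.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll `rpc_call` at a fixed interval until it succeeds or `timeout` elapses.

    `rpc_call` is any zero-argument callable, e.g. a wrapper around
    `getblockcount` (hypothetical; supply your own RPC client).
    """
    deadline = clock() + timeout
    while True:
        try:
            # The first success means the RPC server is up; return its result.
            return rpc_call()
        except Exception as exc:  # assumption: real code should narrow this
            last_error = exc
        # Give up if another full interval would run past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"RPC not ready after {timeout}s") from last_error
        sleep(interval)
```

The readiness wait happens once, explicitly, before any edges are added, so the calls that follow can use a plain RPC method with no retry attached.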
I just feel like having an RPC call method decorated with an auto-retry (ever) is... quite the footgun
well whatever `wait_for_rpc_connection()` does in bitcoin functional tests
> well whatever wait_for_rpc_connection() does in bitcoin functional tests
That seems fine for scenarios.
This healthcheck is for the docker daemon to poll the health of each container, and I'm unsure that having it call an RPC on `n` tanks every `m` seconds would be insignificant enough not to worry about... That is what I meant in my second post above.
Perhaps if we set it to some high number like 60 seconds, it would be low enough impact. But then it would give slow startup times, which is worse than the current behaviour, which just occasionally shows an RPC error in the logs and retries.
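To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. `healthcheck_cost` is a hypothetical helper, the tank counts are made up, and it assumes the first probe only runs one full interval after container start (i.e. no separate startup interval is configured).

```python
def healthcheck_cost(n_tanks: int, interval_s: float) -> dict:
    """Estimate the RPC load and startup delay of a polled healthcheck.

    Assumes one RPC call per tank per probe, and that the first probe
    happens `interval_s` seconds after the container starts.
    """
    if n_tanks < 0 or interval_s <= 0:
        raise ValueError("n_tanks must be >= 0 and interval_s must be > 0")
    per_tank = 1.0 / interval_s  # calls per second seen by each bitcoind
    return {
        "rpc_per_sec_per_tank": per_tank,
        "rpc_per_sec_total": n_tanks * per_tank,  # aggregate across the host
        "min_seconds_to_first_check": interval_s,
    }


# Illustrative values only: 100 tanks polled every 2 s versus every 60 s.
fast = healthcheck_cost(100, 2)
slow = healthcheck_cost(100, 60)
```

Under these assumptions each individual bitcoind sees only one call per interval, so the per-node cost is small. The aggregate rate on the host and the delay before the first check are the two numbers that move in opposite directions as the interval changes.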
From my previous research, docker compose V2 had additional startup params for healthchecks, specifically a few options related to startup like `--health-start-period` (and a few more, seemingly undocumented now on the docker website) to not wait the full period before performing the first healthcheck on startup.
Whereas V3, which we are using, didn't support those at all, which makes setting high values in healthcheck undesirable IMO.
Perhaps the decorator is simpler, and we should just close this out, eh?
Currently using the PID can still result in a race where the process has started but the RPC is not actually online.
I think this should fix some spurious `onion` test failures, like this one: https://github.com/bitcoin-dev-project/warnet/actions/runs/6516835067/job/17700750888?pr=93#step:6:2278