Open jiri-prokop-pb opened 1 month ago
Hiya, I don't think retries were actually added for the endpoint you've hit there.
In terms of general flakiness for deploy, we're definitely aware and working on it. Wrangler-level retries unfortunately can't be the whole solution because we've seen some errors get worse when we spam the API with more retries.
Also agreed that the error message is quite unhelpful. If you set WRANGLER_LOG="debug" in your command, you can see all the requests/responses and which call fails, but it probably won't give you a more detailed error.
If you can share your account id, that could help us identify the underlying API issue :)
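For CI scripts that launch Wrangler from Node, the WRANGLER_LOG suggestion above could be applied roughly as in the sketch below. It is only an illustration: the helper names `withWranglerDebug` and `deployWithDebugLogs` are made up for this example, and it assumes `npx wrangler deploy` is the command your pipeline runs.

```typescript
import { spawnSync } from "node:child_process";

type Env = Record<string, string | undefined>;

// Return a copy of the environment with Wrangler's debug logging switched on.
// (Hypothetical helper, not part of Wrangler.)
function withWranglerDebug(env: Env): Env {
  return { ...env, WRANGLER_LOG: "debug" };
}

// Hypothetical CI helper: run `wrangler deploy` with debug logs streamed to
// the console. Returns the exit code (null if the process was killed by a signal).
function deployWithDebugLogs(): number | null {
  const result = spawnSync("npx", ["wrangler", "deploy"], {
    env: withWranglerDebug(process.env),
    stdio: "inherit",
  });
  return result.status;
}
```

In a plain shell step the equivalent is to prefix the command with the variable, for example WRANGLER_LOG=debug npx wrangler deploy.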
@emily-shen thanks and ok, got it; my main problem was that it's not clear whether any retries are actually being done, and I believe it would be helpful to see at least some basic info printed by default. Anyway, we can try with WRANGLER_LOG="debug" as well.
Regarding account id, we don't want to share it publicly here so my colleague will contact you over your official support channel.
Do you have any ETA about when we can expect more stable deploys? Once it's done we could plan some code simplification on our side as right now we have some workarounds in place for stable CI.
hi @jiri-prokop-pb :) we have forwarded the discussion here to the internal team working on the API. too early to comment on dates for when changes will begin rolling out, but this is in progress and a high priority item.
in the meantime we are going to be adding retries for wrangler deploy, so you don't need to go back and add it to your CI system. you can track that work here: https://github.com/cloudflare/workers-sdk/pull/7122
cc @tanushree-sharma
we'll keep this open until https://github.com/cloudflare/workers-sdk/pull/7122 lands
@lrapoport-cf thanks a lot, we will wait patiently until it lands
Which Cloudflare product(s) does this pertain to?
Wrangler
What version(s) of the tool(s) are you using?
3.79.0 [Wrangler]
What version of Node are you using?
20.16.0
What operating system and version are you using?
macOS Sequoia 15.0 (24A335) & Ubuntu Jammy (22.04.5 LTS) on CI
Describe the Bug
Observed behavior
We had retries on CI level for wrangler-related tasks for a while, but we recently noticed that "native" retries were added in v3.79.0 with https://github.com/cloudflare/workers-sdk/pull/6801. We jumped in, upgraded to the supported version, and removed our CI retry logic, only to find out that our CI is again flaky due to wrangler-related errors.
We reverted to our CI retry logic, but it's not ideal. We would welcome working built-in retries, as they would simplify our configuration.
Expected behavior
Flakiness will be low with "native"/built-in retries. Ideally we won't get random failures related to wrangler at all if everything is correct on our side.
It would also be great to have some logs for the retries. Right now it's hard to guess if the logic actually does anything. On our side it just looks like it ran once and failed immediately.
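To make the request concrete, here is a rough sketch of retry logic that reports each attempt and backs off between tries. It is not Wrangler's implementation; `retryWithBackoff`, its options, and the log wording are all hypothetical and only illustrate the kind of output that would show retries are happening.

```typescript
type RetryOptions = {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // wait before the second attempt; doubles each time
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waits
  log?: (message: string) => void; // where retry messages are written
};

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying on failure with exponential backoff and a log line
// per failed attempt. Rethrows the last error once attempts are exhausted.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  const { maxAttempts, baseDelayMs, sleep = defaultSleep, log = console.log } = options;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts) break;
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      log(`attempt ${attempt}/${maxAttempts} failed; retrying in ${delayMs} ms`);
      await sleep(delayMs);
    }
  }
  throw lastError;
}
```

The growing delay matters because retrying immediately tends to add load exactly when the API is already struggling, while the per-attempt message makes it obvious from CI output whether the failure happened on the first try or after several.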
Steps to reproduce
N/A
It's a random error that happens semi-regularly on our CI if we don't do retries on our side.
Please provide a link to a minimal reproduction
No response
Please provide any relevant error logs