cloudflare / workers-sdk

โ›…๏ธ Home to Wrangler, the CLI for Cloudflare Workersยฎ
https://developers.cloudflare.com/workers/
Apache License 2.0

๐Ÿ› BUG: recently added wrangler retries doesn't seem to help with flakiness #6913

Open jiri-prokop-pb opened 1 month ago

jiri-prokop-pb commented 1 month ago

Which Cloudflare product(s) does this pertain to?

Wrangler

What version(s) of the tool(s) are you using?

3.79.0 [Wrangler]

What version of Node are you using?

20.16.0

What operating system and version are you using?

macOS Sequoia 15.0 (24A335) & Ubuntu Jammy (22.04.5 LTS) on CI

Describe the Bug

Observed behavior

We had retries at the CI level for Wrangler-related tasks for a while, but we recently noticed that "native" retries were added in v3.79.0 with https://github.com/cloudflare/workers-sdk/pull/6801. We jumped in, upgraded to that version, and removed our CI retry logic, only to find that our CI was flaky again due to Wrangler-related errors.

We reverted to our CI retry logic, but it's not ideal. We would welcome the built-in retries working, as that would simplify our configuration.
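For readers unfamiliar with the workaround, a CI-level retry of the kind mentioned above could look roughly like the sketch below. This is not our actual configuration and not part of Wrangler: the `retry` function name, the attempt count, and the delays are all hypothetical. It logs each failed attempt and doubles the wait between attempts so the API is not hit in rapid succession.

```shell
#!/usr/bin/env bash
# Hypothetical CI-side retry helper (an illustration, not part of Wrangler).
# Usage: retry <max_attempts> <initial_delay_seconds> <command> [args...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after ${attempt} attempt(s): $*" >&2
      return 1
    fi
    # Log every retry so it is visible in CI output that one happened.
    echo "retry: attempt ${attempt}/${max_attempts} failed; sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2)) # exponential backoff between attempts
  done
}
```

In a CI step this might be invoked as `retry 3 5 npx wrangler deploy`, where 3 attempts and a 5-second initial delay are arbitrary example values.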

Expected behavior

Flakiness will be low with "native"/built-in retries. Ideally we won't get random failures related to wrangler at all if everything is correct on our side.

It would also be great to have some logs for the retries. Right now it's hard to tell whether the logic actually does anything. On our side it just looks like it ran once and failed immediately.

Steps to reproduce

N/A

It's a random error that happens semi-regularly on our CI if we don't retry on our side.

Please provide a link to a minimal reproduction

No response

Please provide any relevant error logs

  > wrangler deploy

   โ›…๏ธ wrangler 3.79.0
  -------------------

  Total Upload: 296.15 KiB / gzip: 70.73 KiB
  Your worker has access to the following bindings:
  - Vars:
    - ENVIRONMENT: "dev"

  ✘ [ERROR] A request to the Cloudflare API (/accounts/xxxyyyzzz/workers/scripts/aaabbbccc/deployments) failed.

    workers.api.error.unknown [code: 10013]

emily-shen commented 1 month ago

Hiya, I don't think retries were actually added for the endpoint you've hit there 😅

In terms of general flakiness for deploy, we're definitely aware and working on it. Wrangler-level retries unfortunately can't be the whole solution because we've seen some errors get worse when we spam the API with more retries.

Also agreed that the error message is quite unhelpful. If you set WRANGLER_LOG="debug" when running your command, you can see all the requests/responses and which call fails, but it probably won't give you a more detailed error.
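As a rough sketch of how that could be wired into a CI script: the helper below sets WRANGLER_LOG to debug for a single command and keeps a copy of the output in a file. The function name `run_with_debug_log` and the log file path are hypothetical, chosen only for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one command with WRANGLER_LOG=debug and keep its output.
# Usage: run_with_debug_log <log_file> <command> [args...]
run_with_debug_log() {
  local log_file="$1"
  shift
  local rc=0
  # The variable is set only for this one invocation, not for the whole shell.
  WRANGLER_LOG=debug "$@" >"$log_file" 2>&1 || rc=$?
  # Echo the captured output so it still shows up in the CI job log.
  cat "$log_file"
  # Preserve the wrapped command's exit code so CI still fails on errors.
  return "$rc"
}
```

It could be called as `run_with_debug_log wrangler-debug.log npx wrangler deploy`. Request paths in the output include the account id, as in the error above, so it is worth checking a debug log before sharing it publicly.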

If you can share your account id that could help us identify the underlying api issue :)

jiri-prokop-pb commented 1 month ago

@emily-shen thanks and ok, got it; my main problem was that it's not clear whether any retries are actually being done, and I believe it would be helpful to see at least some basic info printed by default. Anyway, we can try WRANGLER_LOG="debug" as well.

Regarding account id, we don't want to share it publicly here so my colleague will contact you over your official support channel.

Do you have an ETA for when we can expect more stable deploys? Once that's done, we could plan some code simplification on our side, as right now we have some workarounds in place for stable CI.

lrapoport-cf commented 3 weeks ago

hi @jiri-prokop-pb :) we have forwarded the discussion here to the internal team working on the API 👍 too early to comment on dates for when changes will begin rolling out, but this is in progress and a high priority item.

in the meantime we are going to be adding retries for wrangler deploy, so you don't need to go back and add it to your CI system. you can track that work here: https://github.com/cloudflare/workers-sdk/pull/7122

cc @tanushree-sharma

lrapoport-cf commented 3 weeks ago

we'll keep this open until https://github.com/cloudflare/workers-sdk/pull/7122 lands 👍

jiri-prokop-pb commented 3 weeks ago

@lrapoport-cf thanks a lot 🙌 we will wait patiently until it lands 🙂