urql-graphql / urql

The highly customizable and versatile GraphQL client with which you add on features like normalized caching as you grow.
https://urql.dev/goto/docs
MIT License
8.64k stars 451 forks source link

Retry-exchange Operations never resolving in react-native #1932

Closed Mookiies closed 2 years ago

Mookiies commented 3 years ago

urql version & exchanges:

"urql": "^2.0.3",
"@urql/exchange-retry": "^0.3.0",
"react-native": "~0.63.4"
"expo": "~41.0.1"

Steps to reproduce In react-native app:

  1. Setup queries to fail and be retried with retryExhcnage
  2. Fire off query

Expected behavior Query retried maxNumberAttempts times and then resolves

Actual behavior

Demo: https://snack.expo.dev/@git/github.com/Mookiies/request-policy-demo-native change expo to 41.0.0 in bottom bar Screen Shot 2021-09-07 at 5 07 13 PM

In this demo the toggle todos button will cause text to render when an error comes back from a useQuery and showing loading until then. It is common for it to get stuck in loading and never receive an error. This happens basically every time ios, sometimes android, never web (all same expo snack code).

Commenting out the retry exchange causes this problem to go away. When running locally with debugExchange you can see that operations stop passing between fetch and retry exchanges.

kitten commented 3 years ago

There's not really much that can be done here unfortunately. Running this on Web/RNW it works consistently. Lowering the amount of attempts and delay helps to start debugging this. Replacing the output of the one-off request with some UI text also makes it easier to show that part: https://snack.expo.dev/ESztrfHeN

However, there isn't really much we can do from our end when requests seemingly lead to no result in some cases but only on React Native. I've observed before that some weirdness seems to be happening when fetch is triggered in the same tick/synchronously as other Promise operations. This just wasn't ever consistently observable to us however. We've seen this before when AsyncStorage was used on the first tick and a fetch request was made right after.

Mookiies commented 3 years ago

👍 Thanks for this insight here about other similar issues. When I get the chance I will try to dive deeper into debugging the native side to try and get to the bottom of this since it seems like it may be a deeper React Native issue.

haswalt commented 3 years ago

@Mookiies I'll try and get some debug time on this. We're seeing it on a ReactNative app as well. Sometimes requests just stop resolving and fetching stays true for ever.

I've not observed this with any other client (Apollo / Axios) or using fetch directly so does seem to be something related to urql.

I've not managed to reproduce it consistently however, but I have a theory that it night be related background suspension.

kitten commented 3 years ago

@haswalt @Mookiies I'm actually curious whether it's related to their fetch implementation; have any of you two tried an alternative fetch polyfill that uses React Native's XMLHttpRequest shim instead? Something like unfetch for instance? https://github.com/developit/unfetch If it's passed into the Client as the fetch option it's possible that this works around this issue.

haswalt commented 3 years ago

@kitten I'll give it a go, it's very hard to pin down exactly how to trigger it so we may get some false positives here but we're rolling out updates several times a day at the minute so should be able to trigger it at least once. Will report back

haswalt commented 3 years ago

@kitten have tested using unfetch as the fetch polyfill and I get the same result.

Worth noting that I have other apps using network requests, one using native fetch and the other using axios. The only one that get's the issue seems to be the one using urql so i'm fairly convinced it's some interaction between urql and the network stack.

I'm using a few exchanges in my setup at the moment:

dedupExchange, errorExchange, retryExchange, authExchange, multipartFetchExchange.

I've got some minor config adjustments from the defaults too:

// client config
{
  requestPolicy: "network-only"
}

// retry config
{
  initialDelayMs: 1000,
  maxDelayMs: 5000,
  randomDelay: true,
  maxNumberAttempts: 2
}

Not sure if any of these intersect in a way causing problems with operations resolving. Happy to try out some other things to narrow this down.

haswalt commented 3 years ago

I should say it is hard to debug this on device as running from xcode or in simulator seems to prevent this from happening.

haswalt commented 3 years ago

We've implemented a retry button for when this happens using refetch({requestPolicy: 'network-only'});. It doesn't seem like you force urql to kick in resolving operations using this.

haswalt commented 3 years ago

@Mookiies have you had any luck debugging?

Mookiies commented 3 years ago

I haven't had too much time to dig into this, and for the time being I've removed the retryExchange in my project.

I tried @kitten's suggestion to replace fetch with https://github.com/developit/unfetch in that snack I made but operations are still getting stuck.

I also tested this out on the latest RN version 0.66 and still saw the same behavior.

haswalt commented 3 years ago

Thanks @Mookiies.

Did removing retryExchange solve it for you?

Mookiies commented 3 years ago

Yeah removing the retryExchange solves it for me, as in I've never had an operation not resolve with it gone

Mookiies commented 3 years ago

It seems to me like this is somehow an issue with the retryExchange itself? By adding debug exchanges around it, I see the operation is always sent last into the retryExchange but results or operation never makes it out.

Ex this exchange setup:

  exchanges: [
    dedupExchange,
    cacheExchange,
    debugExchange({
      onCompleted: (op) => console.log('result from retryExchange'),
      onIncoming: (op) => console.log('incoming to retryExchange')
    }),
    retryExchange({
      retryIf: () => true,
      maxNumberAttempts: 5,
      maxDelayMs: 2000,
    }),
    debugExchange({
      onCompleted: (op) => console.log('result from fetchExchange'),
      onIncoming: (op) => console.log('incoming to fetchExchange')
    }),
    fetchExchange,
  ],

Produces these logs (with a one-off failing mutation):

 LOG  incoming to retryExchange
 LOG  incoming to fetchExchange
 LOG  result from fetchExchange
 LOG  incoming to fetchExchange
 LOG  result from fetchExchange

(this is using the customizable debugExchange from https://github.com/FormidableLabs/urql/pull/1743)

JoviDeCroock commented 3 years ago

I dove a bit deeper into this after your comment and started noticing that the delay seemed to be causing this, removing it made the issue go away. I've attempted changing this to a static amount, 500 & 1000 but to no avail, it seems that the main issue is just the delay function itself 😅 when looking at the implementation of delay I couldn't find anything to explain this issue as it just leverages setTimeout

Hmm might be related:

haswalt commented 3 years ago

Interesting, although both seem to be related to the debugger / native. We're seeing this on production builds on device.

EDIT: misread the first issue. That does look like it might be related to this.

Mookiies commented 3 years ago

@JoviDeCroock I tried removing delay and that stops the operations from not resolving. However I'm also not sure why delay is causing problems. It doesn't seem like the remote debugger is to blame here because

I also tried this on 0.66 with Hermes enabled and was able to see the same behavior of operations never exiting the retry exchange.

JoviDeCroock commented 3 years ago

Yeh, the weird thing is this doesn't seem to happen on web at all so I'm thinking it has something to do with how setTimeout/ticks are handled in React-Native

paulrostorp commented 2 years ago

I managed to get it working by creating (copy-pasting) a RetryExchange with a tweaked delay method that uses https://github.com/ocetnik/react-native-background-timer

It works, but it feels "hacky"... I have only a very superficial understand of wonka and it's reason/oCaml syntax.

(only tested in ios simulator)

paulrostorp commented 2 years ago

I wrote a custom hook, useRetryQuery that wraps useQuery... it's a compromise, but I'm sharing it in case anyone else is also blocked by this 🙂

Click to expand!

```js import React from "react"; import { useQuery, UseQueryArgs } from "urql"; import { CombinedError } from "@urql/core"; interface RetryOptions { initialDelayMs?: number; maxDelayMs?: number; randomDelay?: boolean; maxNumberAttempts?: number; /** Conditionally determine whether an error should be retried */ retryIf?: (error: CombinedError, retryCount: number) => boolean; } interface UseRetryQueryResponse { fetching: boolean; error?: CombinedError; data?: Data; } const jitter = (delay: number): number => { return (Math.random() + 0.5) * delay; }; const delay = (opts: { retryCount: number; maxDelayMs: number; initialDelayMs: number; randomDelay: boolean; }): number => { const delayCalc = opts.initialDelayMs * Math.pow(1.2, opts.retryCount); return Math.min( opts.randomDelay ? jitter(delayCalc) : delayCalc, opts.maxDelayMs ); }; export function useRetryQuery< Data = unknown, Variables = Record >( args: UseQueryArgs, retryOpts: RetryOptions = {} ): UseRetryQueryResponse { const { initialDelayMs = 1000, maxDelayMs = 3000, maxNumberAttempts = 5, randomDelay = false, retryIf = (): boolean => true, } = retryOpts; const [{ fetching, data, error }, refetch] = useQuery(args); const [retryCount, setRetryCount] = React.useState(0); const timer = React.useRef(null); React.useEffect(() => { return (): void => { if (timer.current) { // cleanup clearTimeout(timer.current); timer.current = null; setRetryCount(0); } }; }, []); return React.useMemo>(() => { const willRetry = error && retryIf(error, retryCount) && retryCount < maxNumberAttempts; if (willRetry && !fetching) { if (!timer.current) { // avoids creating more than one timer timer.current = setTimeout(() => { setRetryCount(retryCount + 1); refetch({ requestPolicy: "network-only" }); timer.current = null; }, delay({ retryCount, initialDelayMs, maxDelayMs, randomDelay })); } if (!(retryCount < maxNumberAttempts)) console.log("Retries exhausted!"); } if (willRetry) { return { fetching: true }; } else if (error && fetching) { // this avoid a extra re-render return { fetching: true }; } else { return { fetching, data, error }; } }, [error, data, fetching]); } ```

bryansum commented 2 years ago

I managed to get it working by creating (copy-pasting) a RetryExchange with a tweaked delay method that uses https://github.com/ocetnik/react-native-background-timer

It works, but it feels "hacky"... I have only a very superficial understand of wonka and it's reason/oCaml syntax.

(only tested in ios simulator)

@paulrostorp mind sharing the retryExchange you created? I'm interested in solving within the exchange itself, not via your hook solution (but thanks for sharing that in any case!)

paulrostorp commented 2 years ago

@bryansum It's nothing fancy I just took the code from the exchange here: https://github.com/FormidableLabs/urql/blob/main/exchanges/retry/src/retryExchange.ts and replaced the delay function on line 99 with a custom one using background timers. I copy-pasted the native js (compiled and optimized) function I found in node_modules/wonka/dist/wonka.js.

You could also look at the more "high level" one in node_modules/wonka/src/web/WonkaJs.bs.js along with the type definition in node_modules/wonka/src/web/WonkaJs.gen.tsx, but that one has a dependency on a lib called bs-platform.

Here is the code I got

```js import { makeSubject, share, pipe, merge, filter, fromValue, delay, mergeMap, takeUntil, } from "wonka"; import { makeOperation, Exchange, Operation, CombinedError, OperationResult, } from "@urql/core"; import { sourceT } from "wonka/dist/types/src/Wonka_types.gen"; import BackgroundTimer from "react-native-background-timer"; export interface RetryExchangeOptions { initialDelayMs?: number; maxDelayMs?: number; randomDelay?: boolean; maxNumberAttempts?: number; /** Conditionally determine whether an error should be retried */ retryIf?: (error: CombinedError, operation: Operation) => boolean; /** Conditionally update operations as they're retried (retryIf can be replaced with this) */ retryWith?: ( error: CombinedError, operation: Operation ) => Operation | null | undefined; } const customDelay = (a: number) => { return function (b) { return function (c) { let e = 0; return b(function (b) { "number" == typeof b || b.tag ? ((e = (e + 1) | 0), BackgroundTimer.setTimeout(function (a) { 0 !== e && ((e = (e - 1) | 0), c(b)); }, a)) : c(b); }); }; }; }; export const retryExchange = ({ initialDelayMs, maxDelayMs, randomDelay, maxNumberAttempts, retryIf, retryWith, }: RetryExchangeOptions): Exchange => { const MIN_DELAY = initialDelayMs || 1000; const MAX_DELAY = maxDelayMs || 15000; const MAX_ATTEMPTS = maxNumberAttempts || 2; const RANDOM_DELAY = randomDelay || true; return ({ forward, dispatchDebug }) => ops$ => { const sharedOps$ = pipe(ops$, share); const { source: retry$, next: nextRetryOperation } = makeSubject(); const retryWithBackoff$ = pipe( retry$, mergeMap((op: Operation) => { const { key, context } = op; const retryCount = (context.retryCount || 0) + 1; let delayAmount = context.retryDelay || MIN_DELAY; const backoffFactor = Math.random() + 1.5; // if randomDelay is enabled and it won't exceed the max delay, apply a random // amount to the delay to avoid thundering herd problem if (RANDOM_DELAY && delayAmount * backoffFactor < MAX_DELAY) { delayAmount *= backoffFactor; } // We stop the retries if a teardown event for this operation comes in // But if this event comes through regularly we also stop the retries, since it's // basically the query retrying itself, no backoff should be added! const teardown$ = pipe( sharedOps$, filter(op => { return ( (op.kind === "query" || op.kind === "teardown") && op.key === key ); }) ); dispatchDebug({ type: "retryAttempt", message: `The operation has failed and a retry has been triggered (${retryCount} / ${MAX_ATTEMPTS})`, operation: op, data: { retryCount, }, }); // Add new retryDelay and retryCount to operation return pipe( fromValue( makeOperation(op.kind, op, { ...op.context, retryDelay: delayAmount, retryCount, }) ), customDelay(delayAmount), // Stop retry if a teardown comes in takeUntil(teardown$) ); }) ); const result$ = pipe( merge([sharedOps$, retryWithBackoff$]), forward, share, filter(res => { // Only retry if the error passes the conditional retryIf function (if passed) // or if the error contains a networkError if ( !res.error || (retryIf ? !retryIf(res.error, res.operation) : !retryWith && !res.error.networkError) ) { return true; } const maxNumberAttemptsExceeded = (res.operation.context.retryCount || 0) >= MAX_ATTEMPTS - 1; if (!maxNumberAttemptsExceeded) { const operation = retryWith ? retryWith(res.error, res.operation) : res.operation; if (!operation) return true; // Send failed responses to be retried by calling next on the retry$ subject // Exclude operations that have been retried more than the specified max nextRetryOperation(operation); return false; } dispatchDebug({ type: "retryExhausted", message: "Maximum number of retries has been reached. No further retries will be performed.", operation: res.operation, }); return true; }) ) as sourceT; return result$; }; }; ```

You can see the custom delay function on line 36 and where it is called on line 116 🙂

bryansum commented 2 years ago

Thanks so much @paulrostorp! Seems to be working fine for me

kitten commented 2 years ago

We don't quite have a final confirmation yet, but @urql/exchange-retry@0.3.1 should resolve this issue.