facebook / react-native

A framework for building native applications using React
https://reactnative.dev
MIT License
119.47k stars 24.37k forks source link

Android `fetch` hangs indefinitely with IPv6 hosts on some devices (Happy Eyeballs) #32730

Open andreialecu opened 2 years ago

andreialecu commented 2 years ago

Description

This is a very bizarre issue that has been previously reported a bunch of times, and this is basically a continuation of:

https://github.com/facebook/react-native/issues/29608

I initially started running into this on RN 0.66 with AWS Cognito. Bumping to 0.66.3 didn't help.

I'm also pretty sure this used to work before and I'm not sure when it broke. It's on an app that has been shelved for a while.

The problem is very strange because the network request does not seem to be issued, but simply hitting CMD+S to save any file so that a hot-reload is issued will immediately dispatch the network request.

I discovered the promise hanging issue by adding some logs to the fetch calls the cognito library was doing: image Notice how the .then is not executed.

While troubleshooting I came across a mention here of a workaround: https://github.com/facebook/react-native/issues/29608#issuecomment-884521699 (courtesy of @danmaas) which seems to completely resolve the issue.

Here's the same .then correctly being executed after applying that patch: image

Version

0.66.3

Output of react-native info

System:
    OS: macOS 12.0.1
    CPU: (8) arm64 Apple M1
    Memory: 142.27 MB / 16.00 GB
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 16.13.0 - /private/var/folders/9p/k1yqxx0d7rn1nlztg_wm7sbw0000gn/T/xfs-2dcae145/node
    Yarn: 2.4.0-git.20210330.hash-ebcd71d5 - /private/var/folders/9p/k1yqxx0d7rn1nlztg_wm7sbw0000gn/T/xfs-2dcae145/yarn
    npm: 7.20.1 - ~/.nvm/versions/node/v16.13.0/bin/npm
    Watchman: 2021.11.01.00 - /opt/homebrew/bin/watchman
  Managers:
    CocoaPods: 1.11.0 - /Users/andreialecu/.rbenv/shims/pod
  SDKs:
    iOS SDK:
      Platforms: DriverKit 21.0.1, iOS 15.0, macOS 12.0, tvOS 15.0, watchOS 8.0
    Android SDK: Not Found
  IDEs:
    Android Studio: 4.2 AI-202.7660.26.42.7351085
    Xcode: 13.1/13A1030d - /usr/bin/xcodebuild
  Languages:
    Java: 11.0.8 - /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/javac
  npmPackages:
    @react-native-community/cli: Not Found
    react: 17.0.2 => 17.0.2 
    react-native: 0.66.3 => 0.66.3 
    react-native-macos: Not Found
  npmGlobalPackages:
    *react-native*: Not Found

Steps to reproduce

I'm able to reproduce it with this:

  React.useEffect(() => {
    console.log('confirm start', new Date());
    fetch('https://cognito-idp.eu-west-1.amazonaws.com/', {
      method: 'POST',
      mode: 'cors',
    })
      .then(() => console.log('then', new Date()))
      .catch(() => console.log('catch', new Date()));
    setTimeout(() => {
      console.log('5 seconds passed');
    }, 5000);
  }, []);

Output:

Screenshot 2021-12-09 at 19 29 06

After applying https://github.com/facebook/react-native/issues/29608#issuecomment-884521699:

Screenshot 2021-12-09 at 19 30 32

Snack, code example, screenshot, or link to a repository

No response

Skip to this comment for the actual cause: https://github.com/facebook/react-native/issues/32730#issuecomment-990764376

andreialecu commented 2 years ago

Note that this doesn't repro on the emulator, this is the output on the emulator:

 LOG  confirm x 2021-12-09T18:44:08.229Z
 LOG  then x 2021-12-09T18:44:08.607Z
 LOG  5 seconds passed

❗ The device where this is happening, and where I initially noticed it on is an Oppo A72 (CPH2067) running Android 11

andreialecu commented 2 years ago

I'm also not able to reproduce it on a Samsung Galaxy Tab A (SM-T290) device.

danmaas commented 2 years ago

In my case, this issue also affected only some devices. I think it has something to do with the network stack or driver on the device. Or how they interact with IPv6 devices out on the network.

andreialecu commented 2 years ago

Interesting. I left it running for a while and it actually seems to time out on the connection:

 LOG  confirm x 2021-12-09T19:00:10.557Z
 LOG  5 seconds passed
 LOG  then x 2021-12-09T19:04:23.177Z

It took 4 minutes to time out.

I just noticed that the device does not have IPv6 connectivity at all based on https://test-ipv6.com/

Might be a router problem on my end, usually I have IPv6. However, I'm not sure why it would even attempt to use an IPv6 address considering it does not have IPv6 connectivity at all.

andreialecu commented 2 years ago

Very interesting, I rebooted the router and now I have IPv6 again.

❗ The Oppo now no longer has the error. It works perfectly:

 LOG  confirm x 2021-12-09T19:28:07.714Z
 LOG  then x 2021-12-09T19:28:09.217Z
 LOG  5 seconds passed

I'm not sure what to make of this. 🤔

vamper424 commented 2 years ago

Yo mismo me encargo de resolver lo de sus vidas, ni hay ni habrá problema alguno, temor alguno o alguien que me agreda saludo sigan igual y cada vez mejoren más, ok. Si se puede. Atte: Jesus Francisco Urias García.

vamper424 commented 2 years ago

Y nuevamente tu gol, esta vez artificial.

vamper424 commented 2 years ago

Jajaja mediocres ellos, sobre ellos.

andreialecu commented 2 years ago

The root cause of this issue seems to be the lack of "Happy Eyeballs" in the underlying okhttp library that react-native is using on Android.

Here's the issue in okhttp that is tracking this: https://github.com/square/okhttp/issues/506

I think most of the random networking Android slowdowns reported all over the ecosystem are probably related. This is the one I ran into in particular: https://github.com/aws-amplify/amplify-js/issues/5539

Possibly related issues: https://github.com/facebook/react-native/issues/32467 https://github.com/facebook/react-native/issues/29782

This thread also mentions this exact same issue: https://github.com/facebook/react-native/issues/28283#issuecomment-736941578

I believe a PR to implement Happy Eyeballs into okhttp shouldn't be too hard, if someone would be able to contribute one.

From my understanding a TCP connection to both the IPv6 and IPv4 address needs to be attempted, and the first one to respond is used, while the other one is closed.

This would solve this ubiquitous Android networking problem once and for all.

marcesengel commented 2 years ago

@andreialecu would you be so kind to review my PR?

andreialecu commented 2 years ago

@marcesengel I think this should be addressed in the underlying okhttp library. Perhaps you might want to contribute it there instead?

There is some movement there that indicates this problem might finally get some attention.

See: https://github.com/square/okhttp/pull/7009

marcesengel commented 2 years ago

@andreialecu looking at the old issue regarding this a change on their end seemed unrealistic without a new major release (due to interceptors needing to be raced, see for example this comment, which reads "It's an invasive feature to add because there's extra complexity. If you need it, don't wait for us."), which in turn would mean that it probably takes some time to adapt this to react-native.

I don't see any issue with implementing it for RN in the meantime. I'd have added it to okhttp if it didn't look like it'd take ages to catch on - in case it ever did.

andreialecu commented 2 years ago

I see. That makes sense.

Not sure who would need to take a look at this. I can't review it myself since I'm not familiar with the RN code dealing with this.

Perhaps @yungsters would be able to take a look?

cortinico commented 2 years ago

@marcesengel I think this should be addressed in the underlying okhttp library. Perhaps you might want to contribute it there instead?

Essentially this. Please take a look at my answer here: https://github.com/facebook/react-native/pull/33045#issuecomment-1030221142 as I believe that's the most feasible solution we could take at this stage.

marcesengel commented 2 years ago

I've built a patched version for use until OkHTTP 5.x is stable (now that since the first of February the implementation of Happy Eyeballs is confirmed). It's available as marcesengel/react-native#0.66-patched-3.

cortinico commented 2 years ago

Cross-posting here as I realized I accidentally left the comment on the PR instead of the Issue: https://github.com/facebook/react-native/pull/33045#issuecomment-1030428147

Would you be up for opening a draft PR with the OkHTTP bump to 5.x alpha? It could help us spot early breaking changes and we could test it against the internal infra to make sure nothing breaks. (That's also a task that someone else in the community can pickup if they wish).

marcesengel commented 2 years ago

I'd be open to do the transition once major version 5 is stable, as for now there's a fix and I wouldn't want to change something multiple times in case more breaking changes are introduced before v5 hits stable. If somebody else would like to do it beforehand and take the risk in order to make it available faster that of course would be much appreciated :+1:

Do you know if there's an ETA for the stable release?

cortinico commented 2 years ago

Nope but you can subscribe here to get updates

andreialecu commented 2 years ago

Do you know if there's an ETA for the stable release?

The upstream PR was just merged. I believe a stable release is not done yet. According to https://github.com/square/okhttp/issues/6954#issuecomment-1046608631 a new alpha is pending release.

Also, as mentioned here: https://github.com/square/okhttp/issues/506#issuecomment-1046824772

If you're prepared to deal with small API changes between now and 5.0 final you can put it into production today. For OkHttp ‘alpha’ means ‘API instability’ but the code is extremely stable.

yschimke commented 2 years ago

+1 any feedback on how the 5.0.0-Alpha05 version is working would be great. Or try using it and seeing what you need in terms of observability. Now is a good time to shape the final result.

marcesengel commented 2 years ago

I think I can find some time to create a PR this week. However as of now I really am not aware of how big the API changes are, so I can't say if I'll finish it this week.

Does somebody know if switching to major version 5 will affect Fresco?

yschimke commented 2 years ago

It should be a drop in replacement, it's backwards compatible down to 3.X. But an extra call to enable

OkHttpClient client = new OkHttpClient.Builder()
    .fastFallback(true)
    .build();
cortinico commented 2 years ago

I think I can find some time to create a PR this week.

That's great 👍 Please ping me over the PR once you're done with it.

Does somebody know if switching to major version 5 will affect Fresco?

As @yschimke mentioned, we should not need any action on the Fresco side of things. I can loop them over if this is needed anyway.

marcesengel commented 2 years ago

Small update: since I'm working on an M1 MacBook I had to first fix Buck for M1 chips and created a PR there https://github.com/facebook/buck/pull/2684 (for some reason Docker doesn't work as well and this looked easier to fix). NDK 21.4 wasn't supported as well, as it ships clang 9.0.9 instead of 9.0.8.... I was wondering how the builds at Facebook are working?

Will be able to start work on the PR tomorrow 👍

cortinico commented 2 years ago

I was wondering how the builds at Facebook are working?

I'm not on a M1 so I haven't faced those issues at all. Thanks for pointing them out and addressing them though 👍

marcesengel commented 2 years ago

Sorry for not putting it better - I wasn't meaning the M1 issues but the clang version included in the NDK, as RN is on NDK 21.4.xxx now which ships with clang 9.0.9 while Buck is looking for clang 9.0.8 vor every NDK major > 20.

Regardless I still have issues running the tests, as Buck tries to compile yoga with only -fno-omit-frame-pointer -O3 while in YGEnums.h for example constexpr is used, which would require at least C++ 11. When also passing -std=c++17 to fix this, I get String table does not end at end of file at react-native/buck-out/gen/ReactAndroid/src/main/jni/first-party/yogajni/jni#default,shared/libyoga.dylib 😖

I'd really like to get local unit tests working before I try to bump OkHttp3. Do you want me to create a new issue to tackle my problems with this?

Edit: On the official Clang website it says by default C++ 14 is used? Why am I having issues with this, maybe some Apple Clang specifics? https://clang.llvm.org/cxx_status.html

cortinico commented 2 years ago

I'd really like to get local unit tests working before I try to bump OkHttp3. Do you want me to create a new issue to tackle my problems with this?

Have you considered using the docker image instead?

marcesengel commented 2 years ago

Yes, sadly that didn't work. Using the image I got a SIGILL (Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled). According to some sources this can be because the CPU (or in this case the x86 emulator) is missing an instruction, which could be fixed using the java flag -XX:UseSSE=2. I tried supplying it using .buckjavaargs to no avail. Alternatively hsdis-amd64.so could just actually be missing, which is why I tried building the reactnativecommunity/react-native-android image on my machine. On arch amd64 this got stuck without showing any errors etc. on the step [javac] Compiling 109 source files to /buck/ant-out/src-gen/classes [writing all of this down that might have been due to the problem described above, maybe I should try building the container with -XX:UseSSE=2]. Then I tried building the, on my machine native, arm64v8 arch which completed the step the other arch was stuck at, but ultimately failed to install the android emulator due to missing binaries iirc.

Then I concluded: my machine has an emulator and everything set up, why don't I just do everything on the host? And then I ran into the issues described in my messages above. Thinking about it again the unit tests might not require an emulator, do they? If not I could emit the emulator from my local arm64v8 image and use it to execute the unit tests.

cortinico commented 2 years ago

I'm not sure if you even need buck to try this, honestly. You can do everything on the Gradle side of things (on Buck you'll have to bump the version number of OkHTTP).

I would replicate this commit and fix any build failures you might face via the CI or via a Gradle build: https://github.com/facebook/react-native/commit/d6db5c54640cef999b6922a10ceb85dede929b3f

marcesengel commented 2 years ago

Yup been thinking the same recently, will do so until everything is ready for M1. It also seems like it's a Buck issue with the bumped 3rd party libs as someone else ran into the same problem building something else. Will also look into this.

andreialecu commented 2 years ago

Here's an excellent write-up from Reddit about this issue: https://www.reddit.com/r/RedditEng/comments/v1upr8/ipv6_support_on_android/

As you can see, it can affect things on a large scale.

This really needs to be prioritized by the core react-native team, I can't imagine it's not affecting their own apps at Meta.

yschimke commented 2 years ago

At a guess, I'd expect Meta will be exclusively using proxygen or mvfst as their client.

cortinico commented 2 years ago

This really needs to be prioritized by the core react-native team

As I already mentioned above, there is nothing to handle here. You can bump your OkHTTP version to 5 alpha and you'll get the Happy Eyeball support out of the box.

Once 5 becomes stable, we'll be bumping the default OkHTTP version in the template.

andreialecu commented 2 years ago

I have this sorted out in my projects.

Still, I think it needs to be prioritized because almost all React Native Android apps will run into this problem sooner or later, and it's not an obvious issue.

As I already mentioned above, there is nothing to handle here.

I think something should be done about it to at least make it obvious to the community that their Android apps will not work consistently 100% of the time. See all the threads I linked, and the Reddit story.

andreialecu commented 2 years ago

And if I may rehash my previous comment:

Also, as mentioned here: https://github.com/square/okhttp/issues/506#issuecomment-1046824772

If you're prepared to deal with small API changes between now and 5.0 final you can put it into production today. For OkHttp ‘alpha’ means ‘API instability’ but the code is extremely stable.

hadeskers commented 1 year ago

Add below dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.11

  dependencies {
        // others dependencies 

        implementation 'com.squareup.okhttp3:logging-interceptor:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp-urlconnection:5.0.0-alpha.11'
  }
lukawolf commented 1 year ago

If you are using expo and your project is managed, you can not edit android/app/build.gradle yourself. Since I've not found any expo plugin, which would make the changes for you, I just made one: expo-android-app-gradle-dependencies

JKKholmatov commented 1 year ago

@hadeskers @lukawolf @andreialecu @cortinico I created an Expo app and run expo prebuild, how should I implement okhttp:5.0.0-alpha.11 version? is there any instruction? please help!

JKKholmatov commented 1 year ago

Add below dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.11

  dependencies {
        // others dependencies 

        implementation 'com.squareup.okhttp3:logging-interceptor:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp-urlconnection:5.0.0-alpha.11'
  }

It did not work for me, is there any other way to implement it?

irisjae commented 1 year ago

Add below dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.11

  dependencies {
        // others dependencies 

        implementation 'com.squareup.okhttp3:logging-interceptor:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp:5.0.0-alpha.11'
        implementation 'com.squareup.okhttp3:okhttp-urlconnection:5.0.0-alpha.11'
  }

May I ask, having applied this, it is necessary to further modify the OkHttpClientProvider by adding .fastFallback(true) https://github.com/square/okhttp/issues/506#issuecomment-1024256588 , and setting React Native to build from source https://reactnative.dev/contributing/how-to-build-from-source#update-your-project-to-build-from-source ?

yschimke commented 1 year ago

fastFallback is the default in alpha.11 already

https://github.com/square/okhttp/blob/2344b87c59456088ed1346221d45648634f7f0c8/CHANGELOG.md?plain=1#L12C1-L12C1

irisjae commented 1 year ago

Strange, I'm still having issues... Fetch with IP works, but not with domain names. Let me look further.

irisjae commented 1 year ago

I've gotten to the bottom of my issue (hopefully), leaving it here in case it helps anyone. As I reported, after applying the Happy Eyeballs patch, I still had issues.

Fetching with IP seemed to be more reliable than with domain names. Still, with more systematic testing, I found that even when fetching via IP, my app exhibited stochastic hanging, just less frequently. Still seeing stochasticity, I thought to adjust the connection pool, so I set my app to build from source for Android, and modified OkHttpClientProvider. I knew that OkHttp kept some kind of connection pool, and assumed it was fixed in size. As my app had some long-running HTTP requests, I thought that OkHttp was waiting for the long-running requests to complete before allowing my new requests to use them from the pool. Assuming 'maxIdleConnections' was the number of connections kept in the pool, I turned the connections up. It was to no avail.

Reading more about the connection pool, it seems like instead of keeping a fixed number of connections, OkHttp tries to keep the number of connections close to Number of connections in use + Number of idle connections. It seems that OkHttp tries to somehow place 'similar' requests on to a shared connection to minimise time reestablishing connection, but on the other hand https://github.com/square/okhttp/issues/4354#issuecomment-435463922, sharing may cause failures.

I thought that perhaps with long-running requests in my app, I would be better served with no connection pool. This can be effectively done by setting 'maxIdleConnections' to 0.

My randomly hanging fetches ceased.

Edit: Attempting to disable the Connection Pool did not eliminate the issue.

As of now, I believe that the culprit is OkHttp's connection sharing somehow breaking things. For now, I removed my patch on OkHttpClientProvider.java, and instead applied a patch on NetworkingModule.java, which violently creates a separate OkHttpClient for each request and destroys the client after use (against all the advice of OkHttp), to eliminate any connection sharing. Nothing seems to hang randomly anymore so far (hopefully thats it...).

i have to say, it frustrates me greatly that the network stack on React Native Android is causing me so much trouble and tinkering with network code, when all I'm using is plain fetches with JSON. I wonder if some of the default configurations should need further improvement for React Native Android - contrasting with the networking on iOS, which has worked flawlessly.

Edit: For the record, both disabling the connection pool and violently preventing connection sharing were crucial to making network connections work reliably on low-end devices. Removing either of the patches reintroduces the issue.

yschimke commented 1 year ago

OkHttp 5 has a connection listener you can configure on the pool. It should expose all this so it's not surprising.

irisjae commented 1 year ago

How might the connection listener help?

yschimke commented 1 year ago

Maybe not solve the problem. But let you see what's going on, when connections are reused, released, fail.

consuelo-sanna commented 10 months ago

I have this sorted out in my projects.

Still, I think it needs to be prioritized because almost all React Native Android apps will run into this problem sooner or later, and it's not an obvious issue.

As I already mentioned above, there is nothing to handle here.

I think something should be done about it to at least make it obvious to the community that their Android apps will not work consistently 100% of the time. See all the threads I linked, and the Reddit story.

Good morning, my project bump on this issue since Android 14 officially come out and it took me one week before being able to connect the dots and realize it was an IPv6 problem and end up here, so I totally agree that this it's not an obvious issue and should be at least pointed out somewhere in the documentation (if there is, I could not find it) or on troubleshooting/common issues. The app I work on is for US people only, so in order for me to test it I need to go through a VPN and this made me impossible reproduce the issue since the one I chose does not support IPv6 and this made it difficult to debug

Since Android 14, every user of our app based in the USA could not use the app since the first api call instead of taking less of a second it took more than 2 minutes, I have a reproducible repo with latest RN version here if you are interested.

In order to reproduce the issue, the only thing you have to do is use an Android 14 phone with data connection and an ISP provider that supports IPv6, Verizon and AT&T for example

The only thing that solved it, was leveraging okhttp3 okhttp:5.0.0-alpha.12 (alpha 11 did not solved for us) OR patching following danmaas's example here https://gist.github.com/danmaas/c60af5fed9f55d2bc616ce302696540d

I really think this issue should be more visible, especially since Android 14 is out now and I think more and more people/apps will start experiencing this, let me know if there's something I can do to help with that, I was not sure opening a new issue on the repo was a good idea or not since there are already quite a few on the argument, like:

https://github.com/facebook/react-native/issues/28022 https://github.com/facebook/react-native/issues/28038 https://github.com/facebook/react-native/issues/29608

shubhkhanna commented 7 months ago

For me it solved the problem for IPv6 support on android.

dependencies {
      // others dependencies 

      // define a BOM and its version
      implementation(platform("com.squareup.okhttp3:okhttp-bom:5.0.0-alpha.12"))

      // define any required OkHttp artifacts without version
      implementation("com.squareup.okhttp3:okhttp")
      implementation("com.squareup.okhttp3:logging-interceptor")
      implementation("com.squareup.okhttp3:okhttp-urlconnection")
}

Add following dependencies to android/app/build.gradle to force use okhttp 5.0.0-alpha.12

"react": "18.2.0", "react-native": "0.71.11"

react-native-bot commented 1 month ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

JKKholmatov commented 1 month ago

@shubhkhanna

For me it solved the problem for IPv6 support on android.

How to implement this in Expo project?