facebookresearch / playtorch

PlayTorch is a framework for rapidly creating mobile AI experiences.
https://playtorch.dev/
MIT License
830 stars 101 forks source link

iOS release mode app crash when other apps are running in background and install-jsi.ts is imported #193

Open IainBerliner opened 1 year ago

IainBerliner commented 1 year ago

Version

0.2.4

Problem Area

react-native-pytorch-core (core package)

Steps to Reproduce

  1. Create a bare react-native repository with react-native init or similar.
  2. Install react-native-pytorch-core.
  3. Insert require('react-native-pytorch-core') into the app.tsx, just so it gets imported immediately.
  4. Install to real device in release mode.
  5. When installing it to a real device, it will start automatically. Make sure to close the react-native app in the background, so that the next time you start it, it has to reload everything.
  6. Launch a different app that does background processing in iOS. Do not close this out in the background.
  7. Try to relaunch the release mode react-native app, which you had closed out.
  8. App will crash with following error (even though it is release mode, this can be seen in xCode console, if connected to Mac by USB): "PlayTorchJSIModule not found. Maybe try rebuilding the app."

From what I can understand, this comes from a file, install-jsi.ts, which tries to install PlayTorchJSIModule. It does this by importing NativeModules, setting a variable equal to NativeModules.PlayTorchJSIModule, and then checking if this variable is null. If it is, it fails with the crash above, and this tends to happen when other apps are doing background stuff.

There are a number of free apps on the Appstore that perform egregious background processing immediately upon start, that can be used to reproduce this error nearly 100% of the time, simply by launching them first and then attempting to start the react-native app which you have built. I know the name of one, but I'm not sure if it would be appropriate to mention the name of a commercial app here, and include it in error reproduction steps, since this repository is after all managed by a corporation and I would understand if you would not want me to (even accidentally) advertise the name of a different company's app in a GitHub issue. Nonetheless, if the person(s) reading this issue have difficulty reproducing this error and are okay with me sharing the app name, I would be glad to do so.

Expected Results

react-native-pytorch-core can be required properly, without an app crash, even if other apps are running background processes on iOS.

Code example, screenshot, or link to repository

No response

IainBerliner commented 1 year ago

This was my first time submitting a GitHub issue. I'm sorry if it was vague or unhelpful. On that note, I fixed this issue, at least for myself, so I'll post the solution here.

There is a race condition in the iOS file that calls the installation of __torchlive__, PyTorchCoreJSI.mm. In this file, in the install function, it checks if the RCTCxxBridge.runtime exists yet, and if it does not, it simply does not install __torchlive__, which can lead to javascript throwing an exception and then a subsequent crash when importing this module. This race condition usually occurs when memory and CPU resources are stressed, I think.

The way I fixed this, I'm sure, was not very good. So I won't submit a pull request. But it worked for me, so I'll explain here. I added an else block that runs if RCTCxxBridge.runtime does not exist yet, in the install function of the PyTorchCoreJSI.mm file. In this else block I call [NSThread sleepForTimeInterval:0.1], and then I call [self install], essentially making the thread wait and repeatedly try to install __torchlive__ until RCTCxxBridge.runtime exists. The sleep call is just so it doesn't waste processing power checking faster than it needs to. At least for me, this extra 0.1-0.2 seconds is pretty unnoticable, but I'm sure in general terms this is actually pretty expensive with respect to extra startup time.

Edit: I found it was also necessary to modify install-jsi.ts, so it does not throw an exception if __torchlive__ does not exist yet, which can happen even with the above modification. I wrap the entire if (global.__torchlive__) block in that file in a javascript function that gets called immediately, and that simply returns if PlayTorchJSIModule == null, rather than throwing an exception. I then used a javascript async function loop that resolves when global.__torchlive__ exists, so that my other code waits for it to be installed before calling javascript functions from this library.

raedle commented 1 year ago

Hi @IainBerliner, thanks for reporting the issue, and thanks for circling back with a workaround!

Regarding communication in this repository: it follows Meta's OSS Code of Conduct, which define guidelines around communication. If the app doesn't violate the conduct, it should be fine to post, IMO :)

It might still be worth submitting a PR (even if it isn't merged). This would help us to better understand the underlying issue, and eventually see how your solution could be adapted to provide a fix for everyone

IainBerliner commented 1 year ago

Hi @IainBerliner, thanks for reporting the issue, and thanks for circling back with a workaround!

Regarding communication in this repository: it follows Meta's OSS Code of Conduct, which define guidelines around communication. If the app doesn't violate the conduct, it should be fine to post, IMO :)

It might still be worth submitting a PR (even if it isn't merged). This would help us to better understand the underlying issue, and eventually see how your solution could be adapted to provide a fix for everyone

OKAY. I submitted a pull request with my workaround. It was automatically rejected because I didn't sign the Facebook contributor agreement - but from what you said, I assume this is fine. The name of the app is WEBTOON. I suppose I should also mention that launching it isn't necessarily enough to recreate the crash. I found, after additional experimentation, that I also needed a bunch of other apps open in the background too. So WEBTOON, or other apps performing background processing upon start, act sort of like "the straw that broke the camel's back," so to speak, when the user is already stressing memory.

I don't entirely understand why my workaround works, either. My guess is that some of the Javascript code in this module somehow gets run before the main bundle is run, and then a second time after too: but my understanding of react native is shallow and I don't know how that would be possible. It seems like for some reason the native module that is used to install __torchlive__ in the install-jsi.ts is only NOT null BEFORE the main bundle is run, and afterwards it is ALWAYS null, hence if __torchlive__ is not already installed BEFORE the main bundle runs, it tries to run the install steps and raises an error because the native module that is supposed to do that is null. Plus, this usually happens when, in the PyTorchCoreJSI.mm file, it simply does not try to install __torchlive__ because RCTCxxBridge.runtime does not exist yet.

So, my solution was to just make the native side repeatedly try installation until RCTCxxBridge.runtime exists, to just do nothing and not raise an exception if the native module that is supposed to install __torchlive__ is null, and on the Javascript side use an async function that can be awaited and that only resolves when __torchlive__ is a member of the global object.

remod commented 1 year ago

Hello @IainBerliner ,

Thank you very much for the workaround! I've experienced the same issue in my builds, I can confirm that after applying the workaround to my own app the problem disappeared. 🥳

BTW, it looks like https://github.com/facebookresearch/playtorch/issues/201 and https://github.com/facebookresearch/playtorch/issues/209 are related.

PiyushDatta commented 1 year ago

Hey, I'm running into the same issue on ios:

npx expo start
...
iOS Bundling complete 1696ms
 ERROR  Error: PlayTorchJSIModule not found. Maybe try rebuilding the app., js engine: hermes

Whats the workaround? I'm not able to find the file PyTorchCoreJSI.mm.

    "expo": "~48.0.15",
    "react-native": "0.71.7",
    "react-native-pytorch-core": "^0.2.4",
  "scripts": {
    "start": "expo start --dev-client",
    "android": "expo run:android",
    "ios": "expo run:ios",
    "web": "expo start --web"
  },
PiyushDatta commented 1 year ago

Ahh its this file right? https://github.com/facebookresearch/playtorch/blob/main/react-native-pytorch-core/ios/Jsi/PyTorchCoreJSI.mm

remod commented 1 year ago

@PiyushDatta Yes, see pull request https://github.com/facebookresearch/playtorch/pull/198