facebook / hermes

A JavaScript engine optimized for running React Native.
https://hermesengine.dev/
MIT License

RangeError: Maximum callstack size exceeded with many modules in single bundle #1448

Open mikeduminy opened 1 week ago

mikeduminy commented 1 week ago

Bug Description

We are unable to open our full app in development, which runs a single bundle with over 40,000 modules. In production we are using Re.Pack which enables code-splitting and does not run into this problem.

In this PR the default JS register count was doubled - https://github.com/facebook/hermes/pull/923. Sadly we are hitting the same error again.

Note that the error is not reproducible using JSC in our reproduction repo (see below).

Environment

Hermes git revision (if applicable): unknown, tied to RN versions below
React Native version: 0.72.12 (our app), 0.74.2 (reproduction app)
OS: OSX
Platform (most likely one of arm64-v8a, armeabi-v7a, x86, x86_64): unknown, this occurs in the iOS simulator and Android emulator

Steps To Reproduce

I've been able to mostly* reproduce it in this repo - https://github.com/mikeduminy/rn-reproducer.

* The error we're experiencing in our app is "RangeError: Maximum callstack size exceeded - no stack" whereas the error in this reproduction repo is simply "RangeError: Maximum callstack size exceeded" so I'm not 100% certain it has the same cause.

Steps to reproduce:

  1. Clone above repo and install dependencies
  2. Generate 50,000 modules to simulate a large app (note: no lazy imports)
  3. Run the app on iOS simulator and Android emulator
  4. Observe the red screen (note, you may need to close and re-open the app to see this)


The Expected Behavior

An unhandled JS exception should be thrown, but it should not be a call stack size exceeded error.

To test whether the call stack error came from executing the entrypoint module of the bundle (the last few lines of the bundle), I added a forced error just before that.

serializer: {
    getRunModuleStatement: moduleId =>
      // If we see this error, it means the module was fully parsed
      // If we see another error, it means the problem occurred during the bundle parsing
      `throw new Error("before module execution: ${JSON.stringify(
        moduleId,
      )}"); __r(${JSON.stringify(moduleId)});`,
  },

If you see this error, Hermes has executed the whole bundle up to the entrypoint. If you see the max call stack error instead, the forced error was never reached, which is the problem we are facing in our app: the bundle is not fully executed / loaded.

neildhar commented 1 week ago

Hey @mikeduminy, could you share more about why you suspect a bug in Hermes? It is likely that loading the bundle just requires a lot of deep recursion, which exhausts Hermes' register stack. Release mode builds also enable optimisations, which means that you may not observe the bug in release mode because individual stack frames are smaller.

Some other things you could try:

  1. Increasing the register stack size further by configuring it in the Hermes RuntimeConfig.
  2. Creating and running a bundle locally with the hermes CLI tool, which should give you better visibility into what is happening.

mikeduminy commented 1 week ago

@neildhar I thought about recursion being the problem which is why I investigated the resulting bundle and found that metro is basically just declaring a bundle of modules (not executing them), then executing the entrypoint. For example:

(function(){ /* load metro runtime */ })()
(function(){ /* load react-native stuff */ })()
__d(function() { /* module code */ }, 0, [], "./index.js")
__d(function() { /* module code */ }, 1, [0], "./anotherModule.js")
/* ... more declaration calls */
throw new Error("before module execution: 0")
__r(0) // begin evaluating the modules (entrypoint)

The __d function is fairly simple, just adding the unevaluated modules into a map. The code has not changed in over a year.

Since execution of the bundle fails before the entrypoint is run, and the stack trace is empty, I believe the problem is the number of available registers rather than recursion. Since it works on JSC I would advocate for an increase in the default number of allocated registers. We cannot easily increase this number as consumers of React Native.

In our app we can actually fix this problem by randomly commenting out imports until it works. It doesn't seem to matter what we comment out, just that the resulting module count falls below some unknown threshold.

neildhar commented 1 week ago

Interesting, if there isn't significant recursion, that would suggest that the global function itself has such a large stack frame that it exhausts the stack. Have you been able to construct such a bundle that overflows in the Hermes CLI? That will make it much easier for us to investigate what might be going on.
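One way to build a bundle of that shape for the CLI is to generate it. The sketch below is an assumption-laden stand-in, not the reproduction repo's generator: `generateBundle`, the stub `__d`/`__r` runtime, and the module names are all hypothetical, and whether a bundle this simple triggers the same overflow is untested.

```javascript
// Builds a synthetic bundle: a stub __d/__r runtime followed by `count`
// top-level __d(...) calls, so every declaration sits in the global function.
function generateBundle(count) {
  const lines = [
    'var __modules = {};',
    'function __d(factory, id, deps, name) { __modules[id] = { factory: factory, deps: deps, name: name }; }',
    'function __r(id) { return __modules[id].factory(); }',
  ];
  for (let id = 0; id < count; id++) {
    // Each module lists the next one as a dependency; the last has none.
    const deps = id + 1 < count ? `[${id + 1}]` : '[]';
    lines.push(`__d(function() { return ${id}; }, ${id}, ${deps}, "./module${id}.js");`);
  }
  lines.push('__r(0);'); // run the entrypoint last, as in the real bundle
  return lines.join('\n');
}

// Small sanity check; the count needed to overflow Hermes is not known here.
const bundle = generateBundle(3);
```

Writing the output of a much larger `count` to a file (for example with Node's `fs.writeFileSync`) would give a standalone script to run against the engine.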

mikeduminy commented 1 week ago

That's interesting, maybe modules[moduleId] = mod; is overflowing the number of properties on a single object 🤔

I've been able to replicate the overflow in my repro app so using that to generate a bundle that fails is totally possible. Let me know if you want me to make it super easy to generate the bundle and I'll prep a script.

The other thing to note is the loaded bundle is not run through the hermes transform as it is failing in dev mode. I have not tested it in production but can do so and see if the problem persists.

Can you help me with the correct commands to run to execute the bundle in the hermes CLI?

neildhar commented 1 week ago

We have instructions for building and running Hermes here.

Alternatively, you could try running against a precompiled copy of Hermes downloaded from the releases page (although they haven't been updated for some time).

Make sure that you run with -O0 (to disable optimisations) and potentially specify -max-num-registers to ensure that you're getting a representative run.

tmikov commented 1 week ago

@mikeduminy it might also be very useful if you could upload somewhere the final JS bundle of the repro app (before it is compiled by Hermes). Then we can compile it and examine the generated bytecode.

(As a rule, it is very difficult for us to reproduce React Native builds, but having the final JS bundle goes a long way)

mikeduminy commented 1 week ago

I have added both working and broken bundles for Android and iOS here. The difference between the working and broken bundles is a single additional module.

I also conducted some analysis of each bundle only to discover that the breaking point for both platforms is when the number of modules exceeds 50357.
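The thread does not say how this threshold was located. One common approach is a binary search over the module count, sketched below. The function names are hypothetical, the predicate is a stand-in that simply mirrors the figure reported above, and the search assumes failure is monotonic (once a count fails, every larger count fails too).

```javascript
// Finds the smallest count in (low, high] for which fails(count) is true,
// assuming fails(low) is false, fails(high) is true, and failure is monotonic.
function findFirstFailingCount(low, high, fails) {
  while (high - low > 1) {
    const mid = low + Math.floor((high - low) / 2);
    if (fails(mid)) {
      high = mid; // the first failure is at mid or below
    } else {
      low = mid; // the first failure is above mid
    }
  }
  return high;
}

// Stand-in predicate for illustration only; a real one would build a bundle
// with `count` modules and run it on the target engine.
const bundleFails = (count) => count > 50357;
const firstFailing = findFirstFailingCount(1, 100000, bundleFails);
```

With a range of 100,000 candidates this needs about 17 bundle builds rather than a linear sweep.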

mikeduminy commented 1 week ago

> We have instructions for building and running Hermes here.
>
> Alternatively, you could try running against a precompiled copy of Hermes downloaded from the releases page (although they haven't been updated for some time).
>
> Make sure that you run with -O0 (to disable optimisations) and potentially specify -max-num-registers to ensure that you're getting a representative run.

Assuming you don't need me to do this anymore? Please let me know :)

neildhar commented 1 week ago

@mikeduminy Not yet at least, hopefully the bytecode output of the bundle is enough to pinpoint what is going on.

neildhar commented 1 week ago

I found the problem: the size of the global function causes the register allocator to hit a memory limit and fall back to a fast but very inefficient allocation strategy. As a result, the global function ends up single-handedly overflowing the stack. We'll work on a fix.