facebook / hermes

A JavaScript engine optimized for running React Native.
https://hermesengine.dev/
MIT License
9.91k stars 641 forks source link

3x Performance drop vs JSC in some tasks #495

Open andreialecu opened 3 years ago

andreialecu commented 3 years ago

Bug Description

It appears that using formatjs Intl polyfills can be very slow while using Hermes. I forked @karol-bisztyga's repro from #471 to show an issue with formatting dates.

Hermes is up to 3 times slower than JSC, and it makes working with i18n/timezoned dates almost impossible. It takes around 5-20ms to format one single date(!).

The same benchmark using JSC instead of Hermes on Android seems about 2-3 times faster.

Hermes version: 0.37.0 React Native version (if any): 0.64 OS version (if any): Android Platform (most likely one of arm64-v8a, armeabi-v7a, x86, x86_64):

Steps To Reproduce

Here's a repro on both engines: https://github.com/andreialecu/hermes-repro/tree/jsc-intl https://github.com/andreialecu/hermes-repro/tree/hermes-intl

see: https://github.com/andreialecu/hermes-repro/blob/hermes-intl/App.js

JSC on M1 (Android Emulator):

{
"dateTimeFormat": [919, 175, 162, 158, 147, 159, 154, 149, 156, 148], 
"local": [11, 0, 0, 1, 0, 0, 0, 0, 0, 0], 
"setZone": [1381, 184, 172, 167, 155, 183, 150, 152, 151, 154], 
"toLocaleString": [517, 175, 173, 165, 158, 157, 147, 148, 152, 147]
}

Hermes on M1 (Android Emulator):

{
"dateTimeFormat": [393, 373, 369, 343, 341, 349, 372, 364, 347, 343], 
"local": [5, 1, 0, 0, 1, 1, 0, 1, 0, 1], 
"setZone": [401, 376, 376, 355, 355, 361, 389, 379, 364, 352], 
"toLocaleString": [375, 392, 368, 368, 342, 392, 367, 379, 348, 342]
}

Notice that the times above are using an M1 apple cpu, with an arm64 native emulator, so everything should be running native.

On a physical 2020 mid-level Android device the times are about 6-7 times slower:

Hermes on Oppo A72:

{
"dateTimeFormat": [2006, 2016, 2005, 2009, 2004, 2060, 2008, 2012, 2020, 2017], 
"local": [28, 3, 3, 2, 4, 2, 3, 3, 3, 3], 
"setZone": [2214, 2077, 2073, 2067, 2075, 2129, 2080, 2078, 2075, 2077], 
"toLocaleString": [2042, 2035, 2003, 2019, 2024, 2025, 2029, 2035, 2011, 2013]}

Related: https://github.com/facebook/hermes/issues/23#issuecomment-815906036

Huxpro commented 3 years ago

I haven't been able to take any closer looks, but the first thing we should look at is probably whether or not you are running the polyfilled Intl or the native Intl on JSC.

Most of the polyfills perform a feature detection in the first place and the Format.js does no exception: https://github.com/formatjs/formatjs/blob/65017cd5dc0b72b16e44daaf088a676c73c527c4/packages/intl-locale/should-polyfill.ts#L12

andreialecu commented 3 years ago

@Huxpro JSC for Android does not have native Intl support enabled at all as per: https://github.com/andreialecu/hermes-repro/blob/jsc-intl/android/app/build.gradle#L101-L112

Additionally, the polyfills are applied directly, without calling shouldPolyfill: https://github.com/andreialecu/hermes-repro/blob/323317e0b22e1682c769ff29dddb181afe956561/App.js#L11-L31

andreialecu commented 3 years ago

I added delete global.Intl prior to installing the polyfills, and as expected there is no difference on neither JSC or Hermes.

JSC benchmark on the physical device (Oppo A72):

{
"local": [16, 1, 0, 0, 1, 1, 1, 1, 1, 0], 
"setZone": [1083, 964, 942, 941, 943, 941, 937, 938, 939, 943], 
"toLocaleString": [1000, 964, 922, 924, 928, 927, 921, 928, 924, 921]
}

Meaning it takes 10ms to format one date, still slow even on JSC, although twice as fast as Hermes.

This is also due to some performance problems with formatjs (https://github.com/formatjs/formatjs/discussions/2813), but I think it serves as a good baseline for comparing the two engines, and figuring out ways to improve Hermes.

andreialecu commented 3 years ago

For completeness sake I also tried react-native-v8

Oppo A72 react-native-v8 jit off:

{
"local": [10, 2, 1, 1, 1, 1, 1, 1, 2, 1], 
"setZone": [1369, 1316, 1324, 1320, 1322, 1318, 1320, 1318, 1319, 1319], 
"toLocaleString": [1302, 1295, 1290, 1288, 1289, 1291, 1298, 1292, 1290, 1291]}

Oppo A72 react-native-v8 jit enabled:

{
"local": [10, 1, 1, 1, 6, 1, 1, 1, 1, 1], 
"setZone": [342, 205, 162, 153, 152, 137, 133, 134, 125, 131], 
"toLocaleString": [168, 154, 132, 134, 124, 123, 128, 127, 133, 129]}

Interestingly, the JIT enabled version of v8 seems to be extremely performant on this low end device. (15 times faster than Hermes)

https://github.com/andreialecu/hermes-repro/tree/v8-intl

Huxpro commented 3 years ago

@andreialecu Thanks for setting up and posting those benchmarks. Would you be interested in benchmarking Hermes native Intl performance? I'm curious how would that perform 😛 .

Here is a link of a latest hermes-engine.tgz from CircleCI which already has Intl included and turned on and it should run fine (use npm install <path/to/local/*.tgz> to override the hermes-engine npm package) with the RNTester app under the react-native repo. You can inject your benchmark into RNTester and build the app with ./gradlew packages:rn-tester:android:app:installHermesRelease.

Note that it's probably not going to work with you existing app on RN 0.64 but you could also try ;)

andreialecu commented 3 years ago

I tried to add the new hermes build to https://github.com/andreialecu/hermes-repro/tree/hermes-intl which is a very simple, starter, RN app.

I pushed a branch here: https://github.com/andreialecu/hermes-repro/commit/8ff4559604af4050b574ad9f8d693b6b552e1969 (using yarn resolutions to override it). If you want to try and see the crash.

Seems to crash on startup (both real device and emulator):

key_process
10:02:17.123
20002
DEBUG
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
10:02:17.123
20002
DEBUG
Build fingerprint: 'OPPO/CPH2067EEA/OP4C72L1:11/RKQ1.200903.002/1615286858013:user/release-keys'
10:02:17.123
20002
DEBUG
Revision: '0'
10:02:17.123
20002
DEBUG
ABI: 'arm64'
10:02:17.124
20002
DEBUG
Timestamp: 2021-04-14 10:02:17+0300
10:02:17.124
20002
DEBUG
pid: 19932, tid: 19997, name: mqt_js  >>> com.hermes_test <<<
10:02:17.124
20002
DEBUG
uid: 10281
10:02:17.124
20002
DEBUG
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
10:02:17.124
20002
DEBUG
Cause: null pointer dereference
10:02:17.124
20002
DEBUG
    x0  0000000000000000  x1  0000000000000000  x2  0000000000000000  x3  0000000000000008
10:02:17.124
20002
DEBUG
    x4  00000071f5db1cd0  x5  00000072fd9dd238  x6  00000071f5db1cc0  x7  00000000ff82eb50
10:02:17.124
20002
DEBUG
    x8  0000000000000024  x9  0000000000002974  x10 0000000000004001  x11 0000000000000000
10:02:17.124
20002
DEBUG
    x12 0000000000004100  x13 0000000000000000  x14 0000000000000030  x15 aaaaaaaaaaaaaaab
10:02:17.124
20002
DEBUG
    x16 00000071f89b7698  x17 00000072fd9ddd08  x18 00000071f7120000  x19 00000071fc1741a0
10:02:17.124
20002
DEBUG
    x20 00000071f7fc3e50  x21 00000071fc176838  x22 000000720e6e6ba0  x23 00000071fc3fef98
10:02:17.124
20002
DEBUG
    x24 00000071f7fc6000  x25 00000071f7fc3d38  x26 00000071f7fc6000  x27 00000071fc171ec8
10:02:17.124
20002
DEBUG
    x28 00000071f7fc48a0  x29 00000071f7fc3dc0
10:02:17.124
20002
DEBUG
    lr  00000071f86ad65c  sp  00000071f7fc3ca0  pc  00000071f86ad68c  pst 0000000060001000
10:02:17.280
20002
DEBUG
backtrace:
10:02:17.280
20002
DEBUG
      #00 pc 000000000011968c  /data/app/~~Q4f15bGK0kggNahGoB7Dgg==/com.hermes_test--KmpGa39TcX--O6jK3JvVw==/lib/arm64/libhermes.so (BuildId: 8d42dff8dfe7754911a705caed888cf3dbda8b98)

I haven't yet tried it on RNTester to see if it would make any difference.

However, I was able to patch the formatjs polyfills and increase Hermes performance by 50x. The underlying problem is in the ecma402 spec as described in: https://github.com/tc39/ecma402/issues/563 which formatjs follows too closely. Note however that the performance improvement mentioned above is not Hermes specific. All engines execute the same code. So Hermes is still the slowest on that work-load.

I still see some performance degradation in other parts of the app however which seem execution speed related. I haven't yet been able to profile what they are (because Hermes crashes when trying to profile, as per #491) - but it seems that if v8 with jit performs 15x faster in tasks such as this one, there could be some sort of improvement possible to Hermes.

Edit: Tried RNTester but couldn't get it to compile for Android. The master version doesn't seem to work either, failing to compile the ndk step. I'm not sure what's going on, but didn't have more time to troubleshoot.

headfire94 commented 1 year ago

@andreialecu do you know if this is still an issue?

andreialecu commented 1 year ago

Hermes Android now has intl built in so this particular issue related to intl is fixed. I'm not sure if the performance drop in other CPU intensive tasks vs JSC still exists, haven't tested that.

headfire94 commented 1 year ago

I experience slow intl function under Hermes with simulator run on m1 Mac. On this screenshot profiler report shows each execution takes around 15ms. With the list of 30 items it's around 500ms. image

andreialecu commented 1 year ago

Constructing Intl formatters is an expensive operation even with native intl.

You should cache/memoize the Intl.NumberFormat (and others as well) and not recreate it whenever possible. This recommendation is also valid for web or nodejs Intl.

headfire94 commented 1 year ago

Just to note on IOS simulator i get 3x-4x faster result