[Performance] Class based HOC implementations are costly

kidroca commented 3 years ago

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!

What performance issue do we need to solve?

inefficient React component rendering

What is the impact of this on end-users?

app boot time
chat switch time

List any benchmarks that show the severity of the issue

Start the web app and be on a chat
Open the dev console
Go to the Performance tab
Start capturing data
Switch to another chat
Wait a few seconds for loading to settle
Stop capturing data
Review results

Task 1 - is processing the click and transitioning to another chat - it totals to 868ms for me (Starting on reportID: 71955477)
A huge part of Task 1 is dedicated to unmounting components - 546ms. All the little needles under unmountHostComponents
- About ~450 of those are withWindowDimensions HOCs
- And around 80 are withOnyx HOCs

Profile data (you can load this in Chrome/Performance): https://u.pcloud.link/publink/show?code=XZwvKwXZpXUhxs7Vdr7kaco3HDxznYmcP7sX

Proposed solution (if any)

HOC usages should be gradually reduced as much as possible They cause 1+ component for each component that uses them

I've described how to change withWindowDimensions here: https://github.com/Expensify/App/issues/2326 (DimensionsContextProvider) This will leave withWindowDimensions usages throughout the app unchanged, but they will at least not be class based components and would not run class component lifecycle

ReportActionItem should be updated not to be HOC based (if possible) as they are the most rendered component They can receive window related props from the parent renderItem as well as Onyx related props

List any benchmarks after implementing the changes to show impacts of the proposed solution (if any)

Note: These should be the same as the benchmarks collected before any changes. This is not at all the final result, but it does show some improvement after converting the withWindowDimensions wrappings not to be class based

Platform:

Where is this issue occurring?

Web (Affected more due to higher count of HOCs - popups, hoverable items etc...)
iOS
Android
Desktop App (Affected more due to higher count of HOCs - popups, hoverable items etc...)
Mobile Web

Version Number: 1.0.85-0 Logs: https://stackoverflow.com/c/expensify/questions/4856 Notes/Photos/Videos: Any additional supporting documentation Expensify/Expensify Issue URL:

View all open jobs on Upwork

MelvinBot commented 3 years ago

Triggered auto assignment to @MonilBhavsar (Engineering), see https://stackoverflow.com/c/expensify/questions/4319 for more details.

kidroca commented 3 years ago

The other large portion of execution time is taken by sendDataToConnection calls It will be the focus of a separate ticket - #4592 sendDataToConnection should be the easier problem to address as it isolated to Onyx.connect

kidroca commented 3 years ago

cc @marcaaron

marcaaron commented 3 years ago

Should we maybe limit the scope of this to just refactoring withWindowDimensions and create a separate ticket for ReportActionItem?

I'd also be curious to see the effects on chat switching since we have done the test for web but not Android. I'd be curious to see what it does for us on the native side. Is it possible to do a similar trace? I don't have much experience with the hermes debugger but I thought maybe it would be possible to profile there?

kidroca commented 3 years ago

I thought it would be possible to do this with the Hermes debugger in Flipper, but starting the profiler there seems to hang with "Initialising ..." For Android there's another way - it's part of the development menu. It would create a file, like the one shared here, that you can load inside Chrome I'll try it and post results

Since all of the hoverable popups should be noop on native, there would be less withWindowDimensions components

After analysing this more the HOCs might only be 30-50% to blame, and the rest is due to the many instances of components that use them. Most of the withWindowDimensions are used by popups and modals When we switch to a new chat all the components in the old chat that have componentWillUnmount would run it For example ContextMenuItem uses willUnmount and so there will be a lot of unmounting going on as each action item would have at least one menu with multiple menu items - that seems to be what's happening in the screenshot I've shared, but originally I've only traced it to withWindowDimensions

Yes the current scope is whatever we can do without massive changes and so far that seems only the withWindowDimensions refactor

We had a test where all popups we removed. Perhaps for a future ticket we can make the same snapshot captured here and see if this big bar of unmounts gets smaller with no popup components

marcaaron commented 3 years ago

We can just try the withWindowDimensions idea and see what happens - doesn't seem hard to do and we can eliminate it as a potential improvement. Worst case scenario some component will have less overhead. 🤷

marcaaron commented 3 years ago

Managed to get the profiler to work on Android and adding some tips for that in this PR if it helps at all

https://github.com/Expensify/App/pull/4617/files

MonilBhavsar commented 3 years ago

Marc, I assigned you as you are having context and working on this

MelvinBot commented 3 years ago

Whoops! This issue is 2 days overdue. Let's get this updated quick!

kidroca commented 3 years ago

ATM it's not possible to benchmark this with the new performance metrics, as TTI happens before the report screen mounts, where the most withWindowDimension components are

We can add marks for chat switches and then do a couple of benchmarks - switch to large, medium, with images, etc... Then do the same after the HOC is updated with the suggested changes

marcaaron commented 3 years ago

ATM it's not possible to benchmark this with the new performance metrics, as TTI happens before the report screen mounts, where the most withWindowDimension components are

Got it. Sounds like this probably will not improve the TTI then. But that's fine - it maybe will help the chat switching as you've mentioned. For that, it might be good to have the Flipper metrics we are discussing in the other issue.

kidroca commented 3 years ago

I think it might be a good idea to add mark, measure to libs/Performance.js and have them implemented using react-native-performance when canCapturePerformanceMetrics() is true, while otherwise they should be empty functions. This way we can hook them wherever we need to without constantly checking

Also seeing the plugin code, it might be possible to do something like

    new PerformanceObserver((list) => {
      list.getEntries()
          .filter(entry => entry.name.endsWith('End'))
          .forEach(entry => {
             const end = entry.name;
             const name = end.replace(/End$/, '');
             const start = `${name}Start`;
             performance.measure(name, start, end);
          }
    }).observe({type: 'mark', buffered: true});

This would allow us to only put start/end marks inside the components we'd like to meter e.g.

add a ReportScreen_Start mark inside SidebarLinks.onSelectRow
add a ReportScreen_End mark inside ReportActionsView.recordTimeToMeasureItemLayout

marcaaron commented 3 years ago

while otherwise they should be empty functions. This way we can hook them wherever we need to without constantly checking

Yep, that sounds like a good idea.

This would allow us to only put start/end marks inside the components we'd like to meter

I like that too since it's one less thing to worry about. What if we did that, but also abstracted this detail so that we can just do something like...

Performance.start(CONST.TIMING.SOMETHING); // adds the `_start`
Performance.end(CONST.TIMING.SOMETHING); // adds the `_end`

kidroca commented 3 years ago

👍 Even better I can submit a PR for this tomorrow

kidroca commented 3 years ago

Submitted the PR It should allow us to see any improvements from the suggested changes on the Flipper timeline

MelvinBot commented 3 years ago

Uh oh! This issue is overdue by 2 days. Don't forget to update your issues!

marcaaron commented 3 years ago

Gonna drop this to weekly for now. We are adding more metrics now which should make it easier to test theories.

kidroca commented 3 years ago

I still think many HOCs could be bad, I'll re-capture data now that we have a new way of measuring performance (CAPTURE_METRICS)

I'll again measure chat switching (we have perf marks for those)

"Before" case would be with all window dimension HOCs
A simulated "After" case would be with the dimension HOCs stubbed to return the wrapped component with fixed dimension values

Both pre/post tests should happen after fresh login (storage is cleared on log out) and without scrolling to the past (which would add more data to storage and force you to pre-render more)

MelvinBot commented 3 years ago

This issue has not been updated in over 15 days. eroding to Monthly issue.

P.S. Is everyone reading this sure this is really a near-term priority? Be brave: if you disagree, go ahead and close it out. If someone disagrees, they'll reopen it, and if they don't: one less thing to do!

kidroca commented 3 years ago

Run some benchmarks on Android, it seems it's almost unaffected by the withWindowDimensions HOC

Setup

Device: Samsung Galaxy S7 Edge (2016)
OS: Android 8.0
hash: https://github.com/Expensify/App/commit/ef46b127097a06c75f254684bf7d3c6df2cd0703
build: Android release build (bundled to byte code)

Before

	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Avg
Native Launch	243.0ms	236.0ms	256.0ms	229.0ms	232.0ms	227.0ms	208.0ms	193.0ms	236.0ms	228.9ms
Run JS Bundle	718.0ms	680.0ms	658.0ms	642.0ms	749.0ms	642.0ms	647.0ms	651.0ms	695.0ms	675.8ms
TTI	3070.9ms	3104.4ms	2955.4ms	3141.0ms	3331.1ms	3010.0ms	3164.1ms	3018.8ms	3046.3ms	3.09sec

After

	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Avg
Native Launch	231.0ms	196.0ms	230.0ms	219.0ms	243.0ms	252.0ms	228.0ms	281.0ms	228.0ms	234.2ms
Run JS Bundle	653.0ms	644.0ms	644.0ms	670.0ms	650.0ms	736.0ms	701.0ms	659.0ms	706.0ms	673.7ms
TTI	2903.1ms	3031.6ms	3140.3ms	3048.2ms	3183.4ms	3064.0ms	3098.1ms	3107.9ms	3052.6ms	3.07sec

Apart from that I've measured chat switch time, where I couldn't see any improvement

I also captured memory usage:

before: 363MB
after: 359MB
measured about 30sec. after init, the app settles on the listed memory usage value.

kidroca commented 3 years ago

I'm closing this one We can reopen if we find anything new

Expensify / App