mui / mui-x

MUI X: Build complex and data-rich applications using a growing list of advanced React components, like the Data Grid, Date and Time Pickers, Charts, and more!
https://mui.com/x/
4.33k stars 1.3k forks source link

[data grid] Scroll performance regression since v7.18.0 (probably) #15168

Open lauri865 opened 2 days ago

lauri865 commented 2 days ago

Steps to reproduce

Difficult to provide a reproduction right now. It's especially pronounced in one of hour heaviest datagrids where we draw a sparkline in a cell (on a canvas). In Chrome there are occasional dropped frames, but at least it's usable still, however, on Safari, CPU is completely clogged (=maxed out) and there's severe banding and delays in rendering, which completely breaks the UX.

Performance in v7.15.0 (but seeing similar performance in v7.17.0 as well) https://github.com/user-attachments/assets/9f06d8f8-f0cc-4213-9f8d-97152fc032b7

Performance in v7.18.0: https://github.com/user-attachments/assets/cf3f3163-0413-4e57-a36e-0742ce0d2f22

Especially bad frame from the last video (nothing even remotely close to that in the first video):

Screenshot 2024-10-29 at 16 12 44

Same browser (Safari v18.1), same datagrid, only difference is the mui/x-data-grid version.

If I would go crazy with scrolling on v7.18.0, my screen would go blank. The overall picture looks even worse (half of the grid looks blank at times) than the single column, but I cannot share that view due to the nature of the data displayed.

There might be small differences in performance between v7.15.0 and v.7.17.0 as well, but that would require more measurements, as I'm so far just eyeballing it. But completely usable still.

We decided to downgrade to v7.15.0 for now just to be safe, as anything later than v17.8.0 is unusable for us.

Current behavior

Janky scroll, dropped frames.

Expected behavior

Smooth scroll, no dropped frames.

Context

No response

Your environment

npx @mui/envinfo ``` Don't forget to mention which browser you used. Output from `npx @mui/envinfo` goes here. ```

Search keywords: scroll, performance, safari

lauri865 commented 2 days ago

@MBilalShafi, could it be due to the row spanning feature? We don't have it enabled, but could potentially open up new code paths regardless. Just based on quickly scanning the latest updates and commits around the affected version; sorry if I'm on the wrong path.

Edit: Another way the issue shows itself on all browsers is the scroll performance when hovering over the scrollbar – then even on Chrome it becomes extremely choppy.

romgrk commented 2 days ago

We really need you to provide a reproduction and to fill the npx @mui/envinfo section, it's near impossible to debug without having something runnable to work with. You're saying the issue is the datagrid; if it really is, then it should be possible to reproduce the issue by simulating an expensive component to replace the sparklines and showing a difference for 7.15 and 7.18.

lauri865 commented 2 days ago

I completely understand. However, we spent the past few days debugging why our app had become sluggish recently, and narrowed it down to the Datagrid (after spending a lot of time optimising our own components to no end). And we narrowed it down to even to a specific version – so I did not open this on a whim. One cause was #15158, but it turned out we had to migrate even further back to restore a good level of performance, especially in Safari. I wish there was a common ground kitchen sink example with measurements in place that would facilitate these kinds of reports and save time for everyone involed, as it does come down to actually measuring and comparing performance, rather than eyeballing it. And that takes extra time of course, and even then it's difficult to compare performance across the history of versions if all the developers have to concoct their own repros.

Anyways, here's the repro we have for now:

  1. v7.18.0 (slow) https://stackblitz.com/edit/react-vodudv
  2. v7.17.0 (solid) https://stackblitz.com/edit/react-vodudv-zq3jym

But make sure to check it in full screen as the preview window can barely fit any columns:

  1. v7.18.0 (slow) https://react-vodudv.stackblitz.io
  2. v7.17.0 (solid) https://react-vodudv-zq3jym.stackblitz.io

Video comparing the two affected versions in Safari 18.1 (if anything, I scrolled faster in the older version, yet the performance is all around better): https://github.com/user-attachments/assets/407a22d7-038d-4f32-9548-decfd6062a3e

Again, it's not only Safari that is affected from our testing, but it became borderline unusable in Safari, compared to e.g. Chrome where it resulted in perhaps a few dropped frames here or there, but still usable performance.

I believe the performance impact is not limited to scrolling, but also mounting, as our page transitions (between pages tht both have datagrids) sped up quite significantly when migrating back to the older version.

Environment (the repro is using React 18 and the problem exists there as well):

System:
    OS: macOS 15.1
  Binaries:
    Node: 20.11.0 - ~/.nvm/versions/node/v20.11.0/bin/node
    npm: 10.2.4 - ~/.nvm/versions/node/v20.11.0/bin/npm
    pnpm: 9.12.2 - ~/Library/pnpm/pnpm
  Browsers:
    Chrome: 130.0.6723.71
    Edge: 130.0.2849.56
    Safari: 18.1
  npmPackages:
    @emotion/react: ^11.13.3 => 11.13.3
    @emotion/styled: ^11.13.0 => 11.13.0
    @mui/material: ^6.1.5 => 6.1.5
    @mui/x-data-grid: 7.17.0 => 7.17.0
    @mui/x-data-grid-pro: 7.17.0 => 7.17.0
    react: rc => 19.0.0-rc-0bc30748-20241028
    react-dom: rc => 19.0.0-rc-0bc30748-20241028
    types-react:  19.0.0-rc.1
    typescript: ^5.6.3 => 5.6.3
lauri865 commented 2 days ago

On Chrome there is a ~10 FPS difference on average between the two versions, but in certain cases we've seen much worse (I set the WAIT_DURATION to 1ms, I think if we'd ramp it up a bit more, it gets worse): https://github.com/user-attachments/assets/2b1b7b42-acac-47b7-ad21-84a1043a10b0

michelengelen commented 2 days ago

could you maybe replace the avatar cell with something that does not include a network request? You could use a chart with random values instead.

lauri865 commented 2 days ago

I added a simple div instead. The problem persists.

But in general, I strongly believe that a performance-centric demo should include images + network requests. It's the most basic requirement, and something that can affect the baseline performance. Especially in my most hated browser (Safari).

michelengelen commented 2 days ago

That's true ... I just wanted to rule that out. Thanks for the update. @romgrk @MBilalShafi should I add this to the board for a deeper look into this?

romgrk commented 1 day ago

But in general, I strongly believe that a performance-centric demo should include images + network requests

No, a performance reproduction case should be as minimal as possible. If the performance issue really comes from the datagrid, it should be possible to reproduce it without other code. You should instead use the devtools CPU slowdown to showcase a performance regression instead of adding unrelated elements. The problem with unrelated elements is that they add tons of noise to stack traces, which makes it harder to read what's going on.

I'm not sure if I can reproduce a difference by scrolling manually. I tried to reproduce a difference by automatically scrolling (element.scrollTo({ top: 20_000, behavior: 'smooth' })) but couldn't observe a meaningful difference in scripting time:

7.17 7.18
image image

I'm a bit short on bandwidth to look into this more in details, and using Safari might have an impact that I can't test. @MBilalShafi could you look into this one?

romgrk commented 1 day ago

I believe the performance impact is not limited to scrolling, but also mounting

This is surprising, if it also impacts mounting then the root cause could be very different. @lauri865 Could you get us a reproducible case of that? Investigating a mounting issue is considerably easier than a scrolling issue, due to how difficult it is to simulate scrolling in an easily reproducible manner. Also please don't use the commodity demo dataset, it contains problematic components which add noise to stack traces; prefer forking something like this example.

lauri865 commented 1 day ago

No, a performance reproduction case should be as minimal as possible.

I agree when it comes to honing in on a specific issue. But my point was more about a generic performance testbed – if you make sure that it includes all real-world requirements (=images for one, and maybe slightly more complex images than flags I might add), it's less likely that you let performance regressions slip in. You can get away with just rendering text for much longer of course, while adding image rendering + other things can break things apart completely, and showcase any changes (positive or negative) to the performance much earlier.

@romgrk, the screenshots look rather narrow – did you make it narrow only for the screenshots? It makes a big difference how many columns you render. But Safari is the core issue here in any case. And Chrome admittedly doesn't show it as much in the repro, but it's definitely more pronounced in our real application – but still something we could live with. Safari on the other hand – our users are angry.

lauri865 commented 1 day ago

After removing images, the Chrome repro doesn't really show as significant difference indeed (but to avoid any doubt, Safari is still bad). Which also illustrates my point a bit (if nothing else changed than a package version, yet images affect performance in one version but not the other, it's unlikely caused by the images). Performance issues in the browser happen when you overspend the CPU-time budget you have to render a frame. Very few things when it comes to rendering can only be tested in isolation, as there are often cascading effects all round. If it's a single slow component that's the cause, sure, maybe. Slow CSS for example, may not even register as an issue with few elements, but quickly becomes an issue depending on how many elements there are or how quickly they change in the DOM.

romgrk commented 1 day ago

Performance issues should be tested by setting CPU slowdown, not by adding unrelated components:

image

MBilalShafi commented 1 day ago

I was able to reproduce the reduced performance in Safari, on debugging it did seem to relate to the row spanning hook, but weirdly so this selector turned out to be the culprit. Commenting out everything in the hook but this line doesn't solve the issue.

I didn't dig yet into the virtualization logic, just did an experiment to remove the mentioned selector and it made the performance similar to the v7.17.0 example above for me.

@lauri865 Could you confirm the finding by cloning and running the dev server in this repo. I used the package generated by the experiment PR in it. (Ideally this should've been tested in a generated codesandbox but Codesandbox CI is having some issues atm).

The Safari performance timeline doesn't yield any particularly useful information, I will reiterate this with adding some throttling.

lauri865 commented 1 day ago

@MBilalShafi, it certainly looks better, but when I measure, it seems like the main thread is still doing more work compared to this: https://react-vodudv-zq3jym.stackblitz.io/

Maybe you can double-check how it looks on your end? The repo above:

Screenshot 2024-10-30 at 15 54 21

Stackbiltz v7.17.0:

Screenshot 2024-10-30 at 15 54 35

(I measured only because it still felt more sluggish when scrolling fast)

lauri865 commented 1 day ago

When I do: document.querySelector(".MuiDataGrid-virtualScroller").scrollTo({ top: 20_000, behavior: 'smooth' })

Repo above: most expensive scroll event is 144ms Stackblitz: 96ms

Consistently 40-50% worse on my end.

Edit: On v7.18.0 it clocks in at 175-196ms – so, half-way there.

lauri865 commented 1 day ago

As a complete side-note, there's another issue with Safari that makes the Datagrid look much worse there and has been bugging me for a while – pinned columns lagging behind the content. This exists in all versions as long as I can remember (in v7 at least), and I don't know if it's a webkit bug, due to their implementation of sticky position (creates a separate rendering context) or long-standing performance issues in Safari, but the result does not look nice unfortunately:

Screenshot 2024-10-30 at 16 22 42

I have tried to find remedies for that with various CSS tricks, etc., but to no end so far. It's not too disturbing if it's only text behind the pinned column, but if there's an image or something colorful, it looks broken. Haven't seen it in any other browser fwiw.

MBilalShafi commented 19 hours ago

pinned columns lagging behind the content.

Thanks for sharing this finding, while debugging I was trying to reproduce a similar performance hump on the column pinning because it also coincidently uses the gridRenderContextSelector binded with the store via the useGridSelector.

This finding connects with the exploration I did, it seems the issue has something to do with the virtualization. I'll explore this further to find the exact cause.