Optimize menu performance

afshin commented 2 years ago

Problem

In https://github.com/jupyterlab/jupyterlab/issues/9757#issuecomment-999915671 @krassowski writes:

Analysis:

I am currently focusing on the delay visible when moving mouse out of the menu and the over again (this is universal problem when the test > notebook is open and affects all menus, including menu bar and context menu):

After lower than expected improvements in https://github.com/jupyterlab/lumino/pull/273 and failure to eliminate this even when commenting > out handlers for mousemove, mouseenter and mouseleave I went a step further and removed event listeners from lumino's Menu for > mousemove, mouseenter and mouseleave altogether. To my surprise the issue did not go away - it was still present when looking in the > profiler; we don't see any layout shifts but a lot of long Recalculate Style tasks; those are preceded by Schedule Style Recalculation > which helps to narrow down where it is coming from.

The profiler does not show why those come to be - just that they follow browser-native HitTest (and precede the actual event emission). > The raw JSON profile file hints that the HitTest goes to the notebook cell (not unexpectedly, this is where my cursor lands after leaving > the menu) and also mentions intersection observer (IntersectionObserverController::computeIntersections) kicking in from time to time (due > to lazy notebook rendering I suppose):

The following experiment suggests that this might be triggered by HitTest when mouse leaves the menu (and then HitTest for some weird > reason causes style recalculation on Chromium/Blink):

on top of initial changes in lumino#273 add a <div class="test"> with > z-index=z-index of menu - 1

make that div position:absolute and cover entire viewport (height:100%; widht: 100%) so that it occludes the rest of the DOM (except > for the menu)

optionally add background: grey; opacity: 0.5 just to see the div

see that moving mouse out of the menu and back again does not cause the delay anymore

And this time there is no style recalculation anymore nor scheduling of it:

Now the behaviour of Hit test is standardized by W3C: https://www.w3.org/wiki/Hit_Testing, but it is not well understood topic when it > comes to performance optimization. There is one [question on SO](https://stackoverflow.com/questions/41830529/> optimizing-native-hit-testing-of-dom-elements-chrome) which concerns the Hit Test taking a lot of time itself, but in our case the problem > is not the presence of Hit Test but that it sometimes schedules style recalculation.

So why a Hit Test might schedule a style recalculation even though nothing changed in the DOM? Why does it not happen every time but just > some times? Well :hover pseudo selector might be the answer. We can test that by adding:
<style>.test:hover{background:red!important}</style>
This indeed restores the Schedule Style Recalculation and Recalculate Style tasks (although this time style recalculation is super fast > because only our test div is affected):

While I only tested on Chromium-based browsers, I would not be surprised if other engines are affected too.

Potential solution

In a further comment, @krassowski proposes

Proposed solution

Could we add a <div> occluding background DOM when menu is open? The div would be effectively transparent (this could be implemented e.g. > with opacity: 0.01) and clicking on it would detach it from DOM and close the menu. This means that when user has a menu open and clicks > outside, e.g. on a link, it would not open (unless they click again after menu gets closed).

The UX would be consistent with how native menus (both context menus and application-level menus) work in Chrome (on Ubuntu), in GNOME and I > suspect in many other applications/environments: neither hover nor click works on background elements when menu is open. Firefox does a bit > of both: clicks do not work on background elements when context menu is open, but hover styles do; clicks do work in Firefox though when the > topbar menu is open (as do hover styles).

And it be super fast regardless if the number of nodes in the DOM is 6000 or 600000.

Edit: just in case if someone was wondering, setting pointer-events: none; on the jp-LabShell does not solve the issue.

3coins commented 2 years ago

@krassowski This looks interesting to me, I can try your proposed approach, let me know if you are already working on this issue.

afshin commented 2 years ago

I checked with @krassowski and he said he wasn't working on this. I'll assign it to you. Please feel free to hand it off back to me or someone else if you wish. I initially thought I'd make an attempt at it, but I'd be happier for you to go for it!

afshin commented 2 years ago

@3coins also please take a look at the original thread where this issue comes from because it has other approaches that were suggested as well: https://github.com/jupyterlab/jupyterlab/issues/9757#issuecomment-999915671

krassowski commented 2 years ago

More fundamentally we are interested in reducing the "Recalculate Style" time for JupyterLab when a lot of DOM nodes are present. The virtual windowing (https://github.com/jupyterlab/jupyterlab/pull/12554) will help a lot by reducing the amount of nodes in the DOM.

An exploration that I would suggest for Lumino 2.0 is to see if we can narrow down the cause of style recalculation taking such a long time. This Google Dev article says that the cost is roughly 50:50 between selector matching and style computation. Is this because of the styles in JupyterLab, or is it because any styles in Lumino? If there are any styles in Lumino which need changing (e.g. to follow the BEM model), we should strive to adopt those in Lumino 2.0 as those (most likely) will be breaking changes.

I don't know how to profile what contributes to "Recalculate Style" cost in Chrome. The last time I looked there was no support for it in the Chrome profiler. I know that the Firefox Nightly has more granularity and it could help here. If everything else fails, we could just disable styles one-by-one and see what is the impact on performance.

krassowski commented 2 years ago

Another possibility is to reconsider why is the "Recalculate Style" triggered. It is only triggered when there is a DOM mutation which could cause style invalidation (or browser wrongly thinks it could cause style invalidation). Currently we are attaching menu to the body and then removing it which is a blatant DOM mutation.

Would it make a difference if we always had a menu node in the body instead of adding and removing it? Maybe not because we still would be modifying its position, but maybe yes because the browser's heuristic would recognise that it is an innocuous change?

krassowski commented 2 years ago

Would it make a difference if we always had a menu node in the body instead of adding and removing it? Maybe not because we still would be modifying its position, but maybe yes because the browser's heuristic would recognise that it is an innocuous change?

Of note, we already do that with completer in core, and completer does not have this problem. Maybe worth more exploration than I initially thought?

3coins commented 2 years ago

Synced up with @krassowski last week about this issue to get some more clarity. I spent some time last week replicating the issue locally on JL3, visually I wasn't able to see any perceived latency with the attached notebook open. I also attempted to investigate the performance within Chrome and Firefox, and did notice some "long tasks" pointing to the "recalculate style". My next step would be to download Firefox nightly and see if I can find any new information regarding "Recalculate Style". If this does not work, I am going to look into setting some automation to remove styles one-by-one and measure performance.

krassowski commented 2 years ago

@3coins are you still working on this one? If not, I could take it up. The suggested workaround with a transparent div would also help with an issue I noticed of the menu not getting hidden when clicking on some widgets (notably tabbar, strangely it only happens 3/4 times):

menu-does-not-hide-sometimes

3coins commented 2 years ago

@krassowski Plz feel free to take over, I have been busy with other things. Thanks.

krassowski commented 2 years ago

The problem with hover boundary between menu and other widgets was narrowed down to an issue in Chromium style invalidation strategy and and worked around downstream in https://github.com/jupyterlab/jupyterlab/pull/13159. This helped by reducing style recalculation cost by orders of magnitude in Chromium.

Another problem we can tackle (highlighted in the original issue) is slow menu switching. While it is much better with the patch, it still performs hit-testing twice (and style recalculation thrice!). This affects all browsers, but after applying the patch from 13159 it is only a noticeable problem (0.5s) on a page with 300k nodes (100k of each: span, svg and div) and 6x slowdown.

Screenshot from 2022-10-04 00-03-03

A likely solution would be to either to:

re-use the same DOM node for the top-level Menu in menu-bars (re-populating it as needed) to avoid the detach-attach sequence, or
make sure that we do not switch back-and-forth between querying and modifying DOM

in both cases it means creating a special method for switching the menu in the menu bar, rather than relying on existing sequence which is:

On mouse move reaching new menu bar entry (hit test)

set focus in the menubar using activeIndex (DOM modification)
close existing menu (DOM modification)
get position for new menu from menubar item via getBoundingClientRect (DOM query)
open new menu:
- populate new widget
- attach to DOM (DOM modification)
- calculate menu size via getBoundingClientRect
- if the menu position + size suggests it would not be visible, move it so it is visible

The ideal order of operations to maximise performance is:

get position for new menu from menubar item (DOM query)
estimate new menu size using styles of displayed menu (DOM query)
set focus in the menubar using activeIndex (DOM modification)
close existing menu via getBoundingClientRect (DOM modification)
open new menu:
- if the menu position + estimated size suggests it would not be visible, move it so that it is visible (DOM modification)
- attach to DOM (DOM modification)

This way we would avoid spurious the reflows.

krassowski commented 2 years ago

I also experimented with moving all menu nodes from document.body to a single div (with css contain) or into shadow DOM (since that would be more practical than moving other widgets to shadow DOM, as extensions do not generally provide custom styles for menu) but the results so far were inconclusive.

Some blockers are: - `contain: strict` requires us to always position the div at the end of `document.body` which sounds reasonable until you realise that every time completer of other hover box is attached in JupyterLab it lands at the end of `document.body` - attaching styles is problematic and will require some webpack magic beyond the raw text loader I used since it conflicts with the css-loader.

If someone wants to test it out or carry on, see https://github.com/krassowski/lumino/commit/22367407e146fece16244c795d406f77a9fa6352

krassowski commented 2 years ago

I performed initial analysis of the remaining contributors to the cost of style computation and layout, in plain JupyterLab (latest development version - 4.0.0a29), on Chromium with a notebook containing 300k nodes (100k svg, div and span):

opening menu from the menubar can benefit from improvement by disabling CSS stylesheets:
- tabbar.css (lumino): 85.2% faster
- menubar.css (lumino): 4% faster (but of course we need that one)
- menus.css (JupyterLab): 3.9% faster
opening menu from the menubar can benefit from improvement by disabling individual CSS rules:
- .lm-TabBar: 78.3% faster
- .lm-MenuBar-content 3.8% faster
- .lm-MenuBar.lm-mod-active .lm-MenuBar-item.lm-mod-active 3.3% faster
- .fa-ul > li 1.1% faster
  - font-awesome stylesheet includes 1500 rules, roughly 40% of all CSS rules and is not used by JupyterLab itself
  - I proposed removing it before but it did not have strong support
  - preliminary results from other actions also indicate ~1% slowdown due to the entire stylesheet, but it is probably not bad enough to warrant breaking old notebooks by removing it;
  - we could try only enabling it if a notebook includes a class with fa- prefix.
switching menu in the menubar can benefit from improvement by disabling CSS stylesheets:
- tabbar.css (lumino) 82.4% faster
- menus.css (JupyterLab): 4.1% faster
- menubar.css (lumino): 3.3% faster

The most promising styles to act on are:

.lm-TabBar {
  display: flex;
  -webkit-user-select: none;
  -moz-user-select: none;
  -ms-user-select: none;
  user-select: none;
}

and

.lm-MenuBar-content {
    margin: 0;
    padding: 0;
    display: flex;
    flex-direction: row;
    list-style-type: none;
}

The first one may corresponds to another performance issue in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=605419, see this comment from 2017:

Applying user-select:none on a top element having much children and clicking on it can cause performance issue because Chrome search 'clickable' element traversing the tree.

If it is significant problem, please submit as new issue (maybe titled "clicking on user-select:none is slow"?) since this is standerdize tracking issue.

however, I could not find such an issue so probably it was not acted on.

Interestingly both also include display: flex which supposedly is faster than old layouts but I also know that lists are slow in Chromium so maybe this is not the case here.

fcollonval commented 2 years ago

Thanks a lot for the analysis and sharing the results.

jupyterlab / lumino

Optimize menu performance #329

Problem

Potential solution

Proposed solution