jquery / sizzle

A sizzlin' hot selector engine.
https://sizzlejs.com
Other
6.29k stars 951 forks source link

support.getById creates a live NodeList that slows down all DOM mutations on the page #402

Closed ehsan closed 7 years ago

ehsan commented 7 years ago

Hi there,

I encountered an issue while optimizing Firefox for the Speedometer V2 benchmark that stems from this code in the sizzle library. This code, as far as I can tell, exists in the latest versions of the jQuery and EmberJS libraries, and is run from the global scope of these libraries, which means it will affect all pages that include these libraries.

The problem stems from the call to the getElementsByName API. This function returns a live NodeList object, which means that while the object returned by the function is alive, the browser engine needs to perform extra work in order to keep it updated, so that if, for example, a new DOM node is added to the document which matches the name passed to the getElementsByName call, it will magically appear in the NodeList object.

This code isn't assigning the return value of getElementsByName to anything, which makes the created NodeList object have a lifetime that isn't known to the web page. In practice, this object can live around for a while. Since this call is typically made at page load time (due to it being made from the global scope of the jQuery/Ember minified scripts) it can impact the performance of all DOM manipulations that the page performs as it is loading.

I have filed a bug on the Firefox side about making this faster. But there is only so much that we can do about it, I believe, and besides, I think this is something that would impact other browser engines as well. It would really be nice if Sizzle could either avoid doing this altogether, or at least avoid doing this for non-IE browsers.

mgol commented 7 years ago

Thanks for the report.

This code isn't assigning the return value of getElementsByName to anything, which makes the created NodeList object have a lifetime that isn't known to the web page.

Since the collection isn't assigned to anything, just one primitive property is read immediately after creation why doesn't the browser discard the collection and all its associated behavior immediately?

ehsan commented 7 years ago

This code isn't assigning the return value of getElementsByName to anything, which makes the created NodeList object have a lifetime that isn't known to the web page.

Since the collection isn't assigned to anything, just one primitive property is read immediately after creation why doesn't the browser discard the collection and all its associated behavior immediately?

Memory management of DOM objects in browser engines can be quite sophisticated. I can't speak in much detail to what other browser engines do here precisely, but in Gecko many of the DOM objects that we have including the NodeList object in question here can form reference cycles with other DOM and JavaScript objects that live in the application heap. In order for us to be able to correctly reclaim the memory of such objects in a cycle when nothing else outside of the cycle is pointing to them, we have an internal component called the Cycle Collector, which periodically scans our object graph looking for disjoint cycles to free them up.

Our JavaScript JIT is able to treat method calls that do not use the return value in any way to exempt them from entering this complex machinery, but here due to the fact that the .length getter is used on the return value, that heuristic won't work and as far as our JS engine is concerned, this method call's return value is used in the same way as if it were assigned to a variable. :-( Of course, it really isn't assigned to anything, so there isn't any JS references to it after the call and the getter finish running. And on the C++ side, there won't be any (non-cyclic at least -- I haven't fully checked this but I believe what I'm saying is mostly accurate!) C++ references to the object either, but because the object can in theory participate in these types of cross-language reference cycles, it is up to the garbage collector and the cycle collector to run and note that this object really has no reason to live, and to kill it, and those things don't have any predictable running frequency.

And to make things even more fragile, everything above is Gecko's current implementation details and it may change in the future. :-) But in practice pages can't really depend on the underlying objects really dying immediately when it would normally "make sense". This is normally fine since most objects don't really have much of an impact just due to them living besides consuming memory, but these live NodeLists have the unfortunate characteristic of imposing a performance penalty on all DOM modifications to the page for as long as they are living, even if the JS code on the page has no way of accessing them any more.

dmethvin commented 7 years ago

I'd like to know which browsers need this feature test, there's no // Support: above it so perhaps it's so old it's not needed anymore? If I had to guess I'd say it's IE7/8 but I don't have one handy.

@ehsan the dilemma with fixing this is that no matter what we do in the current Sizzle or jQuery 3.x, the older versions will dominate just about forever. If this is just a synthetic benchmark issue that may not matter (Speedometer will look better) but if it's a real perf issue an update to the latest version won't help the web much.

ehsan commented 7 years ago

I'd like to know which browsers need this feature test, there's no // Support: above it so perhaps it's so old it's not needed anymore? If I had to guess I'd say it's IE7/8 but I don't have one handy.

Seems like IE<10?

https://github.com/jquery/sizzle/blob/eabce51ea7360c4507da2eeaec2633378de4ec8d/src/sizzle.js#L624

@ehsan the dilemma with fixing this is that no matter what we do in the current Sizzle or jQuery 3.x, the older versions will dominate just about forever. If this is just a synthetic benchmark issue that may not matter (Speedometer will look better) but if it's a real perf issue an update to the latest version won't help the web much.

Yeah definitely. Because of this, since I discovered this issue we have been thinking about how we can change Gecko to alleviate this issue on our side. (As far as I'm concerned, the impact of this on Speedometer is my least worry, I am much more concerned about the performance implications of it on real websites -- Speedometer is a fairly synthetic benchmark, and the amount of the time that this shows up in the profiles of Speedometer for Firefox isn't all that much; I just mentioned Speedometer to give it the credit for uncovering the issue!)

The good news is that so far we have come up with two fixes that we are going to make to Gecko to improve the handling of this Gecko for older versions of Sizzle and jQuery in the wild which will take a long amount of time to get updated:

I believe that with these two bugs, Gecko should do a fairly good job at making the underlying NodeList object created here inert and not receive DOM mutation notifications (and therefore suck out performance for pages as they mutate the DOM) pretty soon after this code runs. Hopefully both of these fixes will be part of Firefox 56.

mgol commented 7 years ago

@ehsan Thanks for detailed descriptions of Firefox inner workings!

@dmethvin While I agree fixing this in Sizzle won't help the majority of cases, this doesn't mean we shouldn't do it; in this way we could argue against any improvement in the library.

I don't have anything against guarding the test with an additional document.attachEvent test to help with performance in modern browsers. The only thing that we should keep in mind is that we plan to deprecate Sizzle in favor of an engine built-in in jQuery that will rely on querySelectorAll and will handle qSA bugs via selector rewriting instead of manual traversal. @gibson042 do you think it's likely we'll have something like that ready in any foreseeable future?

gibson042 commented 7 years ago

@ehsan Would the NodeList be immediately collected if we assigned the result to a local variable?

docElem.appendChild( el ).id = expando;
var matches = !document.getElementsByName || document.getElementsByName( expando );
return !matches.length;
timmywil commented 7 years ago

@ehsan ping

Krinkle commented 7 years ago
  • Making sure that the NodeList object stops receiving DOM mutation notifications as soon as the cycle collector determines the object has no reason to be alive any more (since we use deferred deletion for these types of objects): Mozilla bug 1376936.
  • Making sure that the NodeList JS objects can be allocated in the JS "nursery" (which is the small JS heap used by the incremental GC for allocating objects that are known to be short-lived). This allows us to quickly GC the object that Sizzle creates just to read the .length from, which will allow the previous bug fix to kick in very quickly after the getElementsByName() call has been performed: Mozilla bug 1376954.

Both Mozilla bug 1376936 and Mozillla bug 1376954 were resolved and will be released in Firefox 56 (stable to ship September 26, 2017).

gibson042 commented 7 years ago

In light of the fixes landing in Firefox, I'm going to close this. But we'd be willing to revisit if any new information comes in (such as a response about the local variable fix or an analogous issue in another browser).