Do you know how nwmatcher is so fast?

domenic commented 10 months ago

I have a small idea for performance work, which may be good or may be bad. I wanted to share it with you in case it was helpful.

Basically, if you take a benchmark where nwmatcher is fast, have you looked at how it does its work? And how that work compares to dom-selector? Can dom-selector switch to nwmatcher's strategy? Doing so might require ugly special cases or fast paths, but maybe it is worth it.

Feel free to close this if it is not helpful.

asamuzaK commented 10 months ago

Does nwmatcher mean nwsapi? I haven't seen the source of nwmatcher. Assuming that you are talking about nwsapi, I will give my personal opinion below.

I think it's fast because it doesn't generate an AST from the given CSS selector string, but directly applies the regular expression to the string. However, as is typical of this line, it relies on negative matches [^xyz] which can potentially lead to reDos. https://github.com/dperini/nwsapi/blob/master/src/nwsapi.js#L81 If you try linting with e.g. eslint-plugin-regexp, I think you will get a lot of warnings?

Another reason for the speed may be that the Function() itself is generated from the string. But it doesn't seem like the input strings are being sanitized, which is also a concern. https://github.com/dperini/nwsapi/blob/master/src/nwsapi.js#L760

domenic commented 10 months ago

Sorry, yes, I think that nwmatcher was the old name.

I think it's fast because it doesn't generate an AST from the given CSS selector string, but directly applies the regular expression to the string.

When looking at profiles, I don't think the AST building showed up as a cost? Or maybe I don't know which parts of dom-selector are the AST building parts. The hot code was in _collectNodes, _collectNthOfType, _matchPseudoClassSelector, etc. The AST building is inside css-tree, right? Nothing in css-tree showed up in the profiles. And parser.js shows up as taking 0.1% of the total time.

However, as is typical of this line, it relies on negative matches [^xyz] which can potentially lead to reDos.

I feel like reDOS is a fake vulnerability made up to generate work for security scanning companies and lint plugin writers... for jsdom at least it is not in our threat model. Complaining that someone might be able to make your code slow via a regexp is not a good reason to use something slower than regexp :)

Another reason for the speed may be that the Function() itself is generated from the string. But it doesn't seem like the input strings are being sanitized, which is also a concern.

I think the input strings are being generated via nwsapi itself, so it should be fine.

Overall, looking at the profiles for nwsapi vs. dom-selector, I am just struck by how much more work dom-selector does and how different the call stacks look. I think it might be educational for you to focus on one case and just look at exactly how nwsapi does it and see if it's possible to adopt that approach to dom-selector.

nwsapi profile:

dom-selector profile:

It looks to me like dom-selector is crawling the DOM tree a lot more, maybe?

asamuzaK commented 10 months ago

And the reason why dom-selector is slow is that:

It loops as many times as number of AST for each element. For example, if you give div.classA#id[attr]:nth-child(2n+1), it will loop max 5 time against the same element (tag, class, id, attr, pseudo-class). I'm planning if it can be done in worker thread, but I haven't been successful yet.
To get the entry point element for querySelectorAll(), it rely on getElementsByXXX, but when converting the HTML collection of getElementsByXXX to an array, the conversion time increases to the number of elements. It's still better than traversing whole DOM tree though.

asamuzaK commented 10 months ago

The AST building is inside css-tree, right? Nothing in css-tree showed up in the profiles. And parser.js shows up as taking 0.1% of the total time.

True, but testing each AST against node takes time...

domenic commented 10 months ago

For example, if you give div.classA#id[attr]:nth-child(2n+1), it will loop max 5 time against the same element (tag, class, id, attr, pseudo-class).

Oh, wow. There must be a better way to do this, right? Like, by parsing out the selector ahead of time, you should be able to figure out the 5 checks you need to apply. Then, you can loop through each node once, and apply all 5 checks at once.

I think this is by far the most promising option.

when converting the HTML collection of getElementsByXXX to an array

Why do you need to convert it to an array? Can't you operate directly on the HTMLCollection? Sure, HTMLCollections are not as nice as arrays, but it doesn't seem worth paying the cost.

asamuzaK commented 10 months ago

I'm converting to array to simplify the iteration in the next step of processing. for (const i of arr) {}.

But:

for(let i = 0; i < l; i++) {
  let item;
  if (Array.isArray(arr)) {
   item = arr[i]
  } else {
    item = arr.item(i)
  }
}

Maybe it would be better to do above, I'll take the bench and compare. Thanks.

asamuzaK commented 10 months ago

Hmm... not a big difference.

Above: converting HTML collection Below: directly using HTML collection

benchmark @asamuzakjp/dom-selector v2.0.3-a.1

patched-jsdom querySelectorAll('.content') x 1,251 ops/sec ±3.26% (80 runs sampled)
patched-jsdom querySelectorAll('.content') x 1,295 ops/sec ±2.70% (84 runs sampled)

--

patched-jsdom querySelectorAll('div.container:not(.box)') x 21,967 ops/sec ±1.14% (89 runs sampled)
patched-jsdom querySelectorAll('div.container:not(.box)') x 21,743 ops/sec ±1.22% (89 runs sampled)

--

patched-jsdom querySelectorAll('.box + .box') x 20,745 ops/sec ±1.12% (87 runs sampled)
patched-jsdom querySelectorAll('.box + .box') x 19,222 ops/sec ±3.04% (86 runs sampled)

--

patched-jsdom querySelectorAll('.box ~ .box') x 8,018 ops/sec ±1.39% (88 runs sampled)
patched-jsdom querySelectorAll('.box ~ .box') x 8,181 ops/sec ±1.21% (92 runs sampled)

--

patched-jsdom querySelectorAll('.box > .block') x 2,523 ops/sec ±3.14% (86 runs sampled)
patched-jsdom querySelectorAll('.box > .block') x 2,724 ops/sec ±1.72% (90 runs sampled)

--

patched-jsdom querySelectorAll('.box .content') x 303 ops/sec ±4.98% (80 runs sampled)
patched-jsdom querySelectorAll('.box .content') x 332 ops/sec ±2.41% (81 runs sampled)

--

patched-jsdom querySelectorAll('.box:first-child ~ .box:nth-of-type(4n+1) + .box .block.inner > .content') x 557 ops/sec ±2.84% (84 runs sampled)
patched-jsdom querySelectorAll('.box:first-child ~ .box:nth-of-type(4n+1) + .box .block.inner > .content') x 576 ops/sec ±4.39% (80 runs sampled)

--

patched-jsdom querySelectorAll('.box:first-child ~ .box:nth-of-type(4n+1) + .box .block.inner:has(>.content)') x 379 ops/sec ±2.95% (81 runs sampled)
patched-jsdom querySelectorAll('.box:first-child ~ .box:nth-of-type(4n+1) + .box .block.inner:has(>.content)') x 430 ops/sec ±2.14% (83 runs sampled)

benchmark sizzle-speed @asamuzakjp/dom-selector v2.0.3-a.1

patched-jsdom querySelectorAll x 0.56 ops/sec ±3.35% (6 runs sampled)
patched-jsdom querySelectorAll x 0.56 ops/sec ±2.70% (6 runs sampled)

--

patched-jsdom querySelector x 0.65 ops/sec ±1.81% (6 runs sampled)
patched-jsdom querySelector x 0.66 ops/sec ±2.80% (6 runs sampled)

asamuzaK commented 10 months ago

How about use nwsapi only if selectors are supported and use dom-selector for the rest? Not tested, but maybe something like below.

// Filter selectors that nwsapi does not support or cannot handle correctly:
// namespaced selector e.g. ns|E, pseudo-element e.g. ::slotted,
// :focus, :indeterminate, :placeholder-shown, :scope,
// :dir(), :lang(), :has(), :is(), :where(),
// :not(complex selector), :not(:pseudo-class), :not(attr)
// :nth-child(an+b of selector), :nth-last-child(an+b of selector),
// :host, :host(), :host-context(),
// [attr i], [attr s]
if (!/\||:(?::|focus|indeterminate|placeholder|scope|dir|lang|has|is|where|not\(\s*(?:\*|[\.#]?[\da-z-_]+(?:[\.#][\da-z-_]+)*)(?:\s*,\s*(?:\*|[\.#]?[\da-z-_]+(?:[\.#][\da-z-_]+)*))*\s*\)|nth-(?:last-)?child\(.+\sof.+\)|host)|\s[is]\s*\]/i.test(selector)) {
  // nwsapi
} else {
  // dom-selector
};

domenic commented 10 months ago

I'm not very excited about that approach; it seems likely fragile where developers might end up switching between the engines non-transparently. I'm hopeful we can have a selector engine in jsdom which uses a fast-by-design architecture.

Did you make any progress on the approach I suggested above? Or is it not good?

Oh, wow. There must be a better way to do this, right? Like, by parsing out the selector ahead of time, you should be able to figure out the 5 checks you need to apply. Then, you can loop through each node once, and apply all 5 checks at once.

asamuzaK commented 10 months ago

I do not think it's fragile. I have submitted a new pull request. https://github.com/jsdom/jsdom/pull/3678 Performance remains high for selectors that nwsapi supports. Frankly speaking, I don't think nwsapi can support Selector level 4, so the scope of using nwsapi will not expand any further.

Did you make any progress on the approach I suggested above? Or is it not good?

It's running in a loop, but it's set to fail fast, so as much optimization as possible has already been done.

domenic commented 10 months ago

It's possible nwsapi cannot support selectors level 4. But it's clear that it's possible to create fast software that does, since we have 3 browser engines all implementing the spec. So I am not willing to incorporate software with fundamentally slow architecture, when we know that fast is possible.

asamuzaK commented 10 months ago

Got it. Unfortunately, I don't think dom-selector can be a help then.

asamuzaK commented 10 months ago

I would like to leave a feedback.

The current issue that jsdom wants to solve is:

Extended support for selector level 4

Right?

And what you can use now as tools are:

nwsapi: very fast, but only supports selector level 3
dom-selector: slow, but can handle selector level 4 so so.

There are currently no tool that can support selector level 4 with high performance (at least, none have been found). If so, wouldn't it make more sense to use the tools currently available than to wait for what you don't have and leave the current issue unsolved? And once "a tool that support selector level 4 with very fast performance" comes out, wouldn't it be enough to abandon nwsapi and dom-selector at that time?

I don't care whether dom-selector will be included or not. I just wish jsdom to be more convenient because I rely on it heavily.

domenic commented 9 months ago

The current issue that jsdom wants to solve is:

Extended support for selector level 4

Right?

This is not really accurate.

jsdom is a volunteer project, with many competing priorities. Given limited maintainer bandwidth, one of the main goals is minimizing maintenance burden, via solid architecture. A big part of this is minimizing dependencies. Having two selector engines instead of one is problematic from this point of view. Having a selector engine with a bad-performance-by-design architecture, in particular, is going to be a significant burden going forward for us. (As we've seen from the time spent churning on jsdom 23.2.0 to 24.0.0.) This is true even if that engine only kicks in sometimes. Actually, that's especially true: it would mean the maintainers need to know exactly what causes the different engines to kick in, e.g. to triage whether a given bug report needs to be reported against selector engine 1 vs. 2, or maybe versus the logic that determines which selector engine to dispatch to.

We'd rather maintain tight control over our dependencies, ensuring that they're all designed to work well with jsdom, including from a performance perspective. Maintainability is more important than features.

That's why I opened this issue, to encourage you to adopt some of the techniques nwsapi (or browsers themselves) use. I suspect that if you look at exactly how nwsapi accomplishes its speed, e.g. by tracing through all code paths taken for the simpler selectors, you can find ways to incorporate that into dom-selector's architecture. This seems especially likely to me since per the performance flamegraphs discussed above, the parsing is not a bottleneck; it's all about faster DOM retrieval and traversal techniques.

asamuzaK commented 9 months ago

I don't agree with some things you say, but I understand.

As an alternative, I decided to include nwsapi in dom-selector. Performance should be maintained as before in most cases, because nwsapi handles what it can handle. See Performance However, please note that the performance issue is not completely resolved.

It's up to you whether you want to add dom-selector again to jsdom. If you say a go, I will send a PR.

asamuzaK commented 9 months ago

nwsapi is not good at handling child and/or descendant selectors such as .foo > .bar and .foo .bar in querySelector() and querySelectorAll(). By handling that part with dom-selector and complementing each other, the performance improved. https://github.com/asamuzaK/domSelector/actions/runs/7898913102/job/21557270583#step:5:76 https://github.com/asamuzaK/domSelector/actions/runs/7898913102/job/21557270583#step:5:100

It is also noticeable in the sizzle benchmark. https://github.com/asamuzaK/domSelector/actions/runs/7898913102/job/21557270583#step:6:10 https://github.com/asamuzaK/domSelector/actions/runs/7898913102/job/21557270583#step:6:16

However, the performance will be worse than before for selectors with nested logical combinators, such as :not(:is(.foo, .bar)). nwsapi has bugs in :is(), :where() and :not(). In such a situation, it is almost impossible for me to find a way for nwsapi to handle it correctly using only regular expressions or other convenient methods.

Fixed above.

asamuzaK / domSelector

Do you know how nwmatcher is so fast? #38

benchmark @asamuzakjp/dom-selector v2.0.3-a.1

benchmark sizzle-speed @asamuzakjp/dom-selector v2.0.3-a.1