jquery / sizzle

A sizzlin' hot selector engine.
https://sizzlejs.com
Other
6.29k stars 951 forks source link

Sibling query with context doesn't work correctly if there is multiple elements with same ID on page #480

Closed OlegWock closed 2 years ago

OlegWock commented 2 years ago

I'm writing browser extension which pulls data from active page (Amazon particularly) and today I stumbled upon this issue. When I'm trying to make query with context, it returns not only results from context, but rather from parent node of context. This seems to be because page contains multiple elements with same ID, because I use sibling selector and because Sizzle tries to do optimizations based on these IDs: https://github.com/jquery/sizzle/blob/main/src/sizzle.js#L322-L356

I'm attaching screenshot to demonstrate it Screen Shot 2021-12-07 at 17 38 37

Here, $el is one of 30 elements with id gridItemRoot and when I'm trying to make query against its children I get 30 results (one from each element with gridItemRoot id) instead of one.

I'm not really sure if it's really a bug. We all know it isn't a good practice to include multiple elements with same ID on page. But in the end, browser itself handles this well (seen on screenshot), so to me it looks rather than bug. But I don't fully understand whole trick with this optimizations mentioned earlier, especially this line https://github.com/jquery/sizzle/blob/main/src/sizzle.js#L333 I'm sure it's made intentionally for some purpose, but with my current knowledge I can't say what purpose. So please let me know if you really consider it a bug or it's rather Amazon's fault for making bad layout 😅

If you believe it's bug I'd be happy to make pull request

mgol commented 2 years ago

Thanks for the report.

Valid HTML documents require IDs to be unique. I understand there are various APIs that work with bad structure, like querySelectorAll, but jQuery has assumed this rule is followed from the beginning.

Querying by IDs is a very common thing and getElementById is faster than querySelectorAll so there was a conscious decision to choose the faster option and not support elements with repeated IDs.

At this point of life of Sizzle, it's not likely anything will change here. You could argue for a change in jQuery 4.0 which will not include Sizzle but I'd be surprised to not see speed difference since getElementById can just stop scanning the document when it hits a matching ID. Not to mention that it doesn't need to parse the provided string which querySelectorAll needs to do.

We would definitely need solid benchmarks before changing anything here.

OlegWock commented 2 years ago

@mgol I see. Since it's by design I have no intentions to complain 🙂

Thank you for detailed reply