dperini / nwsapi

Fast CSS Selectors API Engine
MIT License
105 stars 36 forks source link

The `:matches(…)` pseudo‑class has been renamed to `:is(…)` #28

Closed ExE-Boss closed 2 years ago

ExE-Boss commented 4 years ago

The only difference is that :is(…) follows @media query‑like invalidation:

dperini commented 4 years ago

@ExE-Boss thank you for your time and suggestion, I always appreciate contributions. We are fully aware of these API name changes in current drafts but they are still drafts.

I have been discussing this API name change with the maintainers of jsdom and it was decided to let this as is, thus follow what is in real browser as this is the main objective of jsdom (which makes up 90% usage of the nwsapi project).

I do like the new Selectors API name and in regard to CSS I like the new semantic flow that :is() gives coupled with the :not() selector and I am prepared for the change, just waiting for things to settle. In nwsapi code It is just a matter of adding is() as the default API name and change matches() to be an alias kept for backward compatibility.

As for other reasons to wait implement this change I can say that none of the browsers I can test has implemented this new API name change yet. The name of this method has remained unchanged so every browser implementation we know is still using matches(...).

For the above reasons I do not agree about this being an issue in the current code.

NOTE: a new version with code cleanup and some extra speed improvements is on its way. now.

ExE-Boss commented 4 years ago

In nwsapi code It is just a matter of adding is() as the default API name and change matches() to be an alias kept for backward compatibility.

If you only do that, then :is(foo, :maybe-unsupported) will throw an error instead of matching all <foo> elements. (I know, because I already tried that)

I don’t actually know where to insert the code for the better @media‑style error handling within nwsapi’s implementation.

dperini commented 4 years ago

@ExE-Boss I don't know yet if the final "Recommendations" will include those rules and, given the history of tentative changes that were dropped in the past I am a bit reluctant to start implementing in advance. Even if I did that the change you suggest, it will probably be against jsdom rules of not being implemented in browser. I would have to ask @domenic and @Zirro to confirm or negate this position.

It seems to me you are saying that:

document.documentElement.matches('html', ':maybe-unsupported');

would return 'false' ... instead in my tests it does return 'true' ... if I did understand what you are saying. Please correct me if I am wrong or if I did misunderstand you.

ExE-Boss commented 4 years ago

That would return true, because arguments is:

[
    "html", // only the first argument contains a selector
    ":maybe-unsupported"
]

What you probably meant was:

document.documentElement.matches('html, :maybe-unsupported');

Which will throw throws a SyntaxError if :maybe‑unsupported isn’t supported.


This should however return true in engines that support :is(…):

document.documentElement.matches(':is(html, :maybe-unsupported)');
dperini commented 4 years ago

@ExE-Boss my bad ... You are correct matches(...) only accepts one selector string. So when changing the name of the API I should have a look to that difference.

Unfortunately I don't have a platform where I can test this behaviour yet. If you happen to have one to suggest I could start exercising some adjustment.

Thank you for the head up on this W3C functionality change of this API method.

ExE-Boss commented 4 years ago

The functionality of the API method isn’t changing.

This is about the :is(…) pseudo‑class.

danburzo commented 4 years ago

Firefox 78 has landed support for the :is() and :where() pseudo-classes, and there are a few (1, 2, 3) web platform tests for them. The upcoming Safari 14 presumably enable them by default, and Chrome is still working on them. Would it be a good time to discuss their implementation in nwsapi?

(I found this thread after logging an issue on jsdom about the :matches() pseudo-class.)

dperini commented 4 years ago

@danburzo yes, changing that naming was the plan for "nwsapi" too. Was just waiting for browsers to show up their implementation.

danburzo commented 4 years ago

Awesome! Please let me know if there's anything I can help with.

dperini commented 4 years ago

@danburzo yes ... thank you for offering help. It is appreciated. Testing the changes I will commit and executing tests would be great. Also checking the problem with the "matches() > is()" renaming outlined by @ExE-Boss would help. I already made this change and the suggested "nodeName > localName" replacement as suggested in issue #37 I believe tomorrow I will be pushing these two commits.

danburzo commented 4 years ago

I'll be happy to test the changes.

In regards to the problem raised by @ExE-Boss, if we're referring to the :is() and :where() pseudo-classes being more permissive with invalid selectors, unfortunately it seems Firefox has shipped without it.

In fact, in Firefox this sequence:

document.querySelector('div:is(div, :bogus)');

...throws an error instead of ignoring :bogus. I'm not sure where the proposal stands in light of this, but I'm going to ask on the CSSWG thread directly.

dperini commented 4 years ago

@danburzo @ExE-Boss I pushed a commit for this change. I will appreciate a review on it's functionality compared to that of recent Firefox as it is the only browser that already made the change. In this case the objective is mimicking the same functionality found on current browsers implementation rather than following the specification to the letter. As soon as other browsers implement this, we can level up our implementation. Thank you.

ExE-Boss commented 4 years ago

@dperini

I pushed a commit for this change.

No, you didn’t, as the :is(…) pseudo‑class is still unrecognised: https://github.com/dperini/nwsapi/blob/44a937dbeb243459f13ae7d36f1ed1413884b2c4/src/nwsapi.js#L79https://github.com/dperini/nwsapi/blob/44a937dbeb243459f13ae7d36f1ed1413884b2c4/src/nwsapi.js#L1006-L1008

danburzo commented 4 years ago

Hi @dperini, I've poked around the code a bit and, roughly, this is the desired behavior, based on the code prior to your commit:

diff --git a/src/nwsapi.js b/src/nwsapi.js
index 9497b72..6d97721 100644
--- a/src/nwsapi.js
+++ b/src/nwsapi.js
@@ -76,7 +76,7 @@
   GROUPS = {
     // pseudo-classes requiring parameters
     linguistic: '(dir|lang)\\x28\\s?([-\\w]{2,})\\s?(?:\\x29|$)',
-    logicalsel: '(matches|not)\\x28\\s?([^()]*|[^\\x28]*\\x28[^\\x29]*\\x29)\\s?(?:\\x29|$)',
+    logicalsel: '(is|where|not)\\x28\\s?([^()]*|[^\\x28]*\\x28[^\\x29]*\\x29)\\s?(?:\\x29|$)',
     treestruct: '(nth(?:-last)?(?:-child|-of-type))(?:\\x28\\s?(even|odd|(?:[-+]?\\d*)(?:n\\s?[-+]?\\s?\\d*)?)\\s?(?:\\x29|$))',
     // pseudo-classes not requiring parameters
     locationpc: '(link|visited|target)\\b',
@@ -1004,34 +1004,18 @@
             }

             // *** logical combination pseudo-classes
-            // :matches( s1, [ s2, ... ]), :not( s1, [ s2, ... ])
+            // :is( s1, [ s2, ... ]), :not( s1, [ s2, ... ])
             else if ((match = selector.match(Patterns.logicalsel))) {
               match[1] = match[1].toLowerCase();
               switch (match[1]) {
-                case 'matches':
-                  if (not === true || nested === true) {
-                    emit(':matches() pseudo-class cannot be nested');
-                  }
-                  nested = true;
+                case 'is':
+                case 'where':
                   expr = match[2].replace(REX.CommaGroup, ',').replace(REX.TrimSpaces, '');
-                  // check nested compound selectors s1, s2
-                  expr = match[2].match(REX.SplitGroup);
-                  for (i = 0, l = expr.length; l > i; ++i) {
-                    expr[i] = expr[i].replace(REX.TrimSpaces, '');
-                    source = 'if(s.match("' + expr[i].replace(/\x22/g, '\\"') + '",e)){' + source + '}';
-                  }
+                  source = 'if(s.match("' + expr.replace(/\x22/g, '\\"') + '",e)){' + source + '}';
                   break;
                 case 'not':
-                  if (not === true || nested === true) {
-                    emit(':not() pseudo-class cannot be nested');
-                  }
                   expr = match[2].replace(REX.CommaGroup, ',').replace(REX.TrimSpaces, '');
-                  // check nested compound selectors s1, s2
-                  expr = match[2].match(REX.SplitGroup);
-                  for (i = 0, l = expr.length; l > i; ++i) {
-                    expr[i] = expr[i].replace(REX.TrimSpaces, '');
-                    source = compileSelector(expr[i], source, false, callback, true);
-                  }
+                  source = 'if(!s.match("' + expr.replace(/\x22/g, '\\"') + '",e)){' + source + '}';
                   break;
                 default:
                   emit('\'' + selector_string + '\'' + qsInvalid);

That is:

I'm not sure if these changes are all that's needed, since this is my first contact with the codebase, but in some quick tests it seems to be working correctly.

Please note that it's only the CSS pseudo-class that changed from :match() to :is(). The DOM method Element.prototype.matches remains the same.

dperini commented 4 years ago

@danburzo @ExE-Boss my bad, sorry ... there was a big misunderstanding on my side. I thought the API name was changed from .macthes() to .is() while I see now only the CSS pseudo-class names have changed. I removed the wrong commit and I will prepare for the right one.

dperini commented 4 years ago

@danburzo about nesting :not() pseudo-classes ....

(was there a specific reason why it was prohibited in the existing codebase?);

Yes the reason is here and possibly still valid: https://developer.mozilla.org/en-US/docs/Web/CSS/:not#Description

dperini commented 4 years ago

@danburzo yes, your changes to the code seems to be correct. I believe we should leave ':matches' as an alias for backward compatibility so the RegExp would be:

logicalsel: '(is|where|matches|not)\\x28\\s?([^()]*|[^\\x28]*\\x28[^\\x29]*\\x29)\\s?(?:\\x29|$)',

and similarly, later:

case 'is':
case 'where':
case 'matches':

I don't think we should remove :matches completely. Should we ?

danburzo commented 4 years ago

The Selectors Level 4 specification has relaxed some of these requirements:

19.4. Changes since the 23 August 2012 Working Draft Significant changes since the 23 August 2012 Working Draft include:

  • Released some restrictions on :matches() and :not().

That means :is(), :where() and :not() pseudo-classes can be combined at will.

The behavior in current browsers:

If it doesn't affect any other part of the code, I would say not impose any limitation on nested :not()s, since that's the intention in the Level 4 spec and it makes the implementation simpler.

danburzo commented 4 years ago

I don't think we should remove :matches completely. Should we ?

Agreed, it makes sense to keep :matches around.

domenic commented 4 years ago

It would be good to remove :matches(); jsdom at least would prefer to be spec-compliant, and :matches() does not appear in any spec.

dperini commented 4 years ago

@domenic I could remove it, but what about Chrome, Safari, IE et all ? From what I understand ':matches()' pseudo-class is still the only one valid there ! Or am I missing something ?

danburzo commented 4 years ago

According to caniuse:

So, on second thought, there doesn't appear there was much backwards compatibility we're forfeiting here. Furthermore, due to how :matches() was implemented, it did not work correctly for a list of selectors (:matches(h1, h2) would try to find elements that are both h1 and h2).

ExE-Boss commented 4 years ago

@domenic

It would be good to remove :matches(); jsdom at least would prefer to be spec-compliant, and :matches() does not appear in any spec.

According to Selectors 4, :matches() is a valid legacy alias:

As previous drafts of this specification used the name :matches() for this pseudo-class, UAs may additionally implement this obsolete name as an alias for :is() if needed for backwards-compatibility.

dperini commented 4 years ago

@danburzo @ExE-Boss ... second try ;-) for now I removed the nesting constraints only from the :is()//:where()/:matches() and kept them for the :not() pseudo-class. I have done some minimal testing and so far it looks the patch is working as expected, however I know there will be adjustments.

Here are some extra test that I will be including in my testing environment related to these pseudo-classes:

document.documentElement.matches(':is(html, :first-child)') == true document.documentElement.matches(':is(html, :last-child)') == true document.documentElement.matches(':is(html, :not(:first-child))') == false document.documentElement.matches(':is(html, :not(:last-child))') == false

document.head.matches(':is(head, :first-child)') == true document.head.matches(':is(head, :last-child)') == false document.head.matches(':is(head, :not(:first-child))') == false document.head.matches(':is(head, :not(:last-child))') == true

document.body.matches(':is(body, :first-child)') == false document.body.matches(':is(body, :last-child)') == true document.body.matches(':is(body, :not(:first-child))') == true document.body.matches(':is(body, :not(:last-child))') == false

Thank you for reviewing and help me apply these changes.

ExE-Boss commented 4 years ago

All the following cases are supposed to return true, since the :is(…) and :where(…) pseudo‑classes act like a logical OR, just like the Selector List , separator:

document.documentElement.matches(':is(html, :first-child)') === true;
document.documentElement.matches(':is(html, :last-child)') === true;
document.documentElement.matches(':is(html, :not(:first-child))') === true;
document.documentElement.matches(':is(html, :not(:last-child))') === true;

document.head.matches(':is(head, :first-child)') === true;
document.head.matches(':is(head, :last-child)') === true;
document.head.matches(':is(head, :not(:first-child))') === true;
document.head.matches(':is(head, :not(:last-child))') === true;

document.body.matches(':is(body, :first-child)') === true;
document.body.matches(':is(body, :last-child)') === true;
document.body.matches(':is(body, :not(:first-child))') === true;
document.body.matches(':is(body, :not(:last-child))') === true;
dperini commented 4 years ago

@ExE-Boss and with that you knocked me out ... more confused than ever, maybe that could be one reason browsers did delay this implementation ?

I haven't changed any behaviour in nwsapi, just renamed pseudo-classes, this might mean the following:

1) the functionality of :matches() was wrong even before the naming changes, but did not affect the few using it 2) there is no testing for these pseudo-classes in wpt or jsdom that I am aware of, the existing tests are all passing

I need to think about this and read more documents about the benefits that led specs authors to include such functionality. In the mean time if you have "code" changes suggestions please let me know.

danburzo commented 4 years ago

Yes, the old implementation of matches was wrong:

Furthermore, due to how :matches() was implemented, it did not work correctly for a list of selectors (:matches(h1, h2) would try to find elements that are both h1 and h2)

Please see the suggested changes I've included above, they seem to work (translating is / where to s.match()).

the :is(…) and :where(…) pseudo‑classes act like a logical OR, just like the Selector List , separator

As an example, the :is(h1, h2) selector matches elements that are either h1 or h2, identically to the h1, h2 selector.

dperini commented 4 years ago

@danburzo thank you for the fix, explaining it and bear with my misunderstandings. Same said for @ExE-Boss and his patience.

There would be a clean up for the obsolete 'D / not' variables and a few adjustments in the tests for the changes in the nested behaviour.

danburzo commented 4 years ago

@dperini Looks good! What is the preferred way to create and run new tests for these changes?

I've also identified Patterns.logicalsel:

/^:(?:(is|where|matches|not)\x28\s?([^()]*|[^\x28]*\x28[^\x29]*\x29)\s?(?:\x29|$))(.*)/i

Does not properly extract the parts out of :where(:is(:not(ul))) li:

[ ":where(:is(:not(ul))) li ", "where", ":is(:not(ul)", ") li " ]

...which causes the selector to be interpreted as invalid. I think the regex is not geared towards this kind of nesting, but I'll need to wrap my head around the regex to be able to suggest an alternative.

danburzo commented 4 years ago

My guess for matching logical combinators would be:

/^:(?:(is|where|matches|not)\x28(.*)\x29)(.*)/i

In raiload diagram form:

image(1)

In prose form:

String starts with : + one of is, where, matches, or not, with anything between parentheses, followed by any combination of characters.

The code change for this:

diff --git a/src/nwsapi.js b/src/nwsapi.js
index c39c475..9eb1b96 100644
--- a/src/nwsapi.js
+++ b/src/nwsapi.js
@@ -76,7 +76,7 @@
   GROUPS = {
     // pseudo-classes requiring parameters
     linguistic: '(dir|lang)\\x28\\s?([-\\w]{2,})\\s?(?:\\x29|$)',
-    logicalsel: '(is|where|matches|not)\\x28\\s?([^()]*|[^\\x28]*\\x28[^\\x29]*\\x29)\\s?(?:\\x29|$)',
+    logicalsel: '(is|where|matches|not)\\x28(.*)\\x29',
     treestruct: '(nth(?:-last)?(?:-child|-of-type))(?:\\x28\\s?(even|odd|(?:[-+]?\\d*)(?:n\\s?[-+]?\\s?\\d*)?)\\s?(?:\\x29|$))',
     // pseudo-classes not requiring parameters
     locationpc: '(link|visited|target)\\b',

I'm not sure why there's an (?:\\x29|$)(a closing parenthesis or the end of the line) there, is there a case I'm missing? Seems to suggest that pseudo-classes left open, such as :is(ul are valid selectors.

dperini commented 4 years ago

You guessed it correctly about (?:\x29|$). You will see that pattern spread everywhere in nwsapi RegExp.

"That is how selector/CSS parsing works." So I was told by Anne van Kesteren and Tab Atkins. Expand and follow the threads around these same dates :https://twitter.com/annevk/status/924253443920990209 This opened up many questions and answer about this unknown aspect. I just hope my interpretation was correct and the implementation reflects reality.

I will take your observations and think how to rework the RegExp for parsing the multiple arguments (for :is() and similar).

danburzo commented 4 years ago

Ah, thanks for the reference. That does complicate things a bit...

ExE-Boss commented 4 years ago

The bug is caused by the fact that Regular Expressions can’t be recursive in JavaScript (this is also why PrismJS only supports a few levels of nesting).

The only real way to fix it fully would be to implement a full CSS Selector syntax parser.

vincenthongzy commented 2 years ago

Hey there, could we close this issue since its already fixed and released in v2.2.1 ? I was tracking some problems related to the :is pseudo-class being thrown as Syntax Error in my jest tests, then I stumbled upon this issue. Luckily I checked the commit history, otherwise I would be blocked for a longer period of time.