w3c / webdriver-bidi

Bidirectional WebDriver protocol for browser automation
https://w3c.github.io/webdriver-bidi/

Shadow DOM selectors in WebDriver (BiDi) #342

Closed OrKoN closed 1 year ago

OrKoN commented 1 year ago

WebDriver provides multiple strategies for querying elements, but none of them allows querying across shadow roots in a single request. Querying elements inside shadow roots is a common automation task that is especially relevant for automated testing. The question of how to query elements inside shadow roots is asked often, and the currently recommended way is to query step by step (first select the host, then select the element inside the host's shadow root, and so on). This means that an element cannot be captured with a single selector. Instead, querying elements in shadow roots requires a mix of multiple selectors and code. On top of that, various frameworks offer different ways to deal with the problem and invent custom and largely incompatible syntaxes. Approaches we have seen so far:

1) storing a list of selectors, where each selector except the last one identifies a shadow DOM host
2) redefining the behaviour of CSS combinators such as child/descendant
3) storing the entire JS code needed to query the shadow DOM elements
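As an illustration of approach 1, here is a minimal sketch of how a client might turn a stored list of CSS selectors into a single JavaScript expression that walks open shadow roots. The function name `build_shadow_query` is hypothetical and not part of any WebDriver client; the generated expression relies only on the standard DOM `querySelector` and `shadowRoot` members, so it cannot reach closed shadow roots.

```python
import json


def build_shadow_query(selectors):
    """Build a JavaScript expression from a chain of CSS selectors.

    Every selector except the last is assumed to identify a shadow host;
    the next selector is evaluated inside that host's (open) shadow root.
    """
    if not selectors:
        raise ValueError("at least one selector is required")
    expr = "document"
    last = len(selectors) - 1
    for index, selector in enumerate(selectors):
        # json.dumps yields a correctly quoted and escaped JS string literal.
        expr += f".querySelector({json.dumps(selector)})"
        if index < last:
            # Optional chaining makes the expression evaluate to undefined
            # instead of throwing when a host or its shadow root is missing.
            expr += "?.shadowRoot?"
    return expr


print(build_shadow_query(["div > some-host", "a.link"]))
```

The resulting string could then be evaluated in the page through whatever script execution mechanism the client already uses. It also shows the drawback described above: the "selector" is really a list plus code, not a single string.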

We think there might be an opportunity for WebDriver to standardize the syntax for combining selectors to indicate queries for shadow DOM elements. It could potentially be included in the WebDriver BiDi and WebDriver Classic standards.

One proposal would be to support the >>> (deep descendant) and >>>> (deep child) combinators in CSS selectors to indicate shadow roots in a query. The deep descendant combinator was previously defined in the webcomponents spec but was removed: https://github.com/WICG/webcomponents/issues/78

Another way would be to support something similar to the CSS extensions spec: https://drafts.csswg.org/css-extensions/ but it might clash with actual CSS implementations.

One requirement to consider is that extending CSS selectors might not be sufficient. Ideally, the syntax allows combining multiple locator strategies in one query. For example, the shadow host might be selected using CSS and the element inside its shadow root using a text selector. With the >>> syntax applied across multiple strategies, that could look like: div > some-host >>> link-text-strategy("my link text").
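To make the combined syntax more concrete, here is a rough sketch of how a query could be split into per-scope segments on the proposed >>> and >>>> combinators. The function name `split_deep_selector` and the returned pair format are assumptions made for illustration. The sketch only tokenizes: it does not interpret the CSS or the `link-text-strategy(...)` part, and it skips combinators that appear inside quoted strings.

```python
def split_deep_selector(query):
    """Split a query on the proposed >>> / >>>> combinators.

    Returns a list of (combinator, segment) pairs. The combinator is None
    for the first segment and ">>>" or ">>>>" for every later one.
    """
    parts = []
    combinator = None
    current = []
    quote = None  # the active quote character while inside a string
    i = 0
    while i < len(query):
        ch = query[i]
        if quote:
            current.append(ch)
            if ch == "\\" and i + 1 < len(query):
                # Keep an escaped character verbatim, e.g. \" inside "...".
                current.append(query[i + 1])
                i += 2
                continue
            if ch == quote:
                quote = None
            i += 1
        elif ch in "\"'":
            quote = ch
            current.append(ch)
            i += 1
        elif query.startswith(">>>", i):
            # Prefer the longer token so >>>> is not read as >>> plus >.
            token = ">>>>" if query.startswith(">>>>", i) else ">>>"
            parts.append((combinator, "".join(current).strip()))
            combinator, current = token, []
            i += len(token)
        else:
            current.append(ch)
            i += 1
    parts.append((combinator, "".join(current).strip()))
    return parts


print(split_deep_selector('div > some-host >>> link-text-strategy("my link text")'))
```

Each segment could then be dispatched to the matching locator strategy, with the result of the previous segment's shadow root as the new search scope.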

Initially, we think that the syntax should only work for open shadow roots. So far, we have not received feedback that querying closed shadow roots is desirable. Since closed shadow roots are not accessible via DOM APIs, this kind of selector probably should allow querying them as well.

css-meeting-bot commented 1 year ago

The Browser Testing and Tools Working Group just discussed Shadow DOM selectors in WebDriver (BiDi).

The full IRC log of that discussion
<jgraham> Topic: Shadow DOM selectors in WebDriver (BiDi)
<jgraham> github: https://github.com/w3c/webdriver-bidi/issues/342
<jgraham> scribenick: jgraham
<jgraham> orkon: Is this in the scope of WebDriver/BiDi and if it is how do we move forward? The problem is there's currently no way to address elements inside the shadow DOM in BiDi. Only scoped to open shadow roots. Currently need to do DOM walking by hand. Current solutions are verbose and don't allow combining different selectors.
<orkon> `div > some-host >>> link-text-strategy("my link text")`
<jgraham> orkon: Proposal is to reuse selectors that were removed from the CSS spec. Would allow multiple strategies for querying.
<jgraham> orkon: Would allow combining CSS selectors with locator strategies that would follow a consistent syntax. WebDriver classic has a set of such strategies already e.g. xpath.
<jgraham> q+
<jgraham> ack whimboo
<jgraham> orkon: That's the proposal to discuss.
<jgraham> ack jdescottes
<simons> q+
<jgraham> ack jgraham
<JimEvans> q+
<jgraham> jgraham: From a user point of view I see how this makes sense, but I'm worried about collisions with future CSS syntax, and also with how you would actually implement this. But I think we'd need to talk to the CSSWG with our use cases.
<jgraham> orkon: Some of this would be WebDriver specific, don't propose adding it to the actual selector engines. But more research is needed on how to avoid the compat hazard.
<jgraham> ack simons
<jgraham> simons: I'm normally a fan of pushing complexity into the remote end. But in this case I think this should be in the local end. We have the primitives in the spec to allow you to implement this in the spec, with the parsing happening on the local end. That would remove the worry about something colliding with future CSS. It also means that you can mix-and-match style as in the example.
<jgraham> simons: There is some precedent for pushing this kind of thing into clients e.g. original webdriver had find by element name, but that's now only implemented in client APIs rather than directly in the protocol.
<jgraham> ack JimEvans
<jgraham> JimEvans: I agree with simons. From an implementation standpoint, this sounds like something that would require remote end implementations to develop a selector parser, and that seems like a complex thing to spec. So I think it should perhaps live in the local end.
<jgraham> q?
<jgraham> orkon: These are valid arguments. We do see many libraries implement this over and over again, and they aren't compatible with each other. Makes it harder to use the protocol directly. It's hard to spec well and avoid conflicts. I'm not sure about the best tradeoffs.
<jgraham> ScribeNick: orkon
OrKoN commented 1 year ago

I believe the conclusion is that this sort of functionality should be up to the clients to implement. Closing the issue.