w3c / webdriver-bidi

Bidirectional WebDriver protocol for browser automation
https://w3c.github.io/webdriver-bidi/
336 stars 35 forks source link

Consider removing `innerText` locator from `browsingContext.locateNodes` in favor of `xpath` #687

Closed sadym-chromium closed 3 months ago

sadym-chromium commented 3 months ago

From the Chromium implementations perspective, the innerText selector will be implemented via constructing xPath. If it is a case for other implementations, we should consider delegating construction of this xPaths to clients, and keep only css and xpath locators.

sadym-chromium commented 3 months ago

@jimevans WDYT?

jimevans commented 3 months ago

@sadym-chromium Does the xpath locator take into account style information (element visibility, etc.) as innerText does? If not, then no, I would oppose that. I really, really, want to keep a separate, dedicated locator for finding by visible text (where "visible" is the key differentiator in that locator).

christian-bromann commented 3 months ago

From my point of view it would be much more helpful to expose a primitive to detect a visible state of an element and have the implementing framework run a sequential step if there is a desire to filter for elements to be visible. I am not too familiar with Selenium's Atoms though to understand if this is feasible though. Alternatively I'd suggest to make the visibility state a filter option as there are certainly use cases where I would like to find an element by innerText that is not visible.

jimevans commented 3 months ago

@christian-bromann There is no web primitive that can tell you if an element is "displayed," in the sense that WebDriver classic means "can be seen by the user." That's why the classic spec punts on the issue and defers to the Selenium atom, which is, and always has been, a suboptimal solution. That the WG didn't put a stake in the ground and say "this is what we mean when we say 'an element is displayed'" is a failure of the classic spec, in my opinion, though I know there are several people who disagree. If we can come to a consensus on what "displayed" means for the BiDi spec, I'm all for that.

At any rate, innerText already takes element styling, including visibility, into account. It's spec'ed to do so. If you want a locator for text of an element that may be hidden by styling, textContent is available, and just needs to be added (as an option to the innerText locator or as a separate locator, I'm fine with either). And while you might want to find an element by text that isn't visible in the element, I would submit, anecdotally, that many less sophisticated users want to find elements by their visible text or accessibility attributes over either XPath or CSS selectors (c.f. the rise in popularity of testing frameworks that choose to not base themselves on the WebDriver classic spec).

And the notion of "filtering" elements, as long as said filter operation requires a separate round-trip to the remote end for each element being queried is not a solution I favor either. Since I was talked out of composable locators, I'd rather see command batching before I see removal of something like this.

Sorry, folks, I remain opposed to removing a locator using contained text exclusive of XPath.

shs96c commented 3 months ago

I'm in agreement with @jimevans. The experience we've had from Selenium is that testers tend to look at the screen and just type the text that they see into their locators. Using innerText is the closest we can get to this in a standardised way. We've long discussed making the "find by text" implementation in WebDriver Classic work with that rather than the atom it currently uses.

Given that we're trying to reduce the chattiness of the protocol, anything which involves a round-trip to the remote end is also something to be avoided.

sadym-chromium commented 3 months ago

The innerText is set in case of the node is rendered and is visible (not hidden or collapse). We can extend NodeRemoteValue with 'visibility'.

User makes xPath request, gets the node they are interested in, and check it's visibility without an extra roundtrip. Would it be sufficient for the mentioned scenarios?

shs96c commented 3 months ago

I don't think so. My understanding is that the text function in XPath will return the equivalent of textContent and not innerText. In addition, I think if someone is looking for a node by its text, the the requirement to be visible is eminently sensible.

That is, if someone wanted to use XPath, they would have used an XPath locator. The semantics of finding by text is different from that.

OrKoN commented 3 months ago

I think we discussed this during the initial spec discussion and I think there are good arguments for innerText. @sadym-chromium let's close the issue?