w3c / webdriver-bidi

Bidirectional WebDriver protocol for browser automation
https://w3c.github.io/webdriver-bidi/
378 stars 42 forks source link

Support finding iframe elements by browsing context id #794

Open OrKoN opened 1 month ago

OrKoN commented 1 month ago

Puppeteer offers a way to get the frame owner element using the browsing context/navigable ID via the Frame.frameElement() API. Currently, the implementation in Puppeteer relies on finding all iframe/frame elements, serializing their contentWindow and comparing the ID of the serialized window to the id of the target frame. That does not work for Shadow roots as it is not possible to locate iframe/frame elements within closed shadow roots.

The following solutions are possible:

1) Add browsingContext.getFrameOwner(browsingContextId) -> sharedReference 2) Modify locateNodes to allow piercing shadow roots including DOM nodes 3) Extend locateNodes with a new locator type (e.g., FrameOwnerLocator)

Note that having a dedicated locator or method would allow for a more efficient implementation as generally implementations do not need to look over the entire dom tree to find a frame owner element for a given navigable. In the HTML spec terms this seems to correspond to the concept of a navigable container: https://html.spec.whatwg.org/multipage/document-sequences.html#navigable-container

Related Playwright feedback:

Implement DOM.getFrameOwner and DOM.describeNode to enable converting between DOM node handles and Frame handles. This would improve the implementation of Locator.contentFrame() and FrameLocator.owner(), as the current workaround using contentWindow does not work with shadow DOM, as seen in the page/frame-frame-element.spec.ts tests.

christian-bromann commented 1 month ago

This would be helpful for WebdriverIO as well as we currently have to maintain rather complicated mechanism to support automated piercing through shadow roots. If the user switches into an iframe we loose the ability to use the same mechanism for element lookups as it doesn't seem trivial to keep track of the context id the user is in. Users still like to switch to iframes by providing an element id to that element node. It would be helpful as well if Bidi could have an event for when user switch to different contexts. WebdriverIO could keep these changes in memory and use it for upcoming element look ups.

css-meeting-bot commented 1 week ago

The Browser Testing and Tools Working Group just discussed Support finding iframe elements by browsing context id.

The full IRC log of that discussion <AutomatedTester> topic: Support finding iframe elements by browsing context id
<AutomatedTester> github: https://github.com/w3c/webdriver-bidi/issues/794
<AutomatedTester> orkon: proposal of a way to find the container of the navigable by browsing context id
<AutomatedTester> ... there is a request from playwright and puppeteer
<AutomatedTester> ... there are multple ways that we can do this. I have written a PR of this is
<AutomatedTester> ... there is another way that whimboo suggested
<simons> q+
<AutomatedTester> ... I like this way as it like how we locate DOM nodes
<AutomatedTester> ack next
<jgraham> q+
<sadym> q+
<AutomatedTester> simons: I think that it should be a locator as that is how classic/selenium does this
<AutomatedTester> there is one wrinkle that people like to find iframes by id or name
<AutomatedTester> ack next
<AutomatedTester> jgraham: This is about finding iframes in a navigable which is different to classic. Using a locator feels weird but I can see the appeal
<AutomatedTester> ... I think that I would prefer a different command
<sadym> q-
<orkon> q+
<AutomatedTester> ... perhaps actually this is going to be fine and won't make much difference. I am happy the PR mostly as it is
<AutomatedTester> ack next
<AutomatedTester> orkon: regarding this PR. At the moment we discard the locator... I wonder if there is <missed question> and should we throw an error when using the locator or should we just ignore them
<AutomatedTester> q?
<sadym> q+
<AutomatedTester> jgraham: is the second question more of a CDDL question?
<AutomatedTester> orkon: not really... this there are other attributes
<AutomatedTester> ... and they are optional fields
<AutomatedTester> jgraham: I think this is pushing me to this a new command
<orkon> q+
<orkon> q-
<AutomatedTester> ... there is something to be said about having a command that doesn't take invalid commands
<AutomatedTester> ack next
<AutomatedTester> sadym: if we send command with locate nodes and then get a browsing context
<AutomatedTester> orkon: yes...
<AutomatedTester> sadym: this feels wrong. We are going to need to send other fields here
<orkon> q+
<AutomatedTester> orkon: we could have the search from the parent context to return the child
<AutomatedTester> sadym: this would change the way that the command works and feels weird
<AutomatedTester> jgraham: if we need to call getTree to be able to use the command that complicates this
<jgraham> RRSAgent, make minutes
<AutomatedTester> ack next
<RRSAgent> I have made the request to generate https://www.w3.org/2024/11/13-webdriver-minutes.html jgraham
sadym-chromium commented 1 week ago
jgraham: if we need to call getTree to be able to use the command that complicates this

Having only a sharedId without browsing context it belongs to does not make that much sense, so the user needs it anyways.