This issue aims to discuss potentially aligning document targeting concepts between WebExtensions and WebDriver BiDi.
In the past year, the WebExtensions Community Group (WECG) has aligned around and introduced the concept of a documentID (Chromium design doc). This capability was introduced to address situations where the document an extension is targeting may change between when information about the document was received by the extension and when the desired action would be executed.
Let's take a look at an example. Previously, if an extension wanted to inject a content script into the main frame of a website in response to some event, the extension could only target either the tab or frame by ID.
Unfortunately, there's a major flaw with this approach: a tab ID or frame ID does not uniquely identify a document, because the same tab or frame can host many documents over its lifetime. As such, it's possible that by the time the script is injected into the specified target, the frame has already navigated to a new document or the target has otherwise been invalidated. This issue is broader than script execution; it also applies to message passing and other instances where targeting of a specific destination context is necessary.
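To make the race concrete, here is a minimal sketch in TypeScript. It is a toy in-memory model, not a browser API: `FakeTab`, `FrameState`, `injectByFrame`, and `injectByDocument` are all hypothetical names invented for illustration. It only assumes what is described above, namely that a frame ID stays stable across navigations while each loaded document gets a fresh ID.

```typescript
// Hypothetical in-memory model of a tab; none of these names are real browser APIs.
interface FrameState {
  frameId: number;
  documentId: string; // assumed to change on every navigation
  url: string;
}

type InjectResult =
  | { ok: true; ranIn: string }
  | { ok: false; error: string };

class FakeTab {
  private frames: FrameState[] = [];
  private counter = 0;

  // Load a new document into a frame: the frame ID is reused, the document ID is new.
  navigate(frameId: number, url: string): FrameState {
    this.counter += 1;
    const next: FrameState = { frameId, documentId: "doc-" + this.counter, url };
    this.frames = this.frames.filter((f) => f.frameId !== frameId);
    this.frames.push(next);
    return next;
  }

  // Frame targeting: runs in whatever document the frame holds right now.
  injectByFrame(frameId: number): InjectResult {
    for (let i = 0; i < this.frames.length; i++) {
      if (this.frames[i].frameId === frameId) {
        return { ok: true, ranIn: this.frames[i].url };
      }
    }
    return { ok: false, error: "no such frame" };
  }

  // Document targeting: fails if that exact document is gone.
  injectByDocument(documentId: string): InjectResult {
    for (let i = 0; i < this.frames.length; i++) {
      if (this.frames[i].documentId === documentId) {
        return { ok: true, ranIn: this.frames[i].url };
      }
    }
    return { ok: false, error: "no such document" };
  }
}

const tab = new FakeTab();
// The extension observes document A in the main frame...
const seen = tab.navigate(0, "https://example.com/a");
// ...but the frame navigates to document B before the injection happens.
tab.navigate(0, "https://example.com/b");

const byFrame = tab.injectByFrame(seen.frameId);
const byDocument = tab.injectByDocument(seen.documentId);
```

In this model, `byFrame` succeeds but runs in document B, which the extension never intended to touch, while `byDocument` reports an error that the caller can handle.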
The concept of documentID is currently integrated with the WebExtensions browser.webNavigation and browser.scripting APIs and will be expanding into other APIs in the future. For example, there's currently high-level alignment around introducing a new method (browser.runtime.getContexts()) to retrieve a list of contexts owned by the extension.
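As a rough sketch of how the two APIs fit together, the snippet below builds an injection request that targets a document rather than a frame. It assumes the shape shipped in Chromium, where `webNavigation` event details carry a `documentId` and the `scripting.executeScript()` target accepts a `documentIds` array; other browsers may differ. The interfaces and the `buildDocumentInjection` helper are hypothetical and mirror only the fields used here, they are not official typings.

```typescript
// Minimal structural types, modeled on Chromium's extension APIs (an assumption).
interface CommittedDetails {
  tabId: number;
  documentId: string;
}

interface InjectionRequest {
  target: { tabId: number; documentIds: string[] };
  files: string[];
}

// Hypothetical helper: target the exact document reported by the navigation
// event instead of whatever the frame happens to contain later.
function buildDocumentInjection(details: CommittedDetails, file: string): InjectionRequest {
  return {
    target: { tabId: details.tabId, documentIds: [details.documentId] },
    files: [file],
  };
}

// Placeholder values standing in for a real navigation event.
const request = buildDocumentInjection(
  { tabId: 7, documentId: "EXAMPLE-DOCUMENT-ID" },
  "content.js"
);
```

In an extension, the details object would come from a `webNavigation.onCommitted` listener and the resulting request would be passed to `scripting.executeScript()`.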
During our May 25, 2023 meeting, the WebExtensions Community Group discussed a feature request that would allow extensions to specify which document should evaluate a code string when calling browser.devtools.inspectedWindow.eval() (w3c/webextensions#393). I wanted to highlight that request with this group because of the close relationships between WebDriver BiDi and Chrome DevTools Protocol (CDP), and between CDP and the WebExtensions browser.devtools and browser.debugger APIs.
Would it make sense to integrate the concept of documentID in the WebDriver BiDi spec?
The Browser Testing and Tools Working Group just discussed Targeting documents by ID.
The full IRC log of that discussion
<AutomatedTester> Topic: Targeting documents by ID
<AutomatedTester> github: https://github.com/w3c/webdriver-bidi/issues/434
<AutomatedTester> jgraham_ (IRC): in bidi we route commands to a specific browsing context
<AutomatedTester> ... that is fine but it can cause problems and is subject to race conditions
<AutomatedTester> ... e.g. if the page is navigating when the command is sent, it could be executed in the wrong place
<AutomatedTester> ... the idea is that, in addition to targeting a context/navigable, you would be able to send commands to an id for a document
<sadym> q
<sadym> q+
<AutomatedTester> ... and if it couldn't go to that document, it would error instead of just going ahead
<AutomatedTester> ack next
<AutomatedTester> sadym (IRC): this document idea, do we want it to be used elsewhere?
<AutomatedTester> jgraham_ (IRC): we would want to make sure that it is sent back during the serialisation
<AutomatedTester> ... I haven't thought about scoping events down to a document
<AutomatedTester> ... or network interception down to the document
<simons_> q+
<AutomatedTester> q?
<AutomatedTester> ack next
<AutomatedTester> simons: document IDs are a Chromium idea from what I can see... is this browser agnostic?
<AutomatedTester> ... or can we make it that way
<AutomatedTester> jgraham_ (IRC): we can define it for our purposes
<AutomatedTester> q?
The Browser Testing and Tools Working Group just discussed Document id & routing commands by document.
The full IRC log of that discussion
<AutomatedTester> topic: Document id & routing commands by document
<AutomatedTester> github: https://github.com/w3c/webdriver-bidi/issues/434
<jgraham> zakim, open the queue
<Zakim> ok, jgraham, the speaker queue is open
<AutomatedTester> scribenick: automatedtester
<AutomatedTester> jgraham: The proposal here is when we serialise a document that it has a unique document id that is different to the context/navigable
<orkon> q+
<AutomatedTester> ... the reason for doing this is that some commands shouldn't just be sent to a context but to a specific document
<AutomatedTester> ... the example is you want to run a script in a specific document in a context. If the context navigates, then targeting the document means the script won't run in the wrong place
<AutomatedTester> ... from a spec point of view this is how generally things work through the process and having a document id is easier to target
<AutomatedTester> ... and it removes the chance that what you think will happen might not happen
<AutomatedTester> ... in the original issue in github there is the same request from web extensions
<AutomatedTester> q?
<AutomatedTester> ack next
<Lola> present +
<AutomatedTester> orkon: I was wondering if targeting realms would solve this since a realm is tied to the document lifetime
<AutomatedTester> jgraham: you could specify the realm for execute script but in the case of locateNode and navigate you want it to be specific to that document
<AutomatedTester> orkon: I was meaning that realm could be a sandbox and we target that
<AutomatedTester> ... do we feel that it is still useful to target a context?
<jgraham> q?
<AutomatedTester> jgraham: I think this adds complexity as it means there is more than 1 way to do it
<AutomatedTester> ... and we have to do something in the implementation to route things to the right document and handle the error cases
<AutomatedTester> ... and for subscribing to events we want it to the document
<AutomatedTester> ... I think we would probably need to support both context and document
<AutomatedTester> ... I think the high priority item is to have serialisable document
<AutomatedTester> ... that then opens up the opportunity to do any other changes later
<AutomatedTester> orkon: it is possible for a client to learn when things have executed in the wrong space so I think this could be good
<AutomatedTester> q?
<AutomatedTester> jgraham: so we have agreement that we will serialise the document id back and at a later stage update the other commands