w3c / webdriver-bidi

Bidirectional WebDriver protocol for browser automation
https://w3c.github.io/webdriver-bidi/

Web Extensions: install / uninstall #548

Open sadym-chromium opened 1 year ago

whimboo commented 1 year ago

This topic has been discussed at TPAC 2023. The minutes can be found at: https://www.w3.org/2023/09/15-webdriver-minutes.html#t04

whimboo commented 9 months ago

Maybe the definition of extensions in the capabilities could look like the following:

{
  "extensions": [
    "<base64_encoded_extension>",
    {"extension": "<base64_encoded_extension>", "installTemporary": true, "allowPrivateBrowsing": true}
  ]
}

That means that by default extensions could be listed, as usual, as a list of base64-encoded strings. If more options are needed for the install process, an object could be passed with additional flags, all of them optional and defaulting to false?
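
To make the mixed string/object list concrete, here is a minimal sketch of how a remote end might normalize such a capability into one shape. The `ExtensionInstall` class and `normalize_extensions` function are hypothetical names used only for illustration; the flag names come from the example above, and defaulting them to false is the open question raised there, not settled behaviour.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ExtensionInstall:
    """One normalized entry of the proposed "extensions" capability (hypothetical type)."""
    extension: str                        # base64-encoded extension archive
    install_temporary: bool = False       # assumed default, per the proposal above
    allow_private_browsing: bool = False  # assumed default, per the proposal above


def normalize_extensions(entries: list[Union[str, dict]]) -> list[ExtensionInstall]:
    """Accept bare base64 strings or option objects and return uniform records."""
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # Short form: just the base64 payload, all flags default to False.
            normalized.append(ExtensionInstall(extension=entry))
        elif isinstance(entry, dict):
            # Long form: payload plus optional flags.
            normalized.append(ExtensionInstall(
                extension=entry["extension"],
                install_temporary=bool(entry.get("installTemporary", False)),
                allow_private_browsing=bool(entry.get("allowPrivateBrowsing", False)),
            ))
        else:
            raise TypeError(f"unsupported extension entry: {entry!r}")
    return normalized
```

With this shape, the short form and the long form with no flags set are equivalent, which is what "optional and default to false" would imply.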

OrKoN commented 6 months ago

CDP now supports loading unpacked extensions at runtime: https://chromedevtools.github.io/devtools-protocol/tot/Extensions/#method-loadUnpacked (with some restrictions, such as a special flag and a pipe connection).
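
For reference, a rough sketch of what a client would send for that method. The method name and its `path` parameter are taken from the linked CDP documentation; the null-byte message terminator reflects my understanding of the pipe transport and should be treated as an assumption, as should the helper name `load_unpacked_message`.

```python
import json


def load_unpacked_message(message_id: int, extension_dir: str) -> bytes:
    """Build a CDP Extensions.loadUnpacked request for an unpacked extension directory."""
    message = {
        "id": message_id,
        "method": "Extensions.loadUnpacked",
        "params": {"path": extension_dir},
    }
    # Assumption: messages on the debugging pipe are JSON terminated by a NUL byte.
    return json.dumps(message).encode("utf-8") + b"\0"
```

Note that this takes a directory path on the browser's machine, so it does not cover the case where the extension has to be sent over the wire.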

css-meeting-bot commented 2 months ago

The Browser Testing and Tools Working Group just discussed WebExtensions CG.

The full IRC log of that discussion:
<AutomatedTester> topic: WebExtensions CG
<AutomatedTester> github: https://github.com/w3c/webdriver-bidi/issues/548
<robwu> https://github.com/w3c/webdriver-bidi/pull/778
<jimevans> jgraham WebExtensions CG would like to extend testing in WPT
<jgraham> q+
<jimevans> To do this we need to be able to install and uninstall extensions at runtime
<jimevans> ack
<simonstewart> q+
<AutomatedTester> ack jgraham
<jimevans> From the WebDriver BiDi point of view, a PR has been submitted.
<AutomatedTester> s/From the WebDriver BiDi point of view, a PR has been submitted./jgraham: From the WebDriver BiDi point of view, a PR has been submitted.
<xenon> q+
<jimevans> jgraham: It's a useful use case. And the PR is a reasonable start.
<oliverdunk> q+
<jimevans> jgraham: At the moment, there are issues with definitions of what defines an extension, and we will need to do a lot of definition (or hand waving) to be able to fully spec the behaviour
<jimevans> ack simonstewart
<simonstewart> https://github.com/SeleniumHQ/selenium/blob/43eb1e5477c9ef30bf6e2097ef984dec3f25077f/java/src/org/openqa/selenium/firefox/HasExtensions.java
<jimevans> simonstewart: Seconding what jgraham said. Firefox's WebDriver extension already has some methods for extension installation/uninstallation.
<AutomatedTester> ack xenon
<jgraham> qq+
<AutomatedTester> ack jgraham
<Zakim> jgraham, you wanted to react to xenon
<jimevans> xenon: Biggest concern is that we (Apple) don't have an implementation of WebDriver BiDi. We would need something in WebDriver classic sooner rather than later rather than relying on a BiDi implementation.
<AutomatedTester> ack oliverdunk
<sadym> q+
<jimevans> jgraham: We already implement extension methods in Firefox, so I don't see any issues with specifying it in WebDriver classic. Some may have issues
<simonstewart> q+
<jimevans> oliverdunk: I'm confused about the order of operations. Do you need to add to WebDriver BiDi first, before adding to WebDriver classic?
<AutomatedTester> ack sadym
<jgraham> q+
<jimevans> oliverdunk: For non-WPT use cases it may be useful to be able to send an extension over the wire to a remote end.
<jimevans> sadym: Implementation-wise, for chromedriver, we would prefer to implement BiDi first, rather than classic.
<AutomatedTester> ack simonstewart
<jimevans> Not impossible, but preferable.
<simonstewart> q-
<AutomatedTester> ack jgraham
<jimevans> sadym: it's not a blocker, to implement in classic first.
<jimevans> jgraham: As far as the existing PR, you either send a path to a zipped file, or a binary encoded zipped file. Similar to what we do for profiles in Firefox.
<jimevans> jgraham: That seems to work well, so there's nothing technically infeasible about it.
<xenon> q+
<AutomatedTester> ack xenon
<jimevans> jgraham: If we were concerned about file system access, it would be completely reasonable to use something like a base64-encoded file as a transmission mechanism.
<jimevans> xenon: We will need to make this work for iOS also so sending across the wire is probably an approach we will be needing. We can't necessarily rely on file paths.
<simonstewart> q+
<gsnedders> q+
<robwu> q+
<jimevans> jgraham: For the Firefox use case, for certain use cases, paths are more efficient, but it's not required for a given implementation.
<AutomatedTester> ack simonstewart
<jimevans> [discussion about WPT use case]
<jimevans> simonstewart: For the Selenium use case, since we never can be certain of the availability of the location, we will probably use the base64-encoded case as well.
<AutomatedTester> ack gsnedders
<jimevans> gsnedders: This is not the only case we've discussed recently where the local and remote ends are not the same.
<jimevans> gsnedders: Maybe we need an indicator of whether the local and remote ends are colocated? But jgraham is looking aghast at that suggestion.
<AutomatedTester> q?
<jimevans> jgraham: it's probably best not to rely on this.
<AutomatedTester> ack robwu
<zombie8> q+
<jimevans> robwu: I'd like to make sure we work with something portable, so that the user can be sure that it is reliable.
<AutomatedTester> ack zombie8
<jimevans> [general agreement]
<dotproto> +q
<jgraham> q+
<jgraham> ack zombie
<jimevans> zombie8: When we write extension APIs, we need a way to communicate with the extension. I'm guessing those types of commands would be easier to add in BiDi. Is that the proper way to think about it?
<simonstewart> q+
<dotproto> ack jgraham
<jimevans> jgraham: The answer is "yes, in principle" but in practice, there's nothing to prevent another approach.
<gsnedders> RRSAgent, make minutes
<jimevans> jgraham: In BiDi, you have access to more realms for script execution and can run scripts in those contexts. We also have a mechanism for communicating back.
<RRSAgent> I have made the request to generate https://www.w3.org/2024/09/26-webdriver-minutes.html gsnedders
<jimevans> jgraham: In principle, BiDi is well positioned for extensions.
<jimevans> zombie8: Do we have a way to access the extension context?
<jimevans> jgraham: Like a content script?
<AutomatedTester> ack dotproto
<jimevans> [discussion around clarification of how extensions work and what contexts are available]
<xenon> q+
<jimevans> dotproto: It feels like it's probably too early for this, but there's a concept of native messaging. Is there a way we can automate and test that messaging?
<xenon> q-
<jimevans> [general discussion, feels like it's out of scope for now, because of platform differences]
<AutomatedTester> ack simonstewart
<jgraham> q+
<jimevans> simonstewart: Feels like some design work to be done. In classic, it seems like it might be enough for install/uninstall end points. For BiDi, it might be better to define certain events around the lifecycle of extensions.
<AutomatedTester> ack jgraham
<jimevans> simonstewart: Are there permissions that may be requested from extensions?
<jimevans> [unknown] Yes, it may require some extensions
<xenon> xenon^
<AutomatedTester> q?
<robwu> q+
<simonstewart> s/[unknown] /xenon: /
<jimevans> jgraham: More than just a simple command may be a better fit for BiDi.
<AutomatedTester> ack robwu
<jgraham> Specifically the claim that we can do this in classic was limited to install/uninstall, not other things
<dotproto> s/Specifically the claim that we can do this in classic was limited to install/uninstall, not other things/jgraham: Specifically the claim that we can do this in classic was limited to install/uninstall, not other things/
<jimevans> robwu: There are sometimes subtle differences between temporary or permanent installs.
<oliverdunk> q+
<dotproto> qq+
<AutomatedTester> ack dotproto
<Zakim> dotproto, you wanted to react to robwu
<jimevans> robwu: Packaging format also sometimes informs that as well.
<dotproto> dotproto: temporary is a Firefox concept
<jimevans> simonstewart: It may be pointed out that WebDriver sessions are mostly ephemeral.
<jgraham> q+
<dotproto> Kiara: temporary also applies to Safari
<jimevans> robwu: I would like the testing mechanism for extensions to be as close to production as possible.
<jimevans> q?
<AutomatedTester> ack oliverdunk
<simonstewart> q+
<jimevans> oliverdunk: If I'm understanding correctly, you want another install mode where the extension "acts like" it was from an extension store, but actually isn't.
<AutomatedTester> ack jgraham
<jimevans> robwu: We want to think about that as something we accept in the protocol.
<xenon> q+
<jimevans> jgraham: I think from a protocol point of view, it seems plausible (semantics notwithstanding). For the WPT case, there may be browser-specific workarounds to handle the permanent install case.
<AutomatedTester> ack simonstewart
<jimevans> jgraham: If you don't have the ability to run with specific profiles, we may need to investigate.
<Kiara> q
<jimevans> simonstewart: One thing to bear in mind, the things we put into the WebDriver spec are things that should be put into all browsers.
<jimevans> simonstewart: For browser-specific things, extension commands may be the proper course.
<Kiara> q+
<jimevans> simonstewart: You may want to first focus on "packed" extensions, before attempting to support unpacked directories.
<jimevans> oliverdunk: terminology clarification on "packed" (signed, from a store) vs. "unpacked" vs. "zipped file"
<jimevans> ack xenon
<AutomatedTester> ack Kiara
<xenon> q+
<jgraham> q+
<oliverdunk> q+
<AutomatedTester> ack xenon
<jimevans> Kiara: Like native messaging, storage might be interesting too. Have you thought about local storage vs. session storage and how to manage that?
<jimevans> xenon: One other note about "temporary-ness", when we know we are in a testing harness, we set timing for alarms, which is another item to consider along with storage.
<AutomatedTester> q?
<AutomatedTester> ack jgraham
<jimevans> jgraham: On storage, WebDriver does not define any storage APIs at present. Seems like that would be useful, but it's not something we have today.
<jimevans> xenon: Extensions already have APIs for inspecting their own storage, so they can test that, and there may not be any need for external examination.
<AutomatedTester> ack oliverdunk
<jimevans> jgraham: Just pointing out it might have an effect on temporary vs. permanence.
<robwu> q+
<jimevans> oliverdunk: I can't immediately think of situations where storage is super important, so it sounds like treating it as temporary would be okay for now.
<jgraham> qq+
<AutomatedTester> ack jgraham
<Zakim> jgraham, you wanted to react to oliverdunk
<jimevans> oliverdunk: I'm interested in the WebDriver philosophy on this.
<gsnedders> qq+
<jimevans> jgraham: There are lots of reasons we might want to have storage in scope for WebDriver BiDi. There are currently hacks in WPT tests for clearing storage, etc. So we might entertain the notion of exposing that for WebDriver.
<AutomatedTester> ack gsnedders
<Zakim> gsnedders, you wanted to react to jgraham
<oliverdunk> qq+
<jimevans> gsnedders: Different browsers have had different notions of partitioning and storage, and knowing what storage realm something has been written to is not necessarily easy cross-browser.
<AutomatedTester> ack oliverdunk
<Zakim> oliverdunk, you wanted to react to gsnedders
<jimevans> oliverdunk: extension storage is generally more privileged, so that may not be too bad.
<AutomatedTester> ack robwu
<jimevans> robwu: Extensions also have a concept of "split mode."
<simonstewart> qq+
<AutomatedTester> ack simonstewart
<Zakim> simonstewart, you wanted to react to robwu
<jimevans> robwu: The current PR has a command for removing an extension. It would be useful to have a way to know an extension has been started, so that installing a subsequent extension, or operating on an extension, works reliably.
<jgraham> q+
<jimevans> simonstewart: The event emission in BiDi is probably best suited for that kind of operation.
<jimevans> simonstewart: Look at the current BiDi network interception for an example that might more closely model what you're asking for.
<AutomatedTester> q?
<AutomatedTester> ack jgraham
<jimevans> jgraham: A written concrete example might be more useful to reason about. But the suggestion for BiDi sounds more like the proper model to allow for waiting for something to happen in an extension, and then do something else.
<jimevans> jgraham: But it's probably useful to start with the simple case of installing an extension, then moving to more complex cases.
<jimevans> jgraham: Next steps, we should consider the current PR, and folks from Web Extensions should review it and see how much of your requirements that it meets.
<jimevans> jgraham: so the next step is to land the existing PR.
<jimevans> [discussion of process for who should review and approve]
<jimevans> jgraham: Let me know who should be named as reviewers.
<AutomatedTester> q?
<jimevans> [more discussion around iteration on PR]
<jimevans> jgraham: We haven't had a terribly formal lifecycle on how PRs are landed. We generally land it when there's consensus, and we can change and iterate also.
<jimevans> action: robwu to create document describing requirements from the Web Extensions CG
<simonstewart> q+
<jimevans> oliverdunk: How should we collaborate going forward?
<jimevans> AutomatedTester: "It depends."
<jimevans> AutomatedTester: It matters whether it's something that would be useful to everyone or more useful to just the Web Extensions CG. We are always open to collaboration however.
<simonstewart> q-
<jimevans> jgraham: Process-wise, if you want to open issues, feel free. We also have monthly face to face meetings which we are more than happy to put your items on the agenda and discuss.
<gsnedders> RRSAgent, make minutes
<AutomatedTester> RRSAgent: make minutes
<RRSAgent> I have made the request to generate https://www.w3.org/2024/09/26-webdriver-minutes.html gsnedders
<RRSAgent> I have made the request to generate https://www.w3.org/2024/09/26-webdriver-minutes.html AutomatedTester
<jgraham> RRSAgent: this meeting spans midnight
<RRSAgent> ok, jgraham; I will not start a new log at midnight
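
The two transfer options raised in the log (a path to a zipped file, or the zipped file itself sent in encoded form) could be sketched roughly as follows. The log only describes the two options in prose, so the method name `webExtension.install`, the `extensionData` field, and the `archivePath` / `base64` type tags are assumptions for illustration; check the PR for the actual shapes. The helper `install_command` is hypothetical.

```python
import base64
import itertools
from typing import Optional

_command_ids = itertools.count(1)  # monotonically increasing command ids


def install_command(*, archive_path: Optional[str] = None,
                    archive_bytes: Optional[bytes] = None) -> dict:
    """Build an install command from either a remote-end path or raw archive bytes."""
    if (archive_path is None) == (archive_bytes is None):
        raise ValueError("pass exactly one of archive_path or archive_bytes")
    if archive_path is not None:
        # Cheap, but only valid when the remote end can read this path.
        data = {"type": "archivePath", "path": archive_path}
    else:
        # Portable: works when local and remote ends are not colocated.
        data = {"type": "base64",
                "value": base64.b64encode(archive_bytes).decode("ascii")}
    return {
        "id": next(_command_ids),
        "method": "webExtension.install",  # assumed method name
        "params": {"extensionData": data},
    }
```

A client that cannot be sure the remote end shares its file system (the Selenium and iOS cases mentioned in the log) would always use the `archive_bytes` form; the path form is an optimization for colocated setups.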