Closed sadym-chromium closed 6 months ago
Would be nice to get input here from people using the protocol in the wild rather than just guessing. Obviously @mathiasbynens and the other Puppeteer people are good candidates, but @shs96c @jimevans and others from Selenium should pitch in, as well as other users like @CanadaHonk.
From a spec point of view I think workers are already supported; implementation-wise it's obviously a bit different.
I agree that various kinds of device emulation and overrides (media queries, timezone, etc.) seem like obvious next steps, because I understand that those are widely used in existing clients. I think there are also a couple of more basic things missing from the existing roadmap:
Also for "incognito" mode, I think we need some clear thought around requirements. Is it just about opening a new incognito window? Firefox probably wants to also be able to expose containers, since they can provide storage isolation between different tests.
Personally, #157 is a major blocker as otherwise there is no (easy) way to have bidirectional communication between a client and a page. Also #119 seems like something which might be often used / handy.
Cookies (reading/writing) are also important for Puppeteer. From the Puppeteer perspective, changing the viewport size (part of the device emulation) is more important than sizing windows (which I think we don't even support). Here is the list of all capabilities Puppeteer needs based on our test suite: https://gist.github.com/OrKoN/0c198de4089e8668a48746b223a37d8d
The Browser Testing and Tools Working Group just discussed "Extend the roadmap".
Window sizing/positioning is important for Selenium, as is cookie manipulation (the latter is more important to implement first than the former). These are the biggest things missing from the roadmap as it currently stands from the Selenium perspective.
It would probably be useful to have some primitives for node inspection for things like "is this element obscured by another element," and "is this element currently in the viewport, and scrolled into the visible portion of any/all enclosing elements' overflow areas." Those are challenging to create pure JavaScript/DOM implementations for. Note that I am expressly not asking for an "is node visible" primitive, as useful as that would be, because there's no easy way to come to a consensus on what "visible" means in spec language. A mechanism for determining when animations of a node are completed would be a significant win for BiDi-based clients.
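To make the "in the viewport" half of that request concrete, here is a naive geometric sketch. It only tests whether an element's bounding box overlaps the viewport rectangle; it deliberately ignores clipping by enclosing overflow areas and occlusion by other elements, which are the parts that are hard to do from script. The `Rect` type and both function names are hypothetical, not part of any spec or client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in CSS pixels (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float


def intersection_area(a: Rect, b: Rect) -> float:
    """Return the overlapping area of two rectangles, or 0.0 if disjoint."""
    overlap_w = min(a.x + a.width, b.x + b.width) - max(a.x, b.x)
    overlap_h = min(a.y + a.height, b.y + b.height) - max(a.y, b.y)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0
    return overlap_w * overlap_h


def is_in_viewport(element: Rect, viewport: Rect) -> bool:
    """Naive check: does any part of the element's box overlap the viewport?

    This says nothing about overflow clipping or about other elements
    painted on top, so it is not a visibility test.
    """
    return intersection_area(element, viewport) > 0
```

A box-overlap test like this is the easy part; a protocol-level primitive would be valuable precisely because the browser already knows the clipping and paint-order information that this sketch leaves out.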
I want to second the suggestion for expanding taking of screenshots to include both full screen, and cropped to a single node.
History traversal is important for API completeness, and there's an extant PR, but it needs work to integrate with changed upstream specs.
I would love to see more standardisation around browser specific capabilities to help spawn a browser in a certain state. There is already a lot of alignment among Firefox and Chrome while having a lot of unknowns in Safari. However while specifics can be up to the browser implementation, it seems we can agree on the following things to be very commonly used:
What I don't know is whether that is something that can be specified in this spec. However, it seems we have kind of already agreed on them, so why not remove the remaining ambiguities?
Another suggestion could be about helping the user force element states for an element that does not depend on the CDP area as mentioned in this Issue: https://github.com/w3c/webdriver-bidi/issues/249
On the other hand, I haven't found anything in this repository about file upload. This is something that has been left behind in the W3C WebDriver specification, and it used to work before W3C. For example, chromedriver can attach files to forms over the wire; or in Selenium terms, the local file detector works.
But this feature is not covered by the W3C WebDriver standard and it's not clear if it will be.
WebDriver BiDi should also cover file download/upload features. After all, they can be important features to test in web applications.
I would love to see more standardisation around browser specific capabilities to help spawn a browser in a certain state. There is already a lot of alignment among Firefox and Chrome while having a lot of unknowns in Safari. However while specifics can be up to the browser implementation, it seems we can agree on the following things to be very commonly used:
- binary path
- args: command line arguments to start browser
- prefs: browser specific configurations
@christian-bromann these have to do with launching the browser, which BiDi doesn't do itself, it's part of https://w3c.github.io/webdriver/. There's an example in https://w3c.github.io/webdriver/#example-5 for browser specific configuration, is the request to standardize that?
log level
@christian-bromann can you elaborate on this? Is it the browser's internal logging?
@etanol can you elaborate a bit on the upload scenario? Do you mean using WebDriver to select a file for <input type=file>? Or do you want something like network intercept and the ability to replace the data the browser would otherwise send? I'm not familiar with the chromedriver feature you're referring to.
@sadym-chromium found https://www.selenium.dev/documentation/webdriver/elements/file_upload/ which is how file upload works in Selenium. Send the name of the file to the <input type=file> as keyboard input :)
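For readers unfamiliar with that mechanism, the sketch below builds the WebDriver classic Element Send Keys request that carries the file path. It only constructs the request and does not send it; the base URL, session id, element id and file path are placeholders, and `build_send_keys_request` is a hypothetical helper name.

```python
import json


def build_send_keys_request(base_url: str, session_id: str,
                            element_id: str, file_path: str):
    """Build (method, url, body) for WebDriver's Element Send Keys command.

    For an <input type=file> element the text is treated as a file path,
    so the file must exist on the machine where the browser runs.
    """
    url = f"{base_url}/session/{session_id}/element/{element_id}/value"
    body = json.dumps({"text": file_path})
    return "POST", url, body


# Placeholder values for illustration only.
method, url, body = build_send_keys_request(
    "http://localhost:4444", "SESSION_ID", "ELEMENT_ID", "/tmp/report.pdf"
)
```

Because only a path travels over the wire, this approach assumes the client and the browser share a filesystem.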
these have to do with launching the browser, which BiDi doesn't do itself, it's part of https://w3c.github.io/webdriver/.
I was assuming that capability extensions defined in WebDriver would apply the same way to Bidi and its session.New command. Maybe I am confusing something here 🤔
There's an example in https://w3c.github.io/webdriver/#example-5 for browser specific configuration, is the request to standardize that?
Yes, it would be nice if browser binary launch args could be passed to the driver in a standardised way. With prefs I am not sure whether this can be done, as I am not aware of how browser preferences are managed across browsers. However, it would be nice to have some sort of control over that too.
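As an illustration of the current, non-standardised state, here is a sketch of how a client might build a New Session payload today using the vendor-prefixed capabilities `goog:chromeOptions` and `moz:firefoxOptions`, both of which accept `binary`, `args` and `prefs`. The helper name, the binary paths and the preference shown are made up for the example.

```python
def build_new_session_payload(browser: str, binary: str, args, prefs=None) -> dict:
    """Build a WebDriver New Session body with browser-specific launch options."""
    vendor_keys = {
        "chrome": "goog:chromeOptions",
        "firefox": "moz:firefoxOptions",
    }
    if browser not in vendor_keys:
        raise ValueError(f"no known vendor options key for {browser!r}")

    options = {"binary": binary, "args": list(args)}
    if prefs:
        # Preference names and semantics are entirely browser-specific.
        options["prefs"] = dict(prefs)

    return {
        "capabilities": {
            "alwaysMatch": {
                "browserName": browser,
                vendor_keys[browser]: options,
            }
        }
    }


# Illustrative values only; paths and prefs are placeholders.
payload = build_new_session_payload(
    "firefox", "/usr/bin/firefox", ["-headless"], {"dom.ipc.processCount": 4}
)
```

The same three concepts appear under a different key per browser, which is the ambiguity the request above is about.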
can you elaborate on this? Is it the browser's internal logging?
Yes, but in hindsight I am not sure anymore what valuable use cases I could get out of this. The only one I could think of is to provide more information when the browser crashes.
I was assuming that capability extensions defined in WebDriver would apply the same way to Bidi and its session.New command
I think that's "true", but there's a bit of confusion from the fact that WebDriver conflates two things:
For WebDriver classic that makes a certain amount of sense, and the *driver binaries provide the intermediate layer that handles process control (and also in practice convert the protocol into some internal format).
For BiDi there are more use cases that work without assuming the protocol needs to launch a browser instance (e.g. a profiling application that uses network events to log the traffic of a browser instance that's separately controlled by some other automation tool), and other protocols like CDP don't assume that the protocol itself is launching/configuring the browser (which obviously doesn't make sense for devtools).
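A minimal sketch of the profiling use case, assuming a session already exists and a WebSocket connection is open: the client only needs to send a `session.subscribe` command for the network events it wants to log. The builder class and function names are hypothetical, and the transport is omitted.

```python
import itertools
import json


class BidiCommandBuilder:
    """Builds WebDriver BiDi command messages with increasing ids."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)

    def command(self, method: str, params: dict) -> str:
        """Serialize one command as the JSON text sent over the WebSocket."""
        message = {"id": next(self._ids), "method": method, "params": params}
        return json.dumps(message)


def network_logging_subscription(builder: BidiCommandBuilder) -> str:
    """Subscribe to request/response events without launching anything."""
    return builder.command(
        "session.subscribe",
        {"events": ["network.beforeRequestSent", "network.responseCompleted"]},
    )
```

Nothing here launches or configures a browser, which is the point being made: this kind of client is useful even when process control belongs to someone else.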
I've filed https://github.com/w3c/webdriver-bidi/issues/447 with some more thoughts about initial configuration.
@etanol can you elaborate a bit on the upload scenario? Do you mean using WebDriver to select a file for <input type=file>? Or do you want something like network intercept and the ability to replace the data the browser would otherwise send? I'm not familiar with the chromedriver feature you're referring to.
I'm referring to the former. The W3C standard does not include mechanisms to attach files to web forms over the wire (and w3c/webdriver#1355 doesn't bring much hope). You can remotely fill a file form field with the file path, but it will only work if the file is located in the same filesystem as the browser.
Before the W3C specification, some drivers were capable of receiving the contents of a file to be attached to a form from the client (i.e. the process executing the end-to-end tests). This is still possible in chromedriver as a non-standard endpoint. And the Selenium Java client also carries over some baggage from the past (see the implementation, exposed as the LocalFileDetector).
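For context, here is a sketch of what sending file contents over the wire looks like on the client side: the file is zipped, base64-encoded and placed in a JSON body. The `{"file": ...}` key and the `/session/{session id}/se/file` path reflect my understanding of what Selenium clients use and should be treated as assumptions; neither is part of the W3C specification. The helper name is hypothetical.

```python
import base64
import io
import zipfile


def package_file_for_upload(name: str, content: bytes) -> dict:
    """Zip a single file in memory and wrap it as a base64 JSON payload.

    Assumption: the remote end accepts {"file": <base64 zip>} on a
    non-standard upload endpoint and returns the path where it stored
    the file, which can then be typed into the <input type=file>.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, content)
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return {"file": encoded}
```

The important property is that the client never needs access to the browser's filesystem; the bytes travel with the command.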
With browser farm providers, managing to place a file in the same filesystem the browser reads from is almost the same kind of problem that test writers face when having to wait for a file to be downloaded completely.
Hence my reference to #427: If there are improvement plans for file downloads, please don't forget about file uploads.
The Browser Testing and Tools Working Group just discussed "Roadmap".
Implemented.
As we are quite close to specifying all the scenarios from the roadmap, we have to extend it. These are scenarios we think make sense to work on:
Anything else we should consider adding there?