w3c / webdriver-bidi

Bidirectional WebDriver protocol for browser automation
https://w3c.github.io/webdriver-bidi/
363 stars 40 forks source link

Specify a network request interception feature #66

Closed jgraham closed 4 months ago

jgraham commented 3 years ago

A common use case for automation is to take a request that would be made e.g. to a third party API and provide a mock response for that reques, or to take a request that requires HTTP authentication and automatically provide that authentication. This is a feature that's exposed in Puppeteer and is in the works for Selenium.

The existing CDP API is the Fetch domain. Something similar has been added to WebKit. Gecko has similar features with a different API exposed to extensions.

Given that the CDP-ish model is already widely implemented in browsers and used in clients, it makes sense to spec something close to that. Practically this means:

Auth seems to be handled somewhat seperately, with the following:

css-meeting-bot commented 3 years ago

The Browser Testing and Tools Working Group just discussed Network interception.

The full IRC log of that discussion <AutomatedTester> Topic: Network interception
<simonstewart> Is there a link to the issue?
<AutomatedTester> jgraham: One of the common use cases for bidi from Selenium and we've seen it in puppeteer is the ability to see when a specific request happens and do a specific case
<AutomatedTester> ... such as basic auth
<AutomatedTester> ... or the othre request is network requests where we can say if X happens that we mock those network requests
<foolip> q+
<cb> q+
<AutomatedTester> ... I think this part of the "value add" feature sets
<AutomatedTester> ... I've been looking around at this and CDP has the Fetch domain
<AutomatedTester> ... [describes how CDP does this case]
<AutomatedTester> ... looking at WebKit it looks very similar
<AutomatedTester> ... Gecko has a very model in webextensions
<foolip> https://github.com/w3c/webdriver-bidi/issues/66
<simonstewart> q+
<AutomatedTester> Github topic: https://github.com/w3c/webdriver-bidi/issues/66
<AutomatedTester> ... so we have precendent in this area
<jgraham> a/very model/slightly different model/
<jgraham> s/very model/slightly different model/
<AutomatedTester> q?
<AutomatedTester> ack foolip
<jgraham> RRSAgent: make minutes v2
<RRSAgent> I have made the request to generate https://www.w3.org/2020/10/28-webdriver-minutes.html jgraham
<drousso> q+
<drousso> lol oops i used the wrong command :P
<AutomatedTester> foolip: is the model to both replace response without doing the response or can we edit and pass it back
<AutomatedTester> ... that's how I think CDP does this
<AutomatedTester> jgraham: I think there is a getResponse type command that you get it and then can edit it
<AutomatedTester> ... both seem like meaningful use cases
<AutomatedTester> q?
<AutomatedTester> ack cb
<AutomatedTester> cb: is the scope of this feature to capture all web resources?
<AutomatedTester> jgraham: I believe that the scope should be to all resource fetches
<brwalder> q+
<AutomatedTester> ... there is a question that I don't know the answer to which is how would service workers handle this
<AutomatedTester> q?
<AutomatedTester> ack simonstewart
<AutomatedTester> simonstewart: there are a lot of implementations for how intercepting of auth since there are cases like webauthn
<AutomatedTester> ... puppeteer doesn't seem to have a handle on this yet
<AutomatedTester> ... are we going to make it flexible enough
<AutomatedTester> jgraham: yes, those are important cases
<AutomatedTester> ... I am not sure they should be part of this API. I think they have already added webdriver features and they are best placed to handle this
<AutomatedTester> ... I've just looked and they have a VirtualAuthenticator for testing
<AutomatedTester> simonstewart: Ive seen that in the selenium code base
<AutomatedTester> q?
<AutomatedTester> ack drousso
<AutomatedTester> drousso: the reason we have added the support for things like this purely for webinspector, not for automation but can be used there
<AutomatedTester> foolip: can you replace request headers as you see fit?
<AutomatedTester> drousso: yes, and there are some pass throughs that means if people dont have send all the items
<AutomatedTester> q?
<AutomatedTester> ack brwalder
<AutomatedTester> brwalder: re: service workers. In CDP this is set at the network layer and would not work for service workers as that's different
<foolip> q+
<simonstewart> q?
<AutomatedTester> jgraham: that makes sense. I did have a question around filtering of this. How would we limit this to browsing context/realm?
<shengfa> There is also recent code merge in ChromeDriver for webauth(https://chromium-review.googlesource.com/c/chromium/src/+/2488183) with reference links in commit message.
<AutomatedTester> q?
<AutomatedTester> ack foolip
<AutomatedTester> foolip: a question for brwalder : Does this bypass the cache?
<AutomatedTester> brwalder: Enabling the fetch domain disables the cache
<AutomatedTester> simonstewart: that's not true you need to do it manually
<AutomatedTester> drousso: webkit turns it off
<AutomatedTester> foolip: so we need to be smart here when designing it
<AutomatedTester> simonstewart: I would like to discuss this more indepth
<AutomatedTester> ... I imagine people have in their minds what puppeteer do in the fetch domain
<AutomatedTester> ... are we wanting to have a similar model for the future
<AutomatedTester> ... I don't know how webkit do this
<AutomatedTester> drousso: I dont know CDP but will explain webkit
<AutomatedTester> .... webkit does something similar that it takes a pattern for an interception and at what stage
<AutomatedTester> ... this is only request or response currently
<brwalder> q+
<AutomatedTester> ... the engine will notice and emit an event to the front end
<AutomatedTester> ... requestContinue - carry on as normal
<AutomatedTester> ... [describes 3 other events]
<AutomatedTester> ... All of this par tof the network domain and not a separate domain
<AutomatedTester> jgraham: I don't think we should have a spralling mass of modukes
<AutomatedTester> ack brwalder
<AutomatedTester> brwalder: that sounds more or less like CDP Fetch
<AutomatedTester> ... and has similar workflow
<AutomatedTester> ... and if you want the cache disabled you have to do it manually
<AutomatedTester> ... and the same for service workers
<AutomatedTester> q?
<AutomatedTester> simonstewart: So my understanding is that there will be a command that takes a regex and then emits events and you can manipulate it as you see fit
<AutomatedTester> ... and when you get an authentication request you will intercept and insert credentials or we can deny it
<AutomatedTester> jgraham: I agree
<AutomatedTester> simonstewart: and one thing more that we implicitly disable the cache
<AutomatedTester> drousso: both chrome and webkit disable the cache if requested
<AutomatedTester> ... but when we are going to override anything we make sure that automatically disable the cache instead
<AutomatedTester> jgraham: I think we need to be clear in the spec and it seems webkit are doing the right thing, especially for automation
<AutomatedTester> ... I cant see a use case for the CDP way other thank creating flaky tests
<jgraham> s/doing the right thing/have the correct cache-disabling behaviour/
<jgraham> RRSAgent: make logs public
<RRSAgent> I have made the request, jgraham
<jgraham> RRSAgent: make minutes v2
<RRSAgent> I have made the request to generate https://www.w3.org/2020/10/28-webdriver-minutes.html jgraham
<AutomatedTester> Resolution: Create a command that takes a regex and then emits events and users can manipulate requests and responses based on the events sent. When an authentication request is intercepted we can insert credentials or deny them
css-meeting-bot commented 2 years ago

The Browser Testing and Tools Working Group just discussed Network request interception.

The full IRC log of that discussion <AutomatedTester> topic: Network request interception
<AutomatedTester> github: https://github.com/w3c/webdriver-bidi/issues/66
<AutomatedTester> jgraham (IRC): This is a more speculative item
<AutomatedTester> ... and I want to start work on the design so we can get it a little done over the next year
<AutomatedTester> ... we have talked about network logging to get some info
<AutomatedTester> ... but sometimes people want to intercept, modify and carry that on
<AutomatedTester> ... people want to intercept a request and send back a mocked response
<AutomatedTester> ... so I think this is a feature we should have
<AutomatedTester> q+
<AutomatedTester> ... so one thing here is that network interception can put a lot of pressure on the client to handle each
<sadym> q+
<AutomatedTester> ... and since some people test this locally and that's fine but we do have selenium users have this over the internet
<AutomatedTester> ... the proposal is there is an event for browsing context/realms and with some form of filter
<AutomatedTester> ... we have network events
<AutomatedTester> ... do we want separate events or use the logging and then return
<AutomatedTester> q?
<jgraham> AutomatedTester: My experience is that people using this feature in Selenium are confused about how this should/will work. For e2e testing this can be complex compared to lower level testing. Need to make it easier to use.
<jgraham> ... To the point about this being deployed over the internet, we can be talking about hundreds of requests that need to be intercepted. So prefiltering of events is a requirement.
<DevinRousso> q+
<jgraham> ... I'm concerned that this could cause a problematic number of events for Selenium in the cloud providers
<AutomatedTester> ack next
<AutomatedTester> ack next
<AutomatedTester> sadym (IRC): first question is add some kind of binding to the realm
<AutomatedTester> ... and try have this done in the browser and then return
<sadym> https://chromedevtools.github.io/devtools-protocol/tot/Network/#type-RequestPattern
<AutomatedTester> ... and then AutomatedTester_ (IRC) do you think that URL, resource type and interception stage is enough/
<jgraham> AutomatedTester: Yes, that seems fine. To the point about doing this on the remote end, if we can figure it out I'd prefer that kind of solution. Pushing everything onto the client is a problem if things are running in parallel; can end up with a huge number of simultaneous requests. Server side interception would fix that. But the filters are a good start.
<AutomatedTester> ack next
<AutomatedTester> Devin Rousso: within web inspector we have filters and it can even be at filtering on request/response
<AutomatedTester> ... webinspector doesnt have the resouirce type filtering
<AutomatedTester> sadym (IRC): it's useful for changing cats to dogs on a page
<AutomatedTester> jgraham (IRC): I can see resource filtering for all images then go for there
<AutomatedTester> Devin Rousso: in web inspector we only allow filtering on EITHER request or response...
<AutomatedTester> ... our reason is if you control one or the other then it's easier
<AutomatedTester> ... we still send out the requests in case the server requires it
<jgraham> q?
<AutomatedTester> ... it saves us trouble
<AutomatedTester> jgraham (IRC): web extensions has a similar aPI here
<AutomatedTester> ... does your api have overlap with web extensions
<AutomatedTester> Devin Rousso: there is some but I don't know if they have full overlap
<AutomatedTester> ... I don't have much knowledge of manifest v3
<AutomatedTester> Luka: in chrome they have deprecated the interception API
<AutomatedTester> jgraham (IRC): the stream API gives us what we need and we can manipulate it
<AutomatedTester> luka: [mentions how this works with streams]
<AutomatedTester> jgraham (IRC): [describes how things work in CDP]
<AutomatedTester> ... the main question is Do we start with how CDP works or how sadym (IRC) suggested that it happens in a sandbox specific way
<AutomatedTester> ... and I have a concern about how we this might work as we haven't got the primatives to make it work
<AutomatedTester> ... and we want to make sure we support clients that already here
<AutomatedTester> ... and it would be good to make sure we check how other groups are using it and then see if we can support them
<sadym> q+
<BrandonWalderman> q+
<AutomatedTester> AutomatedTester_ (IRC): I would vote for sadym (IRC) 's idea of putting this more onto the server
<BrandonWalderman> q-
<AutomatedTester> ack next
<AutomatedTester> sadym (IRC): i want to point out that it adds complexity to the server
<AutomatedTester> ... so it's harder to implement and spec out
<jgraham> `getResponseBody` is used in Puppeteer/Playwright but `takeResponseBodyAsStream` is not.
<jgraham> q+
<BrandonWalderman> q+
<AutomatedTester> automatedtester: My preference would be to add more complexity to this working group so that people can use this feature in a meaningful way rather than simplifying things for the sake of the spec and putting a lot of pressure on client authors and users and then no one uses the feature
<AutomatedTester> ack next
<AutomatedTester> jgraham (IRC): I am not convinced for clients that injecting Js would be easier for them
<AutomatedTester> ... there are perf reasons for injecting it all
<AutomatedTester> ... so for the first iteration that we all people to replace the responses but not modify on the fly
<AutomatedTester> q?
<AutomatedTester> ack next
<jgraham> s/all/allow/
<AutomatedTester> Brandon Walderman: I do see the value of pushing things on to the server side for perf and ease of client creation
<AutomatedTester> ... but I am hearing that there is concern of doing it in a sandbox
<AutomatedTester> ... I wonder if there is a middle ground here where we can do some form of [explains data structures]
<AutomatedTester> ... that way we can block things easily
<AutomatedTester> ... and if you want more advanced cases then we can do more CDP
<jgraham> q+
<AutomatedTester> ... and we can add the sandbox solution in the future
<AutomatedTester> ack next
<sadym> q+
<AutomatedTester> jgraham (IRC): I think it will add more work
<sadym> q-
<sadym> +1 to what James said
<BrandonWalderman> +1
<AutomatedTester> ... we can add a more declaritive later
<AutomatedTester> ... and that way we don't need to have a more complex JS execution
<AutomatedTester> ... it would be good to get a set of data from cloud providers here
<AutomatedTester> ... but build the low level APi and we can build a higher level API later
<AutomatedTester> q?
bobajeff commented 1 year ago

I'm interested in having the this feature being spec'd. As Webdriver doesn't currently have an equivelant to Network.getResponseBody. Which is important to me because I would like to use the feature in Firefox but it looks like their CDP implementation is maintenance mode in favor Webdriver bidi related APIs. Until then only a fork of firefox shipped with Playwright can get response body. Which I'm uncomfortable depending on.

jgraham commented 1 year ago

I've started a (thus far incomplete) draft of this feature at https://github.com/w3c/webdriver-bidi/compare/network_interception

The design at this point is as follows:

OrKoN commented 1 year ago

I have double checked the current implementation in Puppeteer and here are some thoughts:

1) it would be great to allow intercepting requests for a context and related contexts (e.g., an entire tab): seems to be not in the current draft? 2) Puppeteer only supports "modifying the request headers and body, setting the full response headers / body" without "changing the response headers" or "modifying the response body" parts. CDP supports the latter two as well. 3) Having multiple commands (network.continueRequest, network.fulfillRequest and network.failRequest) I think is more clear but combining them into one command would also work. 4) WDYT about isPaused or isIntercepted instead of isBlocked for the indication of the interception? we could also rely on the presence/absence of intercepts, I assume?

css-meeting-bot commented 1 year ago

The Browser Testing and Tools Working Group just discussed Network request interception.

The full IRC log of that discussion <jgraham_> Topic: Network request interception
<jgraham_> github: https://github.com/w3c/webdriver-bidi/issues/66
<jgraham_> jgraham: (tries to summarise what's in the comment at https://github.com/w3c/webdriver-bidi/issues/66#issuecomment-1542080588)
<jgraham_> q?
<simons> q+
<jgraham_> ack simons
<orkon> q+
<jgraham_> simons: We will need to set the response body quite often, and we'll need to block outgoing requests. I see you say that changing the response might be difficult, but we need that option.
<jgraham_> simons: Notably the way people use this is to mock out backend infrastructure and provide canned responses without having to have the backend running.
<jgraham_> simons: The CDP model feels OK with continueRequest and fulfillRequest; it's not too hard to use.
<jgraham_> jgraham: Replacing the network response is supported, but what's not supported at the moment is editing a response from the network as it arrives
<jgraham_> simons: People do sometimes want to do that e.g. for record/replay
<jgraham_> jgraham: You could also have a way to just get the entire body at the end.
<jgraham_> simons: Yes. We could add support for modifying the response later
<jgraham_> ack orkon
<jgraham_> orkon: CDP allows fetching the response and providing a different response. I agree we could start without that and see how much interest there is at a later point.
<jgraham_> orkon: Are interceptions per context or global?
<jgraham_> jgraham: From an API point of view it seems to be easy to limit to a specific context
<jgraham_> orkon: What about service workers or iframes? In CDP it's difficult to configure interceptions for new targets. Would be good to specify it per tab.
<jgraham_> jgraham: We have a similar problem for events, we could reuse that mechanism
<jgraham_> orkon: It can be difficult with iframes. We need some flexibility
<jgraham_> RRSAgent: make minutes
<RRSAgent> I have made the request to generate https://www.w3.org/2023/05/10-webdriver-minutes.html jgraham_
jgraham commented 1 year ago

https://github.com/w3c/webdriver-bidi/pull/429

christian-bromann commented 6 months ago

Has there been a consideration of enhancing this feature to allow intercepting WebSocket messages?

whimboo commented 4 months ago

Has there been a consideration of enhancing this feature to allow intercepting WebSocket messages?

Would you mind filing this as a new enhancement issue request?

This issue should basically be done, or is there something missing @jgraham?

whimboo commented 4 months ago

Going to mark this issue as done based on the merge of #429 a while ago.