w3c / webdriver-bidi

Bidirectional WebDriver protocol for browser automation
https://w3c.github.io/webdriver-bidi/
378 stars 42 forks source link

Clarify network events received for authentication attempts #722

Open juliandescottes opened 6 months ago

juliandescottes commented 6 months ago

Assuming the client listens to network events, consider a basic http authentication flow where the user first performs a request and then provides credentials (either manually via the prompt, or using interception+continueWithAuth).

At the moment in Firefox the user will receive the following events:

So we are getting two sets of network events, one for the original request and one for the authentication request. For now in Firefox those 2 requests have the exact same id.

Looking at the fetch spec, the authentication request is triggered from:

Set response to the result of running HTTP-network-or-cache fetch given fetchParams and true.

(step 14.4 of https://fetch.spec.whatwg.org/#http-network-or-cache-fetch)

So spec wise we are actually reusing the same request object, and we would get beforeRequestSent/responseStarted again for the same request id as the original request. And still according to the spec we should only get a single responseCompleted event once the authentication is done.

Are we happy with that model, where we will get N beforeRequestSent/responseStarted/authRequired events for a single responseCompleted and a single request id? Or do we want to update the spec to have unique request ids and/or individual responseCompleted events.

@OrKoN Do you know how Chrome behaves in this scenario and which events it will emit?

annevk commented 5 months ago

I was wondering about this in the context of redirects as well, where it seems like the same thing will happen?

And this might also happen with CORS preflights. I'm not sure if that scenario is tested?

OrKoN commented 5 months ago

Chrome currently handles the events in the following way (when the authRequired is not subscribed to, and dialog is shown to the user):

And when the authentication is automated via interception (subscribed to the auth request with the interception):

I will need to also check the case when the event is subscribed to but the interception is off.

jgraham commented 5 months ago

I was wondering about this in the context of redirects as well

For redirects this is intentional; the idea is that request id refers to the entire redirect chain and we keep a redirect count to distinguish specific individual requests.

Since CORS preflight creates a totally new request object I think it will look like a different, unrelated request, which is kind of the opposite problem to this one. The problem with HTTP Auth is that in fetch we reuse the same request object for both the initial request and the request with credentials.

annevk commented 5 months ago

Hmm so when you're talking about "request objects" it seems like you're talking about the request concept in Fetch, which is mostly just a specification construct. And whether we create a new one or reuse one is mostly an artifact of what seemed convenient and not necessarily the best API. (Now perhaps it is the best API, but wanted to make sure you realize that.)

juliandescottes commented 5 months ago

Right we've been relying on the fetch request concept to decide whether or not we should create new requests / request ids. If this concept is not really something stable we can use as an API, then we either need to make it clear in fetch or to use something else?

So far it has worked pretty well, but as we see here we have scenarios where some requests will not exactly follow the standard network event lifecycle: beforeRequestSend -> responseStarted -> authRequired (optional) -> responseCompleted or fetchError

That being said we have not really decided if this should really be identical for all network events or if there could be exceptions.

From a BiDi user perspective, it feels that dealing with several network events with the same name for the same request id would be complicated. If we don't want to use another id, maybe we should do the same as for redirects and introduce an authenticationAttempt counter?

annevk commented 5 months ago

I think I would prefer you all to figure out when you want new request IDs independently from the request concept. Once we land the request IDs in Fetch we'll obviously keep them stable from then on out.

If request IDs survive redirects, I think it would make sense for them to survive CORS preflight and HTTP authentication as well. Those are all "additional requests" somewhat internal to the networking layer. Having said that, I'm not familiar with all the use cases so perhaps there are good reasons to deviate here and there.

juliandescottes commented 4 months ago

One thing to note, as I was trying to generate different ids for each authentication attempt: this leads to bad behaviors with puppeteer's page.authenticate when used with wrong credentials.

For instance the test [network.spec] network Page.authenticate should allow disable authentication tries to check that a request with wrong credentials ends up in a 401. However the way page.authenticate works in puppeteer is that you configure a username / password for a page, and all network.authRequired events for that page will then lead puppeteer to automatically send a network.continueWithAuth command with the page's username/password. And when I start generating different ids for each authentication attempt, puppeteer will keep sending continueWithAuth commands forever.

Maybe that's another good reason for sticking with the same request id and have a counter on the request.

jgraham commented 4 months ago

My main concern with having a counter is that we're making the actual key you have to construct to match requests and responses rather complex, like (request id, redirect count, auth attempt count, is cors preflight), and perhaps more in the future if we add more cases where a single network request generates multiple on-the-wire requests (as we did with CORS).

I slightly wonder if we want to add that data so that you can track the reason for this request, but also have an id that is unique for a single on-the-wire request/response, rather than an entire sequence of requests.

juliandescottes commented 4 months ago

I agree with the concern about the composite key issue. For clients such as puppeteer, I think having a counter is sufficient to stop doing unnecessary attempts (eg if authenticationAttempt > 1 on the authRequired event, do not continueWithAuth, because it already failed).

css-meeting-bot commented 4 months ago

The Browser Testing and Tools Working Group just discussed Network request ids for auth + CORS preflight, and agreed to the following:

The full IRC log of that discussion <AutomatedTester> topic: Network request ids for auth + CORS preflight
<AutomatedTester> github: https://github.com/w3c/webdriver-bidi/issues/722
<AutomatedTester> jgraham: for netwroking intercepts and logging. In the case of redirects we would have a reuqest id that was the same for the whole chain or requests
<AutomatedTester> ... what we didn't consider was http auth and cors preflight
<AutomatedTester> ... for cors preflight it looks like a separate reuqest
<AutomatedTester> ... for http auth it looks like the original requests
<AutomatedTester> .... the spec handles this in 3 different ways in the spec
<AutomatedTester> ... this has been discussed in the issue. It is unclear what we should do next?
<AutomatedTester> ... <explains different ways with request ids, added fields on the objects>
<orkon> q+
<AutomatedTester> ... I would like to know if people have opinions on what approach and what invariants we want to support
<AutomatedTester> ack next
<AutomatedTester> orkon: I haven't looked into the issue before the meeting. In the puppeteer we have always considered cors to have separate ids
<AutomatedTester> ... if there is auth we have and the dialog is handled and the same id is in the events
<AutomatedTester> ... if there is an error we expect a new request id
<AutomatedTester> ... looking at puppeteer we do a mix of wha tthe spec has
<AutomatedTester> jgraham: for cors request preflight can be handled as a special case
<AutomatedTester> ... for auth we should treat it like a redirect
<AutomatedTester> ... which appears to be different to what puppeteer does with cdp. I think jdescottes proposal seems to handle this scenario
<orkon> q+
<AutomatedTester> ... our concern is if there is a auth and the credentials are wrong and it gets itnto a loop doing that over and over
<AutomatedTester> ack next
<AutomatedTester> orkon: if we add the auth count is it for per redirect or for the whole chain?
<AutomatedTester> ... if we add a parameter that could cause problems down the line
<AutomatedTester> jgraham: I agree a multipart key makes it more complex
<AutomatedTester> ... and if chain has the count can handle the count
<AutomatedTester> q?
<jdescottes> q+
<AutomatedTester> ack next
<AutomatedTester> jdescottes: I was thinking of another design
<AutomatedTester> ... one thing we can do is always have a unique id
<AutomatedTester> ... and we could have a parent that knows where it could be from and where it is going. E.g. parent for redirect or parent for auth
<AutomatedTester> jgraham: if we do that it creates a new set of invariants and breaks clients
<AutomatedTester> ACTION: Please read the issue and leave comments (all)