w3c / webdriver-bidi

Bidirectional WebDriver protocol for browser automation
https://w3c.github.io/webdriver-bidi/
375 stars 42 forks source link

session.Status command doesn't really make sense #46

Closed jgraham closed 2 years ago

jgraham commented 4 years ago

The way the spec is set up at the moment, it's not possible to have a WebSocket connection without a session. So the session status command doesn't make sense because it's always going to say yes there's a session.

For this to make sense there needs to be a way to start the websocket listener without starting a WebDriver/HTTP session.

foolip commented 3 years ago

@sadym-chromium and I noticed this yesterday as well when walking though https://w3c.github.io/webdriver-bidi/#commands-status and https://w3c.github.io/webdriver/#dfn-readiness-state (which isn't linked, oops).

It's defined as:

The readiness state of a remote end indicates whether it is free to accept new connections. It must be false if the maximum active sessions is equal to the length of the list of active sessions, or if the node is an intermediary node and is known to be in a state in which attempting to create new sessions would fail. In all other cases it must be true.

But this is talking about readiness for a POST /session HTTP request, and is sort of irrelevant within a single WebDriver BiDi connection.

Should we just remove this command, since it was mostly added to prove out a command? Or do we want a no-op "ping" message in BiDi, to keep the connection alive or something? (This already exists at probably multiple of the underlying layers, but is arguably harder to use since it punches through layers of abstraction.)

sadym-chromium commented 3 years ago

The session.status command can be used as an e2e ping. Meaning it returns true, if the the session has an active connection to the browser.

Currently we have 2 connections: client to WebDriverBiDiServer, and WebDriverBiDiServer to the browser itself. Having client <-> WebDriverBiDiServer connection doesn't mean the connection WebDriverBiDiServer <-> browser is healthy.

We can implement 2 approaches:

  1. Make the session.status command return the state of the WebDriverBiDiServer <-> browser connection. And send a special event in case of the connection to the browser is dropped.
  2. Drop client <-> WebDriverBiDiServer connection in case of the WebDriverBiDiServer <-> browser connection dropped.

I would recommend to use the first option with session.status command, as soon as it keeps the workflow more consistent.

Here is a prototype of the session.status command: processSessionStatus(...)

The documentation needs to be adjusted accordingly after the decision is made.

foolip commented 3 years ago

Treating session.status as an e2e ping makes sense to me, since it's not possible to use the WebSocket ping for this, or at least it would require unconventional custom handling of ping frames.

Regarding when (if ever) session.status should return false, and whether there should be a session.statusChange or similar event, one principle I'm thinking of is making stuff between the automation client and the browser as transparent as possible, so that both of these are possible future implementation choices:

+--------+   +---------------------+   +---------+
| client |<->| bidi-capable server |<->| browser |
+--------+   +---------------------+   +---------+

+--------+   +----------------------+
| client |<->| bidi-capable browser |
+--------+   +----------------------+

Note that this isn't a design principle that's been explicitly discussed and it might be flawed, but in this case it would be an argument for the second option, "Drop client <-> WebDriverBiDiServer connection in case of the WebDriverBiDiServer <-> browser connection dropped."

@sadym-chromium WDYT?

foolip commented 3 years ago

There are some related questions here which would be great to answer:

Even if the answer is that a session is by definition always ready while the connection is open, an e2e ping still seems to serve some practical utility, so I'm not argument that we should remove it. But maybe it should be defined to always return true if we can't define the case where it would not be ready.

jgraham commented 3 years ago

I don't think that there's a requirement that the connection has multiple hops (I don't expect the gecko implementation to have multiple hops in the default case). There may also be middleware that adds additional hops to the connection.

I think in general if one part of the the route drops the connection it makes sense for the connection to be dropped everywhere. Having an event corresponding to that seems troublesome because it can't be reliably delivered (consider the case where the node closest to the client crashes).

My opinion is we ought to drop this endpoint and if someone wants to resurrect it propose it as a new feature rather than working from a position where it already exists and we hunt around looking for a use case.

sadym-chromium commented 3 years ago

The main concern in dropping connection is that it can be harder to debug the crash reason. But the message with the crash details sent to client before dropping the connection solves the debugging problem, and makes the protocol more logical: No possibility to work with browser - no connection.

So I'd go for an option:

In case of browser crashes, (send a debug message if possible) and drop the connection.

foolip commented 3 years ago

I don't think that there's a requirement that the connection has multiple hops (I don't expect the gecko implementation to have multiple hops in the default case). There may also be middleware that adds additional hops to the connection.

@jgraham glad to hear it, I've filed https://github.com/w3c/webdriver-bidi/issues/78 for that and some other things we may end up agreeing on :)

@sadym-chromium dealing with crashes is currently pretty special in testing setups, I think basically you have to launch the binary yourself and monitor its exit code and read its stdout/stderr, and that none of this is standardized.

For the case that the code implementing BiDi itself has crashed there's no possibility of the protocol helping, but other cases like renderer crashes, perhaps this would make sense. Can you file an issue about dealing with crashes?

foolip commented 3 years ago

Regarding the fate of session.status, the use case of an e2e ping seems valid to me, but I'm not sure how strong the need is or what will eventually break if we don't have it. Perhaps omitting it for now and considering ping as a separate command makes sense? Hopefully it will not be long before we have another command we can use for testing :)

whimboo commented 2 years ago

For this to make sense there needs to be a way to start the websocket listener without starting a WebDriver/HTTP session.

This is now possible with the session.new command so that WebDriver BiDi only clients can directly create a new session without having to take the HTTP route. So I assume we can re-think the usefulness and keep this command?

jgraham commented 2 years ago

I think the problem now is that "readiness state" isn't really defined (maybe it's just supposed to be a link to https://w3c.github.io/webdriver/#dfn-readiness-state ) I think we should fix that and close this issue.

whimboo commented 2 years ago

PR #163 has been merged. As @jgraham said lets close this issue.