w3c / webdriver

Remote control interface that enables introspection and control of user agents.
https://w3c.github.io/webdriver/

Machine readable endpoint definitions #1510

Open foolip opened 4 years ago

foolip commented 4 years ago

Machine-readable definitions in specs (or out-of-band) make it possible for tools to use that information for a variety of purposes:

Possible additional benefits depending on the solution:

Previous discussion is in https://github.com/w3c/webdriver/issues/1462#issuecomment-634663352:

@christian-bromann:

@foolip it would be great if this protocol could be machine readable as it would simplify the creation of rudimentary bindings that can be used to build a framework with higher level abstractions as well as for documentation. For example: the Chrome DevTools protocol is maintained as a pure JSON file and is used to build its documentation page as well as CDP modules like chrome-remote-interface. This can save a lot of time and duplication for adopters building packages around the protocol.

Another example is from the WebdriverIO project where I've defined a custom JSON format of the WebDriver spec that is used to autogenerate a minimal functional binding. While I think a spec text is not necessarily suited as user documentation, being able to do type checks, especially in the JS ecosystem, would be very beneficial.
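To make the idea of generating a minimal binding from a machine-readable definition concrete, here is a rough sketch. The table layout (`command`, `method`, `uri`) and the `build_request` helper are assumptions made for illustration; they are not the WebdriverIO format or anything defined by the spec. Only the four endpoints themselves are taken from the WebDriver specification.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical machine-readable endpoint table. The field names are an
# assumption for this sketch; the methods and URI templates are the ones the
# WebDriver spec lists for these four commands.
ENDPOINTS = [
    {"command": "New Session", "method": "POST", "uri": "/session"},
    {"command": "Navigate To", "method": "POST", "uri": "/session/{session id}/url"},
    {"command": "Get Current URL", "method": "GET", "uri": "/session/{session id}/url"},
    {"command": "Delete Session", "method": "DELETE", "uri": "/session/{session id}"},
]


@dataclass
class Request:
    method: str
    path: str
    body: Optional[dict]


def build_request(
    command: str,
    url_variables: Optional[Dict[str, str]] = None,
    parameters: Optional[dict] = None,
) -> Request:
    """Turn a command name plus arguments into an HTTP request description."""
    endpoint = next((e for e in ENDPOINTS if e["command"] == command), None)
    if endpoint is None:
        raise KeyError(f"unknown command: {command}")

    # Fill in the URI template, e.g. "{session id}" -> "abc".
    path = endpoint["uri"]
    for name, value in (url_variables or {}).items():
        path = path.replace("{" + name + "}", value)
    if "{" in path:
        raise ValueError(f"missing URL variable in {path}")

    # Only POST commands carry a JSON body in this sketch.
    body = (parameters or {}) if endpoint["method"] == "POST" else None
    return Request(endpoint["method"], path, body)
```

A client library could wrap `build_request` with an HTTP call and add the higher-level abstractions on top; the sketch deliberately stops before any network access.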

A while ago I tried to convert the current WebDriver protocol into an OpenAPI spec and was able to generate a page like this: https://webdriver.github.io/webdriver/. I can imagine that it should not be that difficult to build a Bikeshed preprocessor that reads an OpenAPI spec and converts it into a spec document.

I would be happy to help contribute and drive the effort for a machine readable spec if you all think it is worthwhile.

@bwalderman:

I played around with OpenRPC when writing the explainer. There's an example here. I ran into some minor issues. The spec supports only one-way calls (no concept of bidi) so I ended up using tags to mark things as either "events" or "commands". The tooling support also seems pretty limited. There's a code gen tool but this supports only TypeScript and Rust at the moment.
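The tagging workaround described above might look roughly like the following sketch. The method names are made up, and the entries are simplified to plain dictionaries with string tags; this is not the actual OpenRPC document layout, only an illustration of separating "commands" from "events" when the format itself has a single kind of method.

```python
from typing import Dict, List, Tuple

# Hypothetical method list. Names and structure are assumptions for this
# sketch and are not taken from the explainer or from OpenRPC.
METHODS = [
    {"name": "example.navigate", "tags": ["command"]},
    {"name": "example.getTitle", "tags": ["command"]},
    {"name": "example.pageLoaded", "tags": ["event"]},
]


def split_by_direction(methods: List[Dict]) -> Tuple[List[str], List[str]]:
    """Partition method names into (commands, events) based on their tags."""
    commands: List[str] = []
    events: List[str] = []
    for method in methods:
        tags = set(method.get("tags", []))
        is_command = "command" in tags
        is_event = "event" in tags
        # Each method must be exactly one of the two kinds.
        if is_command == is_event:
            raise ValueError(f"ambiguous or missing direction tag: {method['name']}")
        (commands if is_command else events).append(method["name"])
    return commands, events
```

The direction lives in a convention on top of the format rather than in the format itself, which is the limitation the comment points out.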

AsyncAPI looks promising instead. It's more generic and can express both HTTP and WebSocket transports in the same file. It also has proper support for bidi communication through "channels".

Also briefly discussed in the 27 May 2020 meeting:

@christian-bromann: it would be good if we can have the definitions of API to be machine readable ... I would like to maintain an OpenAPI version of the spec at least

@bwalderman: having a machine readable version of the spec is great, I am not familiar with OpenAPI

@jgraham: the machine readable version should be normative items to simplify things

@foolip: being able to extract this is great, but we can go patch BS if we need to, Tab will likely handle it

On that very last line, I'd like to clarify that I meant that @tabatkins would likely accept patches, not that he'd do all the work for us :)

foolip commented 4 years ago

I've filed https://github.com/w3c/webdriver-bidi/issues/21 for the BiDi side. The two specs have been discussed together at certain points, including my summary of this issue, but I'm hoping that splitting the issues will help.

jgraham commented 4 years ago

All of these API definition formats seem to use JSON Schema for the actual definitions. I'm not convinced that we really care about the value add of the additional layers on top of that; from a skim it looks like the additional features are about service discovery and licensing, which I don't think we particularly care about. In particular I see the following as use cases for machine-readable definitions in the spec:

I see the following as non-goals:

So I don't think we want endpoints that produce schema documents to allow clients to introspect the API or anything; in practice all the WebDriver and CDP clients are providing significant value-add over the mechanical conversion of protocol endpoints into code, and in any case updates to the spec will be accompanied by updates to the published schema, so we don't also need to allow introspection.

Given that, I think we should just write JSON Schema directly and not try to adopt any of the higher layer stuff like [Async|Open]API which afaict are mostly addressing needs we don't have.
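As an illustration of what "JSON Schema directly" could mean in practice, the sketch below describes the parameters of the Navigate To command (a single `url` string, as in the WebDriver spec) and checks a payload against it. The schema itself and the `validate` helper are assumptions for this example; the helper understands only `type`, `properties` and `required`, and a real setup would use a full JSON Schema validator instead.

```python
from typing import Any, List

# Hypothetical schema for the parameters of "Navigate To". Not taken from
# any published schema; written only for this sketch.
NAVIGATE_PARAMETERS_SCHEMA = {
    "type": "object",
    "properties": {"url": {"type": "string"}},
    "required": ["url"],
}

# Mapping for the small subset of JSON Schema types handled here.
_TYPES = {"object": dict, "string": str, "array": list, "boolean": bool}


def validate(value: Any, schema: dict) -> List[str]:
    """Return a list of error messages; an empty list means the value is valid."""
    expected = schema.get("type")
    if expected is not None and not isinstance(value, _TYPES[expected]):
        return [f"expected {expected}"]

    errors: List[str] = []
    if isinstance(value, dict):
        # Report missing required properties first.
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"missing required property: {key}")
        # Then recurse into the properties that are present.
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(f"{key}: {e}" for e in validate(value[key], subschema))
    return errors
```

For example, `validate({"url": "https://example.org/"}, NAVIGATE_PARAMETERS_SCHEMA)` returns an empty list, while `validate({}, NAVIGATE_PARAMETERS_SCHEMA)` reports the missing `url` property.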

foolip commented 4 years ago

I agree on introspection, or doing something useful "without specific understanding of the protocol semantics", not being goals, at least not for me.

I guess there must be a bunch of formalisms already for defining REST APIs, but I don't know if there's anything suitable for inlining in a spec, and certainly nothing that already works in Bikeshed or ReSpec. I think the minimum level of formalism for endpoints to be useful would be:

Stuff that's useful for clients of the API I have much less experience with.

Some of this will probably involve JSON, and https://github.com/whatwg/infra/issues/159 is relevant there.

jgraham commented 4 years ago

Ah, I was mostly talking about the bidi case; I got the wrong version of this issue :) I'll copy my comment over. I'm somewhat less worried about the HTTP case if only because I think it's less of a priority to change in the short term than it is to do work on BiDi. Once we have the experience of doing the JSON parts for BiDi we can move the HTTP protocol to that, and then find something that allows us to redefine the endpoints. But making such a change normative is almost a complete rewrite of the spec, so probably not sensible until such a time as we want to change the HTTP WebDriver to formally use the BiDi protocol internally.

foolip commented 4 years ago

Yep, I also see it as a much higher priority to have some formalisms for BiDi.

praveendvd commented 3 years ago

Yes, this will be really useful for collaboration and for understanding the W3C endpoints.