clj-commons / etaoin

Pure Clojure Webdriver protocol implementation
https://cljdoc.org/d/etaoin
Eclipse Public License 1.0

Do we need to move to newer capabilities settings? #467

Closed lread closed 2 months ago

lread commented 2 years ago

Currently

While taking a peek at WebDriver logs, I noticed Firefox's geckodriver emitting:

1656188404728   webdriver::command  WARN    You are using deprecated legacy session negotiation patterns (desiredCapabilities/requiredCapabilities), see https://developer.mozilla.org/en-US/docs/Web/WebDriver/Capabilities#Legacy

So maybe...

We should have a look-see and see if we should be specifying capabilities in a different way these days.

Action

I'll eventually follow up.

lread commented 1 year ago

Just noticed this from ChromeDriver when looking at #519:

17:36:16.436 WARN [ProtocolHandshake.createSession] - Support for Legacy Capabilities is deprecated; You are sending the following invalid capabilities: [chromeOptions]; Please update to W3C Syntax: https://www.selenium.dev/blog/2022/legacy-protocol-support/

So that's drivers for both Firefox and Chrome emitting warnings.

I'd have to check, but the Edge driver probably does the same.

But the Safari driver is a bit of an oddball; before we update, I'd have to check we aren't breaking it.

lread commented 4 months ago

Here's a pretty good explanation of the current syntax: https://github.com/w3c/webdriver/issues/1215#issuecomment-362266991

If I understand correctly, we could probably auto-convert the legacy requiredCapabilities + desiredCapabilities to the current syntax.

Using edn and pseudo-capabilities:

{:requiredCapabilities {:some-capability :must-be-true
                        :some-config :here}
 :desiredCapabilities {:some-other-capability :should-be-preferred}}

Could become:

{:capabilities
  {:alwaysMatch {:some-capability :must-be-true
                 :some-config :here}
   :firstMatch [{:some-other-capability :should-be-preferred}]}}
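
A minimal sketch of what that auto-conversion could look like, assuming the legacy maps arrive under the :requiredCapabilities and :desiredCapabilities keys. The function name legacy->w3c is hypothetical and not part of Etaoin:

```clojure
;; Hypothetical helper, not an Etaoin API.
;; Assumes legacy capabilities arrive under :requiredCapabilities
;; and :desiredCapabilities, either of which may be absent.
(defn legacy->w3c
  [{:keys [requiredCapabilities desiredCapabilities]}]
  {:capabilities
   (cond-> {}
     ;; required capabilities must always match
     (seq requiredCapabilities) (assoc :alwaysMatch requiredCapabilities)
     ;; desired capabilities become a single firstMatch candidate
     (seq desiredCapabilities)  (assoc :firstMatch [desiredCapabilities]))})

(legacy->w3c
 {:requiredCapabilities {:some-capability :must-be-true
                         :some-config :here}
  :desiredCapabilities {:some-other-capability :should-be-preferred}})
;; => {:capabilities
;;     {:alwaysMatch {:some-capability :must-be-true, :some-config :here}
;;      :firstMatch [{:some-other-capability :should-be-preferred}]}}
```

Empty or missing legacy maps are simply left out of the result, so a call with no capabilities yields {:capabilities {}}.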

lread commented 4 months ago

The current state of Etaoin is:

Etaoin documents that the user can specify :capabilities, but it always sends them as :desiredCapabilities on session creation. The exception is Safari, where they are sent as :capabilities (we can check if this is still a thing).

So Etaoin does not explicitly expose the concept of required and desired capabilities.

Some thoughts

The alwaysMatch and firstMatch syntax implies that there are a bunch of candidate web drivers to match against. I think this also implies talking to a Selenium Grid WebDriver #378, which we do not support yet.

The capabilities spec design is a bit odd in that it mixes user-provided config with webdriver selection criteria. Etaoin is focused on the user-provided config aspect.

In the near term, we can experiment with using the new format while maintaining the existing behaviour. We'll see if plunking the Etaoin user-specified :capabilities under :capabilities -> :firstMatch works out on our session create payload.

Optionally, we can consider: if the user specifies :alwaysMatch and/or :firstMatch keys under :capabilities, just send the capabilities as is.
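
To make the two ideas above concrete, here is a rough sketch of building the session create payload. The function name session-payload is hypothetical, and this is only the experiment described above, not what Etaoin actually does:

```clojure
;; Hypothetical helper, not an Etaoin API.
;; user-caps is whatever the user supplied under :capabilities.
(defn session-payload
  [user-caps]
  (let [caps (or user-caps {})]
    (if (some #(contains? caps %) [:alwaysMatch :firstMatch])
      ;; user already speaks the W3C syntax: send as is
      {:capabilities caps}
      ;; otherwise keep existing behaviour by wrapping under :firstMatch
      {:capabilities {:firstMatch [caps]}})))

(session-payload {:acceptInsecureCerts true})
;; => {:capabilities {:firstMatch [{:acceptInsecureCerts true}]}}

(session-payload {:alwaysMatch {:acceptInsecureCerts true}})
;; => {:capabilities {:alwaysMatch {:acceptInsecureCerts true}}}
```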

lread commented 2 months ago

GeckoDriver 0.35.0 was just released with the following change:

Removed support for session negotiation using the deprecated desiredCapabilities and requiredCapabilities.

So, I'll get to this one very soon.

lread commented 2 months ago

Oh it looks like Etaoin is ultra-legacy here.

We are expressing, for example, loggingPrefs, which technically pre-dates the W3C WebDriver standard. It was part of the JSON Wire Protocol era and might have been an informal, commonly supported naming rather than part of the spec?

I did find it specifically documented via the wayback machine.

So... the new way, if I understand it, is vendor-specific.
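
As an illustration of what vendor-specific means here: for chromedriver, my understanding is that the logging preferences go under the vendor-prefixed goog:loggingPrefs capability. A sketch of moving the legacy key over, where the function name chrome-logging-prefs is hypothetical and a string key is used to sidestep keyword syntax questions:

```clojure
;; Hypothetical helper, not an Etaoin API.
;; Assumption: chromedriver reads logging prefs from "goog:loggingPrefs".
(defn chrome-logging-prefs
  [caps]
  (if (contains? caps :loggingPrefs)
    (-> caps
        (dissoc :loggingPrefs)
        ;; string key so the JSON payload gets the exact vendor-prefixed name
        (assoc "goog:loggingPrefs" (:loggingPrefs caps)))
    caps))

(chrome-logging-prefs {:loggingPrefs {:browser "ALL"}})
;; => {"goog:loggingPrefs" {:browser "ALL"}}
```

Other drivers would need their own handling; this only covers the Chrome case.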

lread commented 2 months ago

Ok, moving to the new capabilities syntax puts chromedriver in a different mode. Some existing API calls (like get-logs) fail with "Cannot call non W3C standard command while in W3C mode". You'd think W3C mode might mean that you can only call official W3C spec endpoints. You might also discover a w3c boolean vendor option, which might lead you down a rabbit hole.

After too much poking around, my initial theory is that W3C mode could maybe be thought of as non-legacy-api-mode. Chromedriver-specific endpoints work, but you need to use the non-legacy endpoints. For example, if I update the get-logs URL (an endpoint which is not in the W3C WebDriver spec), I can get it to work.

Google documents its chromedriver API at the Selenium level, which is not terribly helpful to us. To discover the REST endpoints, I'm referring to the source code. This python client is helpful:

lread commented 2 months ago

Oh, here's maybe it: The chromedriver "W3C mode" probably is a badly named replacement for the old JSON Wire Protocol. Maybe.

lread commented 2 months ago

Sooo... my guess for the chromedriver "Cannot call non W3C standard command while in W3C mode" error is: hey, you said you wanted to work in W3C mode, there is a W3C WebDriver spec alternative to the legacy API you are calling, use that instead.

lread commented 2 months ago

Ok, we need a breaking change to move to the W3C spec.

The w3c spec implements keyboard and mouse interaction through "actions". A sequence of actions is expected to be sent to the WebDriver and be run as a transaction. Etaoin's drag-and-drop feature is implemented with actions, so it works nicely.
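
For reference, a sketch of the kind of payload the W3C spec expects for a drag between two viewport coordinates, sent in one request to the session's actions endpoint. This is not Etaoin's drag-and-drop implementation; the function name, the "mouse1" source id, and the 250 ms duration are made up for illustration:

```clojure
;; Hypothetical helper, not an Etaoin API.
;; Builds a single pointer input source whose steps run as one sequence.
(defn drag-actions
  [[x1 y1] [x2 y2]]
  {:actions
   [{:type "pointer"
     :id "mouse1"                       ; arbitrary input source id
     :parameters {:pointerType "mouse"}
     :actions [{:type "pointerMove" :duration 0 :x x1 :y y1}
               {:type "pointerDown" :button 0}
               {:type "pointerMove" :duration 250 :x x2 :y y2}
               {:type "pointerUp" :button 0}]}]})

(drag-actions [10 20] [200 300])
```

The point is that press, move and release travel together; none of them is meant to be sent to the WebDriver as a standalone call.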

But Etaoin also exposes (all chrome specific unless otherwise noted)

These features cannot be migrated as is to WebDriver actions because, in the W3C spec "actions" world, they represent individual steps in a transaction and, as such, do not work well if sent to the WebDriver independently.

They will be deleted.