[REGRESSION]: chromium.connect does not work with vanilla CDP servers anymore

berstend commented 4 years ago

Hi there, apologies if this has been asked before but I didn't find anything.

Up until v1.3.0 it was possible to .connect to a vanilla browser websocket URL (either created through launching Chrome with --remote-debugging) or using Puppeteer).

v1.4.0 introduced the new "New Client / Server Wire protocol", since then it's not possible to use .connect with other original CDP websocket servers anymore (it will hang here). I also noticed through logging that the new wire protocol looks much different from the regular CDP chatter.

My question would be if there's still a way to use chromium.connect with the traditional CDP websocket URLs or if there are pointers on how to best write a translation layer to accomplish that.

My use-case is to have a single websocket url which exposes a fleet of chromium browsers by proxying the websocket connections (similar to browserless.io) and both puppeteer & playwright being able to connect to it and control the browsers.

The following worked prior to v1.4.0:

const puppeteer = require("puppeteer")

puppeteer
  .launch({
    headless: true,
    defaultViewport: null,
    args: [
      "--remote-debugging-port=9222",
      "--remote-debugging-address=0.0.0.0",
    ],
  })
  .then(async (browser) => {
    console.log(browser._connection.url())
  })
// => ws://0.0.0.0:9222/devtools/browser/490a2b32-ce62-4b6b-a530-d4931fdeb046

const pw = require("playwright")

;(async () => {
  const browser = await pw.chromium.connect({
    wsEndpoint: "ws://0.0.0.0:9222/devtools/browser/490a2b32-ce62-4b6b-a530-d4931fdeb046",
  })

  const context = await browser.newContext()
  const page = await context.newPage()

  await page.goto("https://www.example.com/")
  await page.screenshot({ path: "example.png" })

  await browser.close()
})()

dgozman commented 4 years ago

Hi @berstend!

The underlying protocol has indeed changed in 1.4, and we are aware that this usecase is not supported anymore. However, it was never a goal for Playwright to work against CDP-based remote websockets. Playwright supports multiple browsers, and we try for all features to be equally supported by all of them. Our new protocol is the same for all browsers we support.

I'd suggest to use Playwright launchServer API to spawn the server and proxy its websocket connection. Does that work for you?

My question would be if there's still a way to use chromium.connect with the traditional CDP websocket URLs or if there are pointers on how to best write a translation layer to accomplish that.

There is no API to connect to CDP websocket directly. A translation layer would be equivalent to ~50% of Playwright code base, so it does not look feasible.

berstend commented 4 years ago

@dgozman Thanks a lot for the quick response on that matter. :-)

Oh wow, that's unfortunate to hear. The CDP websocket has always felt like the underlying and unifying standard or common denominator that all CDP based automation frameworks could interface with.

Note: I really appreciate the efforts of the Playwright team and the project contributors and can understand why switching to a custom wire protocol was desired.

Having that said, I wonder what the best course of action would be if someone would be determined enough to develop a custom websocket server compatible with Playwright wire protocol and vanilla CDP clients.

In my scenario I have full control over the websocket server consumed by pptr/pw clients but the server itself has to connect to existing browsers through vanilla CDP (playwright cannot be used to launch them).

Would this be feasible:

Modify launchServer to not launch a browser but use an existing one through CDP (should be possible as internally Playwright still uses the vanilla CDP websocket)
Develop a custom websocket server that can differentiate between Playwright and Puppeteer clients (through e.g. the initial handshake or similar) and forward the websocket frames to either the custom Playwright websocket or the vanilla CDP socket

Thanks a lot for your time and input, much appreciated. :-)

pavelfeldman commented 4 years ago

playwright cannot be used to launch them

Could you share why this is the case? We assumed you would simply run browser.launchServer and that would be it?

CDP websocket has always felt like the underlying and unifying standard or common denominator that all CDP based automation frameworks could interface with.

Since we are more than just Chromium, it does not really work for us. Having said that, you can modify launchServer locally to attach over CDP instead of launching Chromium. Things will work, but I wonder if we can support your use case with launchServer out of the box!

pavelfeldman commented 4 years ago

Btw, if you join #rpc channel on our Slack (link in our main README.md), we can discuss this and other topics there!

joelgriffith commented 4 years ago

We landed support for this in https://github.com/browserless/chrome/pull/1018. There's a new /playwright route that we've added, and it forks the default WebSocket behavior of proxying through to CDP, and instead proxies over into the browserServer endpoint. Some things about browserless don't yet work, like "head-full" and --user-data-dir's since it's not supported in the browserServer methods. We'll add those back in over time once we get broader playwright support on our stack.

You can see the change in behavior here: https://github.com/browserless/chrome/blob/master/src/puppeteer-provider.ts#L400

Niek commented 4 years ago

@pavelfeldman Not OP, but I want to highlight that there are various reasons to connect to existing Chrome CDP websockets that are not launched by Playwright. For example, if you have a scalable Chrome (e.g. k8s) cluster that already works with Puppeteer/Playwright. Or if you use Playwright with a third party infra solution like Browserless (see comment above of @joelgriffith) or HeadlessTesting.

It would be awesome if there would be a way to call launchServer() with an existing CDP websocket for these usecases.

pavelfeldman commented 4 years ago

This all makes sense.

When we are reasoning about it internally, we refer to the model where the k8s cluster includes Playwright Server and uses it to launch the browser. I believe Selenoid is now doing that already. This enables us (and you, cloud providers) operate much more efficiently:

We support non-Chromium browsers uniformly. Switching to this model gives WebKit and Firefox for free
Protocol traffic drops, instead of 4-8 roundtrips for each click you get 1.
We can be efficient with the resources written by the browser, including downloads, recently added videos and work-in-progress traces. They reside next to the browser, are accessible to the Playwright server and can be exposed to the user over the wire.

So putting Playwright server into the cloud sounds like a clear win to us, by good margin. If there is some code that you are using to manage the executed instances that is missing from the Playwright Server, we would be happy to consider adding it upstream.

We understand that it isn't free: solutions that support both Puppeteer and Playwright would need to maintain different code paths to support both. We also realize that it isn't feature complete (launchPersistent is not supported by connect currently). But we think multi-browser support, videos, traces and all the new features this structure enables us to do are totally worth it.

joelgriffith commented 4 years ago

@pavelfeldman thanks for going over this. I think the things you've outlined make a lot of sense, in particular the network traffic (an almost 10x drop in network chatter is huge). This is a big one for us.

For us, the problem is maintenance burden as we support playwright, puppeteer, selenium as well as our own REST APIs. Adding the new playwright server was quick, but over the long term the cost does add up, so it's not a trivial change. As much as the change was a bit of a shock, I think it's worth the burden given what you've said.

A few things that would be nice:

Support for user-data-dir in browser-server. Not sure what the complications are there since it's not part of the current API.
Allow for a newContext event so we can inject network interceptors for ad-blocking in our stack: https://github.com/browserless/chrome/blob/master/src/chrome-helper.ts#L139. There's a few other things as well here, but it sounds like playwright takes care of them mostly, so just this ad-blocker would be nice.

berstend commented 4 years ago

@joelgriffith congrats on your very fast fix :-)

My use-case differs a bit from browserless, in the sense that I can run playwright on the server but I cannot use the current launchServer to spawn browsers locally as they're remote browser instances that are already running.

If there is some code that you are using to manage the executed instances that is missing from the Playwright Server, we would be happy to consider adding it upstream.

@pavelfeldman Would the Playwright project be open to upstream patches for the following:

Extend launchServer with additional support for connecting to an existing CDP session (e.g. by adding options.cdpSession)
Add a user-agent: playwright/${version} request header to the internal cdpSession Websocket request (ws supports setting those, the change would be trivial) - I currently have a PoC that can differentiate between Playwright/Puppeteer clients by analyzing the first websocket frame, but looking at a user-agent would be much more robust 😄

Happy to work on those changes (they should be side-effect free) and implement them in a way that fits the Playwright code style & naming preferences.

edit, just to visualize what those changes would allow me to do 😄:

                   +------------+                             +---------------+
pptr client  +-->  |            |  +----------------------->  |               |
                   |  custom    |                             |  Chromium CDP |
                   |  websocket |        playwright           |  websocket    |
  pw client  +-->  |            |  +-->  wire protocol  +-->  |               |
                   +------------+        websocket            +---------------+

dgozman commented 4 years ago

Support for user-data-dir in browser-server. Not sure what the complications are there since it's not part of the current API.

Do you mind sharing the usecase for this one? IIUC, the user data dir won't be accessible to any remote client, because it lives on the server, so how is it intended to be used?

Also note that Playwright assumes exclusive access to the browser context to implement all kinds of context-wide method, e.g. routes. Since there is just a single persistent context that makes use of the user data dir, perhaps there is/should be a separate mechanism that ensures exclusive access to this persistent context between clients?

Allow for a newContext event so we can inject network interceptors for ad-blocking in our stack: https://github.com/browserless/chrome/blob/master/src/chrome-helper.ts#L139. There's a few other things as well here, but it sounds like playwright takes care of them mostly, so just this ad-blocker would be nice.

Are you talking about something like browser.on('context', (context: BrowserContext) => {}) event?

dgozman commented 4 years ago

Extend launchServer with additional support for connecting to an existing CDP session (e.g. by adding options.cdpSession)

I think we'll be open to something like that. I'd prefer options.cdpWebsocketEndpoint name, sorry for bikeshedding 😄

Add a user-agent: playwright/${version} request header to the internal cdpSession Websocket request (ws supports setting those, the change would be trivial) - I currently have a PoC that can differentiate between Playwright/Puppeteer clients by analyzing the first websocket frame, but looking at a user-agent would be much more robust 😄

Are you talking about sending this header when connecting to the new CDP webscoket from the launchServer? Sounds good to me.

berstend commented 4 years ago

I'd prefer options.cdpWebsocketEndpoint name, sorry for bikeshedding 😄

Haha, no worries - no strong opinions on the naming here (hence me asking in advance) 😄 cdpWebsocketEndpoint sounds good.

Are you talking about sending this header when connecting to the new CDP webscoket from the launchServer? Sounds good to me.

Yup, I'll be also checking with the puppeteer team to add a similar patch to their CDP websocket client.

Great, I'm gonna get familiar with the playwright build/test setup and submit a draft PR once I find an elegant way to add cdpWebsocketEndpoint support to launchServer.

Thanks for the feedback!

headlesstesting commented 3 years ago

Hi,

We're in the same boat as @berstend and are eagerly awaiting the possibility for a cdpWebsocketEndpoint option as well as a user-agent option (https://github.com/microsoft/playwright/pull/1729)

People using HeadlessTesting.com can not use Playwright v1.4.0 with our service due to this change. If there's anything we can do to help, please let us know.

ZeroBrowser commented 3 years ago

We are also advising our users to not use version v1.4.0 or higher at 0browser.com. Please let us know if we can help with anything. Otherwise similar to browserless we have to create a dedicated endpoint for Playwright. Thanks!

Niek commented 3 years ago

Is there any progress on reviewing https://github.com/microsoft/playwright/pull/4344 @dgozman?

mxschmitt commented 3 years ago

Closing, since there is now BrowserType.connectOverCDP since a few releases: https://playwright.dev/docs/api/class-browsertype#browsertypeconnectovercdpparams

microsoft / playwright

[REGRESSION]: chromium.connect does not work with vanilla CDP servers anymore #4054