WordPress / wordpress-playground

Run WordPress in the browser via WebAssembly PHP
https://w.org/playground/
GNU General Public License v2.0

Network access in the browser #85

Open adamziel opened 1 year ago

adamziel commented 1 year ago

Latest status

Curl and TCP over fetch() are now a part of WordPress Playground 🎉 Here's what we still need to close this issue:


Description

WordPress Playground has only partial support for network calls.

Types of network calls in WordPress

wp_safe_remote_get

As of https://github.com/WordPress/wordpress-playground/pull/724, Playground is capable of translating wp_safe_remote_get calls into JavaScript fetch() requests. This has limitations:

Arbitrary network calls

Other methods of accessing the network, such as libcurl or file_get_contents, are not supported yet.

Web browsers do not allow the WebAssembly code to access the internet directly yet. A native socket API may or may not be released in the future, but there isn't one for now. #1093 would improve the situation.

In Node.js, Playground accesses the network using the following method:

  1. Set up a same-domain API endpoint that accepts network commands from the browser
  2. Capture socket function calls in the WebAssembly binary
  3. Pass them to JavaScript
  4. Pass the requested operation over the API endpoint using fetch() or a WebSocket

This may not be viable on the web: someone would have to pay for the hardware to run the proxy on, and the proxy's nature means there are security risks related to accessing the local network.
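The hand-off between the intercepted socket call and the proxy endpoint can be pictured with a small sketch. It assumes the destination host and port are carried as query parameters on the proxy URL. That layout and the names `SocketTarget`, `buildProxyUrl`, and `parseProxyTarget` are hypothetical, chosen for illustration, and are not the format used by the actual proxy.

```typescript
// Hypothetical sketch: names and URL layout are illustrative assumptions,
// not Playground APIs.
interface SocketTarget {
  host: string;
  port: number;
}

// Client side: encode the destination of an intercepted connect() call
// into the URL of the connection opened against the proxy.
function buildProxyUrl(proxyBase: string, target: SocketTarget): string {
  const url = new URL(proxyBase);
  url.searchParams.set('host', target.host);
  url.searchParams.set('port', String(target.port));
  return url.toString();
}

// Proxy side: recover and validate the destination before opening
// the real TCP socket on behalf of the client.
function parseProxyTarget(requestUrl: string): SocketTarget {
  const params = new URL(requestUrl).searchParams;
  const host = params.get('host');
  const port = Number(params.get('port'));
  if (!host || !Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error('Invalid proxy target');
  }
  return { host, port };
}
```

Validating the target on the proxy side is also the natural place to reject private or loopback addresses, which is one way to address the local-network risk mentioned above.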

Solution

After 1.5 years of exploring and discussing, this issue finally has a path forward:

For full networking support, we'd also need the following:

Nice to haves:

Limitations of the approach

Limitations without the network proxy:

All of the above could be resolved by plugging in a network proxy.

Other Alternatives

adamziel commented 1 year ago

For posterity: I tried a custom Request_Transport that tunneled all traffic through the browser's fetch() using the vrzno extension by @seanmorris. That worked well except for sites that didn't allow cross-origin requests, which is most sites.

Interestingly, I remember that the WordPress Plugin Directory did not work in this setup. However, @dd32 pointed out that it exposes the correct access-control headers:

curl -is 'https://api.wordpress.org/plugins/info/1.2/?action=query_plugins' | grep '^access-control'
access-control-allow-origin: *

So perhaps there is a way to support at least the api.wordpress.org requests with the browser's native fetch()? Let's revisit this idea.
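The check done by hand with curl above can be expressed as a small predicate. This is a minimal sketch that looks only at the Access-Control-Allow-Origin header and ignores credentials and preflight rules. The function name `allowsCrossOriginRead` is hypothetical.

```typescript
// Returns true when a response's CORS headers would let a page served from
// `pageOrigin` read the response via fetch(). Simplified: credentialed
// requests and preflight checks are deliberately not modeled here.
function allowsCrossOriginRead(
  responseHeaders: Record<string, string>,
  pageOrigin: string
): boolean {
  // Header names are case-insensitive, so normalize them before the lookup.
  const normalized: Record<string, string> = {};
  for (const name of Object.keys(responseHeaders)) {
    normalized[name.toLowerCase()] = String(responseHeaders[name]).trim();
  }
  const allowed = normalized['access-control-allow-origin'];
  return allowed === '*' || allowed === pageOrigin;
}
```

With the header shown in the curl output, `allowsCrossOriginRead({ 'access-control-allow-origin': '*' }, origin)` is true for any origin, which is why api.wordpress.org looks like a candidate for plain fetch().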

adamziel commented 1 year ago

Networking is supported in the Node.js build as of https://github.com/WordPress/wordpress-playground/pull/119 – PHP sends data through a WebSocket to a local TCP proxy that handles the required network calls.

I can think of three ways to implement in-browser support:

adamziel commented 1 year ago

Also linking to this related discussion.

adamziel commented 1 year ago

Libraries like Composer require HTTPS and they verify the peer certificate by default: https://github.com/composer/composer/blob/11879ea737978fabb8127616e703e571ff71b184/src/Composer/Util/StreamContextFactory.php#L183-L197

As a workaround, networking in the browser could:

This will only work for endpoints exposing proper CORS headers, but it's a start.

dmsnell commented 1 year ago

Give PHP a fake wildcard CA cert

why not use a real chain of trust? I'm very leery of building a system whose default is to strip away all security from TLS connections and present trust for everything.

particularly if we're trying to make it easy to instantly spool up systems with a blueprint, this could so easily lead to cross-site attacks: "Hey look at the plugin I wrote: [malware link]"

for what it's worth, the default Erlang net library sets verify_peer to false and it's a disaster because nobody remembers to activate it and supply proper certs.

maybe I'm misreading this, but I'd rather us avoid that mistake if it's what I think we're talking about

adamziel commented 1 year ago

why not use a real chain of trust?

We do in Node.js. Browsers can’t open raw TCP sockets so we need to re-issue the request using fetch(). The only way to do it is to MITM the PHP program to parse the encrypted request data.

adamziel commented 1 year ago

Hosting a WebSocket proxy on e.g. the free Cloudflare tier could solve this for now.

eliot-akira commented 1 year ago

Hosting a websocket proxy

Possible candidates:


EDIT: Oh, I see there's already something like this implemented in @php-wasm/node, based on maximegris/node-websockify.

https://github.com/WordPress/wordpress-playground/blob/trunk/packages/php-wasm/node/src/lib/networking/outbound-ws-to-tcp-proxy.ts

adamziel commented 1 year ago

Oh, I see there's already something like this implemented in @php-wasm/node, based on maximegris/node-websockify.

Yup, it is used in the @php-wasm/cli, VS Code extension, and wp-now. The same proxy would just work with the web version if it was hosted somewhere. The custom parts were added to support setsockopt().

fritexvz commented 1 year ago

I wonder what could be achieved by using Cloudflare TCP Sockets and running WP Playground on a Cloudflare Worker / WASM / Node.js?

geekodour commented 1 year ago

Just to add more context to @fritexvz's reply, running the playground on wordpress has been discussed here: https://github.com/WordPress/wordpress-playground/issues/69

adamziel commented 1 year ago

https://github.com/WordPress/wordpress-playground/pull/732 solves the bulk of the problem with issuing HTTP requests from WordPress. For full network support, we'll need to run a WebSockets proxy on the server.

aehlke commented 1 year ago

Not urgent: what sort of use case would require the WebSocket support?

adamziel commented 1 year ago

@aehlke libcurl support. Curl is used e.g. by the Friends plugin by @akirk and by Composer to download and validate the HTTPS certificate.

adamziel commented 8 months ago

https://github.com/WordPress/wordpress-playground/pull/1051 implements an HTTPS termination function. All PHP-initiated network traffic is intercepted by a "fake WebSocket" instance, which offers a self-signed HTTPS certificate, reads the raw HTTP traffic, rewrites it as a fetch() call, and streams the response back to PHP. Note this may only work for HTTP and HTTPS requests to URLs exposing valid CORS headers. It won't work for arbitrary sockets.

That PR needs a lot of cleaning up, but the concept seems to be solid. It would unblock support for libcurl and stream wrappers like file_get_contents("https://...").
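The "rewrite the raw HTTP traffic as a fetch() call" step could look roughly like the sketch below. It is not the code from the PR. It assumes the full plaintext request is already buffered after TLS termination and that the body is text. `FetchArgs` and `rawHttpToFetchArgs` are hypothetical names.

```typescript
interface FetchArgs {
  url: string;
  method: string;
  headers: Record<string, string>;
  body?: string;
}

// Turn the plaintext of an HTTP/1.1 request into arguments that could be
// passed along as fetch(url, { method, headers, body }).
function rawHttpToFetchArgs(raw: string, scheme: 'http' | 'https'): FetchArgs {
  const separator = raw.indexOf('\r\n\r\n');
  if (separator === -1) {
    throw new Error('Incomplete request: header terminator not found');
  }
  const headLines = raw.slice(0, separator).split('\r\n');
  const body = raw.slice(separator + 4);

  // Request line, e.g. "GET /path?query HTTP/1.1".
  const parts = (headLines[0] || '').split(' ');
  const method = parts[0];
  const path = parts[1];
  if (!method || !path) {
    throw new Error('Malformed request line');
  }

  const headers: Record<string, string> = {};
  for (const line of headLines.slice(1)) {
    const colon = line.indexOf(':');
    if (colon === -1) continue;
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }

  const host = headers['host'];
  if (!host) {
    throw new Error('Missing Host header');
  }
  // Browsers set these headers themselves and do not let scripts override them.
  delete headers['host'];
  delete headers['content-length'];
  delete headers['connection'];

  const args: FetchArgs = { url: `${scheme}://${host}${path}`, method, headers };
  if (body.length > 0) {
    args.body = body;
  }
  return args;
}
```

The result would be used as `fetch(args.url, { method: args.method, headers: args.headers, body: args.body })`, at which point the browser's usual CORS rules decide whether the response is readable.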

adamziel commented 8 months ago

It took 1.5 years but we now have a clear path to resolving this issue 🎉

This would enable requesting all CORS-enabled HTTPS endpoints.

For full networking support, we'd also need the following:

The proxy wouldn't be hosted on playground.wordpress.net as it would be a resource drain, but we could make spinning up your own proxy instance easy enough.

Nice to haves:

jeffpaul commented 2 months ago

@adamziel would love to chat about this at WCUS Contributor Day if you'll be around?

adamziel commented 2 months ago

Hey @jeffpaul! Unfortunately I won't be around at WCUS :( But let me loop in @dmsnell who I know will be there. Alternatively, we could connect on .org Slack or Zoom.

adamziel commented 1 month ago

I merged this significant milestone earlier today:

Next up:

adamziel commented 1 month ago

Curl is available in web browsers since https://github.com/WordPress/wordpress-playground/pull/1935. fetch() is used as a network transport so the typical CORS limitations apply.

To solve, say, ~80% of the problem, we'd need to open up the CORS Proxy beyond talking to git. This is coming in the short to medium term.

To solve 100% of the problem, we'd need to tunnel the raw TCP traffic coming from Playground over a persistent WebSocket connection. In this scenario, we'd need a https://playground.wordpress.net/tcp-over-ws.php endpoint that would use stream_select to ingest data from Playground, pipe it to the network, and pipe the response bytes back to Playground. Definitely possible, especially with AsyncHttp\Client, but it's also non-trivial and I'm not sure what kind of appetite y'all have for such a feature. For now I'm taking a wild guess this is a very low priority project. If this is something that would help you, please comment on this issue and describe your use case. If enough people come in, I'm happy to make it happen.

For now, here's what we need to close this issue: