WordPress / wordpress-playground

Run WordPress in the browser via WebAssembly PHP
https://w.org/playground/
GNU General Public License v2.0

Server migration – convert .htaccess to Nginx conf #1197

Closed adamziel closed 5 months ago

adamziel commented 7 months ago

playground.wordpress.net is shared hosting and sometimes becomes unusably slow. There's a new server @brandonpayton set up, but it runs Nginx and not Apache.

We need to express the below rules using Nginx. These could be useful:

AddEncoding x-gzip .gz

<FilesMatch "index\.html">
Header unset ETag
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
</FilesMatch>
<FilesMatch "index\.js|blueprint-schema\.json|logger.php|wp-cli.phar|wordpress-importer.zip">
Header set Access-Control-Allow-Origin "*"
Header unset ETag
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
</FilesMatch>

SetEnv ENV_VARIABLE ****

AddType application/wasm .wasm
AddType application/octet-stream .data

<FilesMatch "iframe-worker.html$">
  Header set Origin-Agent-Cluster: ?1
</FilesMatch>
<FilesMatch "store.zip$">
  SetEnv no-gzip 1
  SetEnv no-brotli 1
  Header set Access-Control-Allow-Origin: *
</FilesMatch>

RewriteEngine on
RewriteRule ^scope:.*?/(.*)$ $1 [NC]
RewriteRule plugin-proxy$ /plugin-proxy.php [NC]
RedirectMatch 301 /wordpress-browser.html /

RewriteCond %{HTTP_REFERER} ^https://developer\.wordpress\.org/
RewriteRule wordpress.html /index.html [R=302,L]

RewriteCond %{HTTP_REFERER} ^https://wordpress\.org/
RewriteRule wordpress.html /index.html [R=302,L]
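
As an illustration only (not code from this thread), the header and MIME-type rules above can be read as a lookup from file name to response headers, which is the behavior any Nginx or PHP replacement would need to reproduce. The sketch below is TypeScript under stated assumptions: `HeaderRule`, `rules`, `mimeTypes`, and `headersFor` are hypothetical names, the patterns are copied from the `<FilesMatch>` blocks unchanged, and the `Header unset ETag`, `SetEnv`, and rewrite rules are not modeled.

```typescript
// Illustrative only: HeaderRule, rules, mimeTypes and headersFor are
// hypothetical names, not part of Playground, Nginx, or WP Cloud.
type HeaderRule = { pattern: RegExp; headers: Record<string, string> };

const NO_CACHE = "max-age=0, no-cache, no-store, must-revalidate";

// Each pattern copies the corresponding <FilesMatch> regex unchanged.
const rules: HeaderRule[] = [
  { pattern: /index\.html/, headers: { "Cache-Control": NO_CACHE } },
  {
    pattern: /index\.js|blueprint-schema\.json|logger.php|wp-cli.phar|wordpress-importer.zip/,
    headers: { "Access-Control-Allow-Origin": "*", "Cache-Control": NO_CACHE },
  },
  { pattern: /iframe-worker.html$/, headers: { "Origin-Agent-Cluster": "?1" } },
  { pattern: /store.zip$/, headers: { "Access-Control-Allow-Origin": "*" } },
];

// Equivalents of the two AddType lines.
const mimeTypes: Record<string, string> = {
  ".wasm": "application/wasm",
  ".data": "application/octet-stream",
};

// Returns the extra headers a request for `path` should receive.
function headersFor(path: string): Record<string, string> {
  // <FilesMatch> tests the file name, not the full path.
  const fileName = path.split("/").pop() ?? "";
  const headers: Record<string, string> = {};
  for (const ext of Object.keys(mimeTypes)) {
    if (fileName.endsWith(ext)) headers["Content-Type"] = mimeTypes[ext];
  }
  for (const rule of rules) {
    if (rule.pattern.test(fileName)) Object.assign(headers, rule.headers);
  }
  return headers;
}
```

A table like this also makes it easy to see that only a handful of files need special treatment; every other file gets no extra headers at all.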

brandonpayton commented 7 months ago

With the new site, we don't have direct access to nginx config, but we may be able to get there with a combination of

The site is currently set up as a non-WP site, but we could create a different WP-based site if it turns out we'd like to use other features like a page cache based on memcached.

bgrgicak commented 7 months ago

There is also one .htaccess in the root folder on the server which would need to be migrated.

brandonpayton commented 6 months ago

So far, it looks like we will need to use PHP to handle requests so that we can dynamically route different URIs and add custom HTTP headers. Using .htaccess is certainly more convenient, but it is not supported. PHP will be slower than directly serving static files, but we have edge caching available that should speed things up.

It's easy to turn on edge caching for a site, but I need to look into how we can automate cache invalidation when deploying Playground updates.

For now, I'm working on handling the different requests with PHP. Once that is working, we can worry about edge caching.

brandonpayton commented 6 months ago

Serving Playground from the test site seems to be working OK: https://playground-dot-wordpress-dot-net.atomicsites.blog/

Because we want to customize HTTP headers and perform redirects and do not have the ability to customize nginx config, all files, including static files, are served via PHP using a platform feature, custom-redirects.php. https://gist.github.com/brandonpayton/9e3da0845791f6e5c833013d83cf86d6#file-custom-redirects-php

It's super verbose compared to htaccess, but it works and is surprisingly fast IMO given that PHP is also serving every static asset -- at least when it isn't under heavy load.

Before we move this to production, we can enable edge caching so those responses can be served from the cache.

What is left:

bgrgicak commented 6 months ago

// TODO: Set these /* SetEnv GITHUB_TOKEN --secret--

Does this need to be set in PHP or is there a more secure way? I know that VIP allows admins to set env variables.

brandonpayton commented 6 months ago

Does this need to be set in PHP or is there a more secure way? I know that VIP allows admins to set env variables.

@bgrgicak, on WP Cloud, there are APIs for setting and accessing secrets, so I was thinking we'd use those.

Btw, the comments at the end of custom-redirects.php are just parts of the htaccess file that haven't been addressed yet. Maybe that was clear, but I feel like explaining since they just look like cruft at the end :)

adamziel commented 6 months ago

@brandonpayton let's start a new issue to create and document a Playground serving flow that uses just PHP dev server or... Playground itself :-) It's not high-priority work, but let's still track it. It's cool how we can handle all the rewrite rules, headers, etc. using just PHP. I'm already thinking through scenarios where it makes everything self-contained and removes even a webserver dependency.

brandonpayton commented 6 months ago

Some updates:

For the 429s, we might need to request a greater allowance for these requests, which will hopefully be OK once edge caching is enabled. Also, we might consider carefully diffing files between website builds and requesting cache invalidation only for changed files.
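
As a rough sketch of the file-diffing idea (not an existing Playground script), two builds can be compared by content hash so that only changed, added, or removed paths are submitted for cache invalidation. The `Manifest` type and `pathsToInvalidate` function are hypothetical, and the sketch assumes each build can produce a map from deployed path to a hash of the file's contents.

```typescript
// Hypothetical: a build manifest maps each deployed path to a content hash.
type Manifest = Record<string, string>;

// Returns the sorted list of paths whose cached copies are stale.
function pathsToInvalidate(previous: Manifest, next: Manifest): string[] {
  const changed = new Set<string>();
  for (const path of Object.keys(next)) {
    // New file, or same path with different contents.
    if (previous[path] !== next[path]) changed.add(path);
  }
  for (const path of Object.keys(previous)) {
    // File removed in the new build.
    if (!(path in next)) changed.add(path);
  }
  return Array.from(changed).sort();
}
```

Unchanged files never appear in the result, so the number of invalidation requests scales with the size of the change rather than the size of the site.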

brandonpayton commented 6 months ago

@adamziel, sure! Created #1294. Please feel free to add more context if I missed anything. 🙇

brandonpayton commented 6 months ago

I am currently looking into how to avoid the 429s and to get the edge cache to work for requests from Chrome. It is priming the cache for 2+ requests for the same resource via curl, but when requesting in Chrome, the dev tools Network tab just shows cache misses. It's later in the day, so perhaps what is going on will be obvious to fresh eyes in the morning.

brandonpayton commented 6 months ago

It is priming the cache for 2+ requests for the same resource via curl, but when requesting in Chrome, the dev tools Network tab just shows cache misses.

I think the hideExperimentalNotice cookie is probably causing edge cache misses. Will look at what we can do for this.

brandonpayton commented 6 months ago

I think the hideExperimentalNotice cookie is probably causing edge cache misses. Will look at what we can do for this.

Safari does not allow using WebStorage in private windows, so if we want Playground to run in that context (and I would guess we want it to run in as many contexts as reasonable), we cannot rely upon localStorage.

One possibility is to augment the fetch request for static files to omit credentials.
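
A minimal sketch of that idea, assuming static assets can be recognized by file extension: requests for them use the fetch credentials mode `"omit"`, so cookies such as hideExperimentalNotice are not sent and cannot cause an edge cache miss, while everything else keeps fetch's default `"same-origin"` mode. The extension list and the names `isStaticAsset` and `credentialsFor` are assumptions for illustration, not the actual Playground change.

```typescript
// Hypothetical extension list; the real set of static assets may differ.
const STATIC_EXTENSIONS = [".js", ".css", ".wasm", ".data", ".zip", ".png", ".svg"];

type CredentialsMode = "omit" | "same-origin" | "include";

function isStaticAsset(pathname: string): boolean {
  return STATIC_EXTENSIONS.some((ext) => pathname.endsWith(ext));
}

// Picks the fetch credentials mode for a request path.
function credentialsFor(pathname: string): CredentialsMode {
  // "omit" means no cookies or HTTP auth are attached to the request.
  return isStaticAsset(pathname) ? "omit" : "same-origin";
}
```

In a Service Worker fetch handler, this could be applied as `fetch(url, { credentials: credentialsFor(new URL(url).pathname) })`.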

adamziel commented 6 months ago

One possibility is to augment the fetch request for static files to omit credentials.

Sounds like a great idea

brandonpayton commented 6 months ago

One possibility is to augment the fetch request for static files to omit credentials.

Sounds like a great idea

One downside is that this may cause issues for anyone wanting to host a Playground behind some kind of auth requirement. If we need to do it, it seems fine for now, and we could later add configurability so creds can still be relayed by Service Worker fetch.

That said, I've been thinking, and we may be able to have nginx serve most of our static files directly. We don't really have special rules for most files, so I think we may be able to deploy most static files in a way that nginx can find them directly without involving PHP. And when we do want to add custom headers we can omit those files from their usual location and have PHP serve them with custom headers instead.

@adamziel, is there a reason we cannot rewrite "scope:xyz/" paths in the Service Worker so that it doesn't need to be handled on the web server? If we can do that, I think we can cut PHP out of the loop on the web server for most static file requests, and that should address the 429 errors we are seeing due to rate limiting (which can be due to using too many PHP workers at a time).
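
For illustration, a client-side rewrite of this kind might look like the sketch below, which mirrors the .htaccess rule `RewriteRule ^scope:.*?/(.*)$ $1 [NC]` by dropping a leading `scope:<id>/` segment from the request path. The function name `stripScope` is hypothetical, and this is not a copy of Playground's Service Worker code.

```typescript
// Hypothetical helper: removes a leading "/scope:<id>/" segment, if present.
function stripScope(pathname: string): string {
  // [^/]* stops at the first slash, like the lazy .*? in the .htaccess rule;
  // the "i" flag corresponds to the [NC] (no-case) modifier.
  return pathname.replace(/^\/scope:[^/]*\//i, "/");
}
```

With this, `/scope:0.8919731437847068/index.js` becomes `/index.js`, and paths without a scope segment are returned unchanged.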

adamziel commented 6 months ago

One downside is that this may cause issues for anyone wanting to host a Playground behind some kind of auth requirement. If we need to do it, it seems fine for now, and we could later add configurability so creds can still be relayed by Service Worker fetch.

Good point, let's open a new issue to keep track of this. Someone will report it sooner or later, let's prioritize then.

we may be able to deploy most static files in a way that nginx can find them directly without involving PHP

That sounds good, too!

@adamziel, is there a reason we cannot rewrite "scope:xyz/" paths in the Service Worker so that it doesn't need to be handled on the web server?

I thought we did that already? Do requests with scope:xyz in them make it to the server?

brandonpayton commented 6 months ago

Good point, let's open a new issue to keep track of this. Someone will report it sooner or later, let's prioritize then.

I created a PR to omit credentials from the Service Worker requests for static files here. When that is merged, I will create an issue to acknowledge credentials are omitted in case anyone has a problem with that.

@adamziel, is there a reason we cannot rewrite "scope:xyz/" paths in the Service Worker so that it doesn't need to be handled on the web server?

I thought we did that already? Do requests with scope:xyz in them make it to the server?

Interesting! The Chrome dev tools show that as the effective URL, and even in the request's "Initiator" sub-tab, it looked like the final URL included the scope:xyz. But when I log requests on the server, all I see for scope mentions are HTTP Referer headers like:

'HTTP_REFERER' => 'https://playground-dot-wordpress-dot-net.atomicsites.blog/scope:0.8919731437847068/',

we may be able to deploy most static files in a way that nginx can find them directly without involving PHP

That sounds good, too!

Great! My plan is to:

We can do all of this in a working dir outside of the web root and then rsync everything into place under /srv/htdocs/.

🤞 I think this will take care of the 429 errors, and it should be faster regardless.

brandonpayton commented 6 months ago

I have scripts written to do the above-mentioned setup on the host. It is mostly working, but there is a confusing bug where PHP claims that str_ends_with() does not exist even though PHP_VERSION reflects 8.1.

Planning to continue this work in the morning.

brandonpayton commented 6 months ago

After merging #1331 and the fix #1333, Nginx is now serving most Playground files directly without involving PHP. The 429s have disappeared, and I am seeing more edge cache HITs.

The last thing AFAIK is to get the logger working. This should be straightforward. The platform has an API for setting and retrieving secrets, and we can use the same custom-redirects-lib.php file to notice requests to logger.php and set secret-related environment vars ahead of time (so logger.php doesn't need to change). I am planning to finish that in the morning.

To see the current state of the site, check out: https://playground-dot-wordpress-dot-net.atomicsites.blog/

brandonpayton commented 6 months ago

#1337 wired up the error logger on the new WP Cloud site.

A few things left:

The only possible MUST before switching the playground.wordpress.net domain over to WP Cloud is the wordpress.html redirect. We probably also want to do more thorough testing before making the switch.

adamziel commented 6 months ago

Redirects to / for /wordpress.html requests with wordpress.org "Referer" are not working properly. This was derived from our existing htaccess file. @adamziel is this still needed?

@brandonpayton we’d have to patch and re-deploy this line (and potentially another one in the same repo):

https://github.com/WordPress/wporg-wasm/blob/ce5c76146eb46578a8a80239cfb2f05cd7ac7dfe/source/wp-content/themes/wporg-wasm/src/wasm-demo/src/components/playground.js#L14

brandonpayton commented 6 months ago

@brandonpayton we’d have to patch and re-deploy this line (and potentially another one in the same repo):

@adamziel, I think we should be able to fix the redirect on our side as well. It's just broken at the moment. I can take a quick look at fixing the redirect, and if that doesn't work, we can adjust the behavior under wporg-wasm.

adamziel commented 6 months ago

@brandonpayton cool! FYI that code powers the Playground demo embedded on https://w.org/playground.

brandonpayton commented 6 months ago

@adamziel, thank you for the tip about how the redirect is used by https://w.org/playground.

I fixed the redirect so that it works. There is a secondary bug where the redirect may not happen when edge caching has already cached a non-redirecting /wordpress.html response, and that is being fixed under #1351.

It looks like https://w.org/playground is embedding an iframe referring to wasm.wordpress.net. Is there any difference between wasm.wordpress.net and playground.wordpress.net?

adamziel commented 6 months ago

Is there any difference between wasm.wordpress.net and playground.wordpress.net?

both point to the same server, that redirect is the only difference

brandonpayton commented 6 months ago

Here is what I believe remains before switching to the new site:

@WordPress/playground-maintainers any other thoughts on this?

brandonpayton commented 6 months ago

Deduplication of MIME type mappings is done. I also discovered oauth.php used secrets and made an update for that.

What is left is:

brandonpayton commented 6 months ago

@adamziel, I updated the secrets but am running into an issue testing GitHub Export with https://playground-dot-wordpress-dot-net.atomicsites.blog.

Trying to Connect to GitHub redirects to the following URL: https://playground.wordpress.net/?error=redirect_uri_mismatch&error_description=The+redirect_uri+MUST+match+the+registered+callback+URL+for+this+application.&error_uri=https%3A%2F%2Fdocs.github.com%2Fapps%2Fmanaging-oauth-apps%2Ftroubleshooting-authorization-request-errors%2F%23redirect-uri-mismatch

Is it possible to update the app settings to allow redirecting to the test site domain?

brandonpayton commented 5 months ago

We migrated the website to WP Cloud today and this can be closed.