WordPress / wordpress-playground

Run WordPress in the browser via WebAssembly PHP
https://w.org/playground/
GNU General Public License v2.0

Rewrite URLs in imported WXR files to avoid broken navigation links (white screen, errors, nested Playground) #1780

Open bph opened 1 month ago

bph commented 1 month ago

On this Playground site, I get intermittent success when using the navigation menu. One page load works; subsequent page loads show a white screen. The content all works when I go to WP-admin > Pages and use the View link of each page. But sometimes one link works and then the next one doesn't.

The content and blueprint can be viewed in this repo.

Here is a video of my clicking around on the site.

https://github.com/user-attachments/assets/07cc8bf7-fdb2-41e6-ad0b-a7e58476810a

bph commented 1 month ago

When I navigate around the site without using the links in the navigation block, the pages load consistently: archive page, author page, single posts, etc. When I click again on one of the links in the navigation bar, I get a white screen once more.

https://github.com/user-attachments/assets/bb08f670-f9d4-4258-8ec2-9bc94144e553

At the end of the video, you see me click on the Home link in the navigation; it actually loads another Playground instance into the site.

Screenshot 2024-09-18 at 14 52 51

bgrgicak commented 1 month ago

I'm wondering if this could be related to #349.

This seems like a caching bug to me. Playground is trying to load the page from cache while it should call PHP.

Screenshot 2024-09-18 at 14 14 58

bph commented 1 month ago

Thank you @adamziel for setting me straight... Glad it was so easy to transfer.

bph commented 1 month ago

The default instance of playground uses a URL like https://playground.wordpress.net/scope:0.5198681762892301/?page_id=2 (TT4, Sample page in Header)

On my site it only has the URL https://playground.wordpress.net/about-us

Is there a way for me to modify the URL in the Navigation space of my .xml file from relative links like <!-- wp:navigation-link {"label":"About Us","type":"page","description":"","id":28,"url":"/about-us/","kind":"post-type"} /--> to something like https://playground.wordpress.net/scope:{somestring}/about-us? The line of code is in the XML import, which I modified to remove the original site's absolute links and show only relative links.
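For illustration, the rewrite asked about here could be sketched in Python as below. This is only a sketch under stated assumptions: the function name and the scope value are hypothetical, the scope is generated per Playground instance and is not known when the .xml file is written, and nothing here is part of Playground or the WXR importer.

```python
import json
import re

# Hypothetical scope prefix; the real value is generated per Playground instance.
SCOPE_PREFIX = "/scope:0.5198681762892301"

# Matches self-closing navigation-link block comments and captures the JSON attributes.
NAV_LINK = re.compile(r"<!-- wp:navigation-link (\{.*?\}) /-->")


def add_scope_to_nav_links(markup: str, scope_prefix: str = SCOPE_PREFIX) -> str:
    """Prefix root-relative `url` attributes of navigation-link blocks with a scope."""

    def rewrite(match: re.Match) -> str:
        attrs = json.loads(match.group(1))
        url = attrs.get("url", "")
        # Only touch root-relative URLs that are not protocol-relative or already scoped.
        if url.startswith("/") and not url.startswith(("//", "/scope:")):
            attrs["url"] = scope_prefix + url
        compact = json.dumps(attrs, separators=(",", ":"), ensure_ascii=False)
        return f"<!-- wp:navigation-link {compact} /-->"

    return NAV_LINK.sub(rewrite, markup)
```

Running it over the navigation-link above with a scope of `/scope:{somestring}` would turn `"url":"/about-us/"` into `"url":"/scope:{somestring}/about-us/"` and leave every other attribute as it was.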

bgrgicak commented 1 month ago

This is definitely related to scope.

After the first load, the page is /. When you click on a page like /patterns/ the referer (/) has scope so we avoid caching. When you click on /news/ the referer (/patterns/) doesn't have scope and it goes to cache.

I think that there is an underlying problem because / gets a scope when used as a referer, while /patterns/ doesn't.
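The behavior described above can be modeled with a few lines of Python. This is a simplified model of the description in this comment, not Playground's actual routing code; `is_scoped` and `handled_by_php` are hypothetical names.

```python
from urllib.parse import urlsplit


def is_scoped(url: str) -> bool:
    """True when the first path segment looks like `scope:<id>`."""
    segments = urlsplit(url).path.split("/")
    return len(segments) > 1 and segments[1].startswith("scope:")


def handled_by_php(request_url: str, referer: str) -> bool:
    """Assumed rule: a request reaches PHP if it, or its referer, carries a scope."""
    return is_scoped(request_url) or is_scoped(referer)


ORIGIN = "https://playground.wordpress.net"
HOME = ORIGIN + "/scope:0.5198681762892301/"

# First click: the referer (the scoped home page) rescues the unscoped link.
first_click = handled_by_php(ORIGIN + "/patterns/", referer=HOME)
# Second click: neither the request nor the referer has a scope, so it goes to cache.
second_click = handled_by_php(ORIGIN + "/news/", referer=ORIGIN + "/patterns/")
```

Under this model `first_click` is true and `second_click` is false, which matches one page load working and the next one showing a white screen.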

bgrgicak commented 1 month ago

Is there a way for me to modify the URL in the Navigation space of my .xml file from relative links

To something like https://playground.wordpress.net/scope:{somestring}/about-us? The line of code is in the XML import, that I modified to remove the original site's absolute links to show only relative links.

Great research @bph! You are right about the root cause being imported URLs that aren't rewritten.

I'm not sure what's the best way to address this and will need to work with @adamziel and @brandonpayton on finding possible next steps.

bgrgicak commented 1 month ago

It looks like we are attempting to add the scope to the URL if it doesn't exist but that scope isn't used later by the browser or our code (I still don't know).

bgrgicak commented 1 month ago

I see a few directions here, but I'm not sure what to do.

bgrgicak commented 1 month ago

I'm moving this to blocked until I get some feedback from @WordPress/playground-maintainers.

bph commented 1 month ago

@bgrgicak thank you so much for pushing this forward.

This is actually also a problem when migrating sites to other servers, as absolute links need a search/replace pass. If Playground could do it out of the box, there wouldn't be a need for me to modify the original site export file for images and links, and two sections of my tutorial could be cut. 🤔

Seems you have enough information to tackle this. I just want to mention that this is not only a hiccup with the navigation block; it also happens with normal on-page links, as can be seen on the Templates page. Those also don't work. The string of the link is:

<li>a <a href="/page-no-title/" data-type="page" data-id="192">page  no title template</a> that allows for a Hero image or a Cover block directly on the top of the page. </li>
<!-- /wp:list-item -->

Screenshot 2024-09-20 at 11 39 49

adamziel commented 1 month ago

@bph A proper resolution will take a few months. Is there a way you could ship that block without an absolute URL in the href=""? Maybe a relative one would work? Or maybe the block could handle only having a page ID?

Longer answer:

The imported WXR file contains this code:

<!-- wp:list-item -->
<li>a <a href="/page-no-title/" data-type="page" data-id="192">page  no title template</a> that allows for a Hero image or a Cover block directly on the top of the page. </li>
<!-- /wp:list-item -->

This markup is not rewritten by the WXR importer we're currently using. I'm not aware of a tool we could use in Playground that could handle it correctly today. I'm planning to fork/build a WXR importer and bake in the URL rewriting using the plumbing we've been exploring for the past year [1] [2]. Once it matures, I'll want to propose it for WordPress core.

[1] https://github.com/adamziel/site-transfer-protocol [2] https://github.com/adamziel/wxr-normalize/pull/1
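To make the missing step concrete, here is a rough Python sketch of rewriting root-relative `href` attributes in imported markup. It is not the approach used in [1] or [2]; the function name and the `base_path` parameter are hypothetical, and a regular expression is a shortcut that a real importer would replace with proper HTML and block-markup parsing.

```python
import re

# href attributes whose value starts with a single "/" (root-relative, not "//host").
HREF = re.compile(r"""(\bhref=)(["'])(/(?!/)[^"']*)\2""")


def rewrite_root_relative_hrefs(html: str, base_path: str) -> str:
    """Prefix root-relative href values with base_path, e.g. a scope segment."""
    base = base_path.rstrip("/")

    def rewrite(match: re.Match) -> str:
        attr, quote, path = match.groups()
        if path.startswith(base + "/"):
            return match.group(0)  # already rewritten, leave untouched
        return f"{attr}{quote}{base}{path}{quote}"

    return HREF.sub(rewrite, html)
```

Applied to the list item above with a `base_path` of `/scope:{somestring}`, the link would become `href="/scope:{somestring}/page-no-title/"`, while absolute and protocol-relative URLs stay unchanged.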

bph commented 1 month ago

@adamziel thanks for looking into this again.

I am a bit confused as to what you see as an absolute link and a relative link.

"Maybe a relative one would work?" Isn't <a href="/page-no-title/" a relative link? An absolute link would be something like https://wordpress67.local/page-no-title
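A short Python check of how these links resolve may help here. It uses the standard library's `urljoin`, which follows the same resolution rules browsers apply; the `/patterns/` page under the scope is an assumed example.

```python
from urllib.parse import urljoin

# Assumed current page inside a scoped Playground site.
page = "https://playground.wordpress.net/scope:0.5198681762892301/patterns/"

# Root-relative: resolved against the origin, so the scope segment is dropped.
root_relative = urljoin(page, "/page-no-title/")
# -> https://playground.wordpress.net/page-no-title/

# Path-relative: resolved against the current directory, so the scope survives.
path_relative = urljoin(page, "../page-no-title/")
# -> https://playground.wordpress.net/scope:0.5198681762892301/page-no-title/

# Absolute: used as-is.
absolute = urljoin(page, "https://wordpress67.local/page-no-title")
# -> https://wordpress67.local/page-no-title
```

So `/page-no-title/` is relative to the host but absolute in its path, which is why it loses the scope even though it contains no domain.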

bph commented 1 month ago

So for the header navigation, the examples of how the theme Twenty-Twenty-Four works out of the box got me thinking.

If I added all the pages and was deliberate with the page parent selection, the theme's default navigation would probably work with the page list, create the submenus, and apply some voodoo that is built into it. (voodoo = not entirely clear how it works)

So with the v2 blueprint and v2 content, I was able to get this part working.

https://github.com/user-attachments/assets/b369d468-a4ef-4b7f-b5e1-4728cfe2d438

In the video you can see that all links from the top navigation have a scope assigned and load pages from a virtual (or whatever you want to call it) directory. It works because I didn't create a custom navigation block; the automation built into WordPress takes care of it. But it seems Playground already rewrites links and adds the scope to the URLs.

Next steps: Before the next upload