bobbingwide / dupes

Cater for duplicate slugs in WordPress content
GNU General Public License v2.0

Background: Refer to the Requirements for permalink /%postname%/ blog post and page on herbmiller.me #3

Open bobbingwide opened 2 years ago

bobbingwide commented 2 years ago

Anyone who's been pointed to this repository should be aware of my Requirements for permalink /%postname%/, which are documented at https://herbmiller.me/requirements-for-permalink-postname/.

Clicking on the above link takes you to a page that has the same post name as the post that actually contains the Requirements (and other documentation).

This is an example of the original problem as stated in WordPress Trac ticket #13459.

Further investigations revealed that attempting to access the post by its post ID also fails when there's a duplicate.

https://herbmiller.me/?p=21123

The tests that have been developed for #1 and #2 fail when they test these two scenarios.

Other tests need to be developed for other failing scenarios, such as the scenario for attachments and the scenario where the duplicate content has a different post status.

There should also be many tests that pass, both now and once the fixes for Trac ticket 13459 have been developed.

bobbingwide commented 2 years ago

I'm still trying to understand the logic associated with $use_verbose_page_rules.

Otto (Samuel Wood) wrote about it on his blog: http://ottopress.com/2011/how-the-postname-permalinks-in-wordpress-3-3-work/. He explains that prior to WordPress 3.3 each individual page had its own entry in the rewrite rules, but there was a performance issue that needed fixing.

The fix introduced the get_page_by_path() call, which takes the given URL and attempts to match it to a page or attachment.

In an earlier post - http://ottopress.com/2010/category-in-permalinks-considered-harmful/ - Otto mentions a couple of bad permalink structures: %category%/%postname% and /%postname%/.

He then explains that, given the way WordPress worked in those days (2.5 to 3.2), a permalink structure that would match any character could trump a request to access a page.

So the use_verbose_page_rules logic was added, and this caused a major performance problem for sites with hundreds of pages.
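To make the cost of per-page rules concrete, here is a minimal Python sketch. It is a toy model, not WordPress code: `verbose_page_rules` and `match_first` are hypothetical names, and the only assumption taken from Otto's description is that each page gets its own rewrite entry.

```python
import re


def verbose_page_rules(page_slugs):
    """Build one rewrite rule per page (hypothetical model of verbose page rules)."""
    return [(re.compile(rf"^{re.escape(slug)}/?$"), slug) for slug in page_slugs]


def match_first(rules, path):
    """Return the target of the first rule that matches, scanning rules in order."""
    for pattern, target in rules:
        if pattern.match(path):
            return target
    return None


# Example slugs are made up for illustration.
rules = verbose_page_rules(["about", "contact", "requirements"])
```

In this model the rule list grows by one entry for every page, and a request that matches no page is tested against every entry before it can fall through to anything else. That is consistent with the performance problem reported for sites with hundreds of pages.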

There's nothing about accessing duplicate posts. I believe this is because the previous changes were to enable pages to be accessed rather than posts. This happened 11 years ago when pages, Custom Post Types and Custom Taxonomies were fairly new.

bobbingwide commented 2 years ago

"There's nothing about accessing duplicate posts."

Well, not in the blog post itself. But there are questions about duplicate slugs in the comments. The replies to these questions included the following assertions.


2011/12/30

The rules are varied.

Best to just not use all-numeric slugs.


2012/05/03

If you only use the %postname% as the custom slug, then the code we added into 3.3 takes care of this. See, the page_rewrite rules are in front of the post_rewrite rules. So it will first match as a Page because the URL pattern matches. So any URL with just the top-level will first match against Pages and have pagename set. The code in rewrite.php then uses the get_page_by_path function to actually check to see if that is a real Page. If the Page exists, then pagename is set and the query happens as normal. If the Page does not exist, then the rewrite match is thrown away and it continues on with the normal Post rewrite rules.
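The resolution order described in that reply can be sketched in Python. This is a simplified model and not the code in rewrite.php: `Content`, `find_page_by_path`, `find_post_by_name` and `resolve` are hypothetical names, `find_page_by_path` only stands in for the get_page_by_path check, and the IDs and slugs are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Content:
    id: int
    post_type: str  # "page" or "post"
    slug: str


def find_page_by_path(items, path):
    """Stand-in for the get_page_by_path check: the page with this slug, if any."""
    for item in items:
        if item.post_type == "page" and item.slug == path:
            return item
    return None


def find_post_by_name(items, name):
    """Stand-in for the normal post rewrite rules: the post with this slug, if any."""
    for item in items:
        if item.post_type == "post" and item.slug == name:
            return item
    return None


def resolve(items, url_path):
    """Try the page match first; only fall through to posts if no page exists."""
    slug = url_path.strip("/")
    page = find_page_by_path(items, slug)
    if page is not None:
        return page  # pagename is set and the query proceeds as a page
    return find_post_by_name(items, slug)  # page match thrown away, post rules apply


# A post and a page that share the slug "requirements", plus an unrelated post.
site = [
    Content(1, "post", "requirements"),
    Content(2, "page", "requirements"),
    Content(3, "post", "hello-world"),
]
```

Under these assumptions, `resolve(site, "/requirements/")` returns the page (id 2), so the post with the same slug is never reached through that URL, while `resolve(site, "/hello-world/")` falls through to the post. That matches the behaviour seen with the herbmiller.me link at the top of this issue.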