Brain-WP / Cortex

Routing system for WordPress
MIT License
347 stars 20 forks source link

Routes "players" and "players/" are treated as same url #27

Closed dima-stefantsov closed 1 year ago

dima-stefantsov commented 3 years ago

Hi. I'm using "brain/cortex": "*@dev" == "version": "dev-refactoring-fastroute" Works fine so far, except I've found it treats route "/players/" exactly the same as "/players", meaning my application will have duplicate urls which is bad for SEO and possibly for static caching.

I've found that in your code you are using trim(path, /). I've also found that unlike your example on frontpage

$routes->addRoute(new QueryRoute(
        '^post/latest$',

route doesn't support regex, just plain text. Please correct me if I'm wrong, that could solve my issue.

I've tried to edit few pieces of your code, replaced trims with ltrim, with some relative success. I don't like editing /vendor/ anyway, and for you it could be a breaking change.

I've also found that I could manually do it on mod_rewrite level, like

# have to redirect just my custom urls, must not redirect wordpress urls, permalinks do end with /
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/test.*/$ [OR]
RewriteCond %{REQUEST_URI} ^/player/.*/$ [OR]
RewriteCond %{REQUEST_URI} ^/players/$ [OR]
RewriteRule (.*)/ $1 [R,L]

that's my way to go at the moment, but could you please provide a way to do it properly, with code, not with hacks? My common sense tells me it must be easy to just force exact match, why is your framework treats technically different urls like it's the same without a way to distinguish.

Thanks.

P.S.: also I'm never using and not planning to use routes system to query WP items, just building a system with custom code and custom mysql queries, which runs alongside wordpress. Using ActionRoute() and just failed to add RedirectRoute because of issue described above, 'players' and 'players/' are treated as same route. Only wordpress thing I'm using is it's templating, rendering my output between get_header() get_footer(). Maybe using nikic/fast-route directly is the way to go for me?

dima-stefantsov commented 3 years ago

How I'm using it right now:

$routes = [
    [
        'route' => 'test/{variable}',
        'controller' => 'dsr_controller_test',
        'template' => 'test/index',
    ],
    [
        'route' => 'player/{player_id}',
        'controller' => 'dsr_controller_player',
        'template' => 'player/index',
    ],
    [
        'route' => 'players',
        'controller' => 'dsr_controller_players',
        'template' => 'players/index',
    ],
];

foreach ($routes as $route) {
    $cortex_routes->addRoute(new ActionRoute(
        $route['route'],
        function(array $matches, \WP $wp) use ($route) {
            $wp->query_vars = [];
            $route['controller']($matches);
            return false;
        },
        ['template' => DSR_PLUGIN_DIR.'/views/'.$route['template'].'.php']
    ));
}
gmazzap commented 3 years ago

Hi @dima-stefantsov, yes Cortex removes trailing slashes at the end of the path before attempting to match the request.

Cortex is a package for WordPress, and in WordPress you can't have /players and /players/ as separate destinations.

What WordPress does is to calculate the "canonical URL" and then redirects to it. That's a quite slow process.

Calculating the canonical URL and put it in the <link rel="canonical"> in the HTML (see https://moz.com/learn/seo/canonicalization) is totally fine by itself. For example, for SEO purposes you'll be very fine with just the canonical link without the need of also doing the redirect.

Note that having the canonical URL in is anyway a good idea, that for example prevents that /players and /players?x (where "x" is any query string) are considered duplicated content.

If you're implementing full-page cache using e.g. Varnish, you're correct that you'll end up in two copies of the same page, but that will happen anyway. In fact, if you'd redirect from PHP, Varnish will cache both /players and /players/, but one of them would be a cached redirect response. Honestly, assuming canonical URL in HTML is implemented, I would let Varnish keep two copies of the page instead of a "regular" copy + a cached redirect, that will not make much difference, and thanks to canonical URL it will cause no other issues.

Implementing the redirect from mod rewrite is an alternative, if you really want it. Varnish will keep a cached content + a cached redirect. You don't need to add a new redirect rule for each route, you can enter the single rule:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]

to redirect anything without trailing slash to the same URL with trailing slash.

Not taking into account full page cache, which for example will not be there for logged-in users, mod rewrite redirect will be more performant not having to load PHP and bootstrap WordPress at all. Actually, having to load the bulk of WordPress files to do a redirect makes no sense anyway: redirect from PHP should only be used if the page to redirect to has to be calculated via PHP. For someting as predictable as forcing the trailing slash, redirect at webserver level should be the preferred choice in any case (regardless Cortex or not).