mkdocs / mkdocs-redirects

Open source plugin for MkDocs page redirects
MIT License
176 stars, 25 forks

Allow to use regular expressions for redirects #29

Open djuarezg opened 3 years ago

djuarezg commented 3 years ago

Right now it only allows a one-to-one correspondence. When migrating documentation, you may find that externally referenced links end up broken.

If this plugin allowed redirects to be specified with regular expressions, it would remove the need for any external redirection, such as one you would otherwise set up yourself with nginx.

Andre601 commented 2 years ago

What I personally would like is support for wildcard patterns, to allow redirects of entire sections, similar to what Cloudflare allows in their redirect rules.

The common use case is renaming an entire section, which leaves X files at a new location. You then need to either set up X individual redirects or, with wildcard support, cover all of those pages with a single redirect.

Example setup:

plugins:
- redirects:
    redirect_maps:
      'old/*': 'new/$1' # The $1 would be the first wildcard in the path

With the above, old/page1 would redirect to new/page1, old/page2 to new/page2, and so on.
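To make the intended semantics concrete, here is a minimal Python sketch of how such a wildcard rule could be matched against a path. This is only an illustration of the proposed syntax: mkdocs-redirects does not support it as of this thread, and the helper names wildcard_to_regex and apply_rule are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Compile a wildcard pattern such as 'old/*' into an anchored regex.

    Each '*' becomes a capture group; everything else is matched literally.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.*)".join(literal_parts) + "$")


def apply_rule(pattern, target, path):
    """Return the redirect target for `path`, or None if the rule does not match."""
    match = wildcard_to_regex(pattern).match(path)
    if match is None:
        return None
    # Substitute $1, $2, ... in the target with the captured wildcard text.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), target)


print(apply_rule("old/*", "new/$1", "old/page1"))
print(apply_rule("old/*", "new/$1", "other/page1"))
```

A path that does not match the pattern yields None, so unrelated pages would be left alone.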

If such a feature is already available, it should be mentioned in the README.

csantanapr commented 2 years ago

We use this plugin for our website https://knative.dev and would like to see this feature.

From one of our contributors:

I’m working on a PR to move all code samples into one directory. Because of this, the URLs will change from ./eventing/samples/... to ./samples/eventing/... . Is there a way to do a redirect for all pages in the directory using one line, instead of doing them individually? On other projects I’ve done this with regex, e.g.: eventing/samples/(.*): samples/eventing/$1

JMuff22 commented 1 year ago

I'd also be interested in this.

glenn-jocher commented 10 months ago

+1 for our docs at https://docs.ultralytics.com

oprypin commented 10 months ago

This is impossible to implement. MkDocs just writes a bunch of HTML files, that's it.

In order to redirect 'old/*': 'new/$1', an infinite number of pages need to be generated - pages named old/a.html, old/b.html, old/c.html, ..., old/zzzzzz[...]zzzz.html.


Although if we think about it in the opposite direction, 'samples/eventing/$1': 'eventing/samples/(.*)', then maybe there's something to it.

For every page eventing/samples/* that currently exists, create a redirect-from page samples/eventing/$1.
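A rough Python sketch of that reverse expansion, under the assumption that the list of existing page paths is available at build time. The function expand_redirects and the sample paths are hypothetical and not part of the plugin.

```python
import re


def expand_redirects(redirect_from, existing_pattern, existing_pages):
    """Expand one pattern rule into explicit one-to-one redirects.

    `existing_pattern` is a regex matched against pages that exist now;
    `redirect_from` is the redirect-from path template using $1, $2, ...
    Returns a dict mapping each redirect-from path to the existing page.
    """
    regex = re.compile(existing_pattern)
    redirects = {}
    for page in existing_pages:
        match = regex.fullmatch(page)
        if match is None:
            continue  # this page is not covered by the rule
        source = re.sub(
            r"\$(\d+)", lambda m: match.group(int(m.group(1))), redirect_from
        )
        redirects[source] = page
    return redirects


pages = ["index.md", "eventing/samples/kafka.md", "eventing/samples/ping/README.md"]
for source, destination in expand_redirects(
    "samples/eventing/$1", r"eventing/samples/(.*)", pages
).items():
    print(source, "->", destination)
```

Because the set of existing pages is finite, this yields a finite set of redirect stubs, and the result has the same one-to-one shape as the redirect_maps setting shown earlier.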

Andre601 commented 10 months ago

What about using JS? I think this could work for dynamic redirects.

thesuperzapper commented 10 months ago

Couldn't we use a script on the 404.html template, to perform the redirect using JavaScript?

EDIT: To prevent this from impacting SEO, we might need to ensure that pages we have a redirect for do not return an HTTP 404 status.

EDIT 2: this will clearly only work for paths that do not have corresponding markdown pages (because those paths would never hit 404), but we can just warn people in the docs about that.

Andre601 commented 10 months ago

Couldn't we use a script on the 404.html template, to perform the redirect using JavaScript?

The issue I see here is hosts that don't use the 404 page. An example is Codeberg Pages. Their system would serve its own 404 page instead of the 404.html from MkDocs.

thesuperzapper commented 10 months ago

Yeah, I was just investigating on the GitHub Pages side (which probably hosts the vast majority of public MkDocs websites), and it works by simply serving the 404.html page for any unknown path (and it will always return a 404 HTTP status for it).

It's not clear to me whether client-side redirects on 404 pages are respected in the same way as 3xx codes (for Google's SEO purposes), but it's probably the best option we have to allow complex, regex-based redirects.

thesuperzapper commented 10 months ago

Looking into it more, it seems that even the pages generated by the existing mkdocs-redirects are (from Google's perspective) returning the wrong status code for a redirect, specifically a 200 (success) code.

Currently, we are avoiding the "duplicate content" issue by setting <link rel="canonical" href="../target-of-redirect/"> on the redirect page. It seems that Google does allow injecting canonical tags dynamically with JavaScript, but I am slightly worried about the impact of setting a canonical tag on a page that returns a 404 status.

EDIT: I have found this reference page from Google about how it treats various kinds of redirects. It says that Google will treat JavaScript location redirects as if they were 301 status. So I think we should be OK to generate a 404.html that uses JavaScript to do client-side regex redirects, as long as we:

  1. Dynamically generate the canonical link tag that matches our redirect target
  2. Redirect by replacing the location using JavaScript
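
The two steps above could be sketched roughly as follows: a Python helper that renders the inline script a generated 404.html might contain. Everything here is an assumption for illustration only. The name render_redirect_script, the __RULES__ placeholder, and the rule format (a JavaScript regex on the URL path mapped to a target using $1) are hypothetical, the site is assumed to be served from the domain root, and whether search engines honor the result has not been verified.

```python
import json

# Hypothetical template; __RULES__ is replaced with a JSON array of
# [pattern, target] pairs before the script is written into 404.html.
SCRIPT_TEMPLATE = """<script>
  const rules = __RULES__;
  const path = window.location.pathname;
  for (const [pattern, target] of rules) {
    const regex = new RegExp(pattern);
    if (!regex.test(path)) continue;
    // String.prototype.replace expands $1, $2, ... in the target.
    const destination = path.replace(regex, target);
    // Step 1: add a canonical link that matches the redirect target.
    const link = document.createElement("link");
    link.rel = "canonical";
    link.href = new URL(destination, window.location.origin).href;
    document.head.appendChild(link);
    // Step 2: redirect by replacing the current location.
    window.location.replace(destination);
    break;
  }
</script>"""


def render_redirect_script(rules):
    """Render the inline script for a dict of {path regex: target} rules."""
    pairs = [[pattern, target] for pattern, target in rules.items()]
    # Escape "<" so that no rule can close the <script> element early.
    rules_json = json.dumps(pairs).replace("<", "\\u003c")
    return SCRIPT_TEMPLATE.replace("__RULES__", rules_json)


print(render_redirect_script({"^/eventing/samples/(.*)$": "/samples/eventing/$1"}))
```

As noted earlier in the thread, this would only help on hosts that actually serve the site's own 404.html for unknown paths.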