Closed shcheklein closed 4 years ago
separate file with a list of simple expressions...
And maybe add a note with the date the redirect was added (can also be checked with GitHub Blame but it's not always straightforward) for periodic cleaning up of this list.
Another option is to use DNS for this, not the app.
@jorgeorpinel I don't think those should be cleaned up unless they clash with something.
Another thing - do not forget to use HTTP code 301 for moved things, not 302.
Actually @Suor we are having some trouble with some 301 "permanent" redirects that we now want to change again... They're cached by the browser and can't be changed again by us until that cache expires a long time later. 302 Found may be better for us since we are not so sure they are so permanent. See https://github.com/iterative/dvc.org/issues/768#issuecomment-549456704
I think they are still 301s @jorgeorpinel . Even though we change the end location we don't change the fact of the redirect. So, unless we don't find a good solution for caching, I would keep it 301. But I haven't googled best practices around dealing with cache yet.
Ok, reading the link now in that comment :)
If no-cache
policy does not work probably we should indeed use 302s, but I would google about the consequences of this for search engines and whatnot.
Well, they both have an impact. 301 is better in general but we need to be sure the redirect is actually permanent and since we're in an active development phase... I think we should not use 301s as much. Many of our redirects are temporary, which is what 302 is for. As for SEO effects:
302 redirect informs search engines to continue indexing old pages. It may impact importantly your SEO results. When you are optimizing the new page URL, make certain to be vigilant about not having duplicate content on both old and new pages at a similar time. The temporary redirect is very useful when you bring your website visitors back to the old page after some time.
https://www.firstrankseoservices.com/blog/effects-of-301-and-302-redirect-on-seo/
Almost all our redirects are 301s in nature. We do not plan to bring your website visitors back to the old page after some time.
. The only thing that is potentially not permanent is the URL we redirect to (not the fact of redirect). And probably the right way to handle this actually is to keep old 301s and add new ones (making a chain of 301s effectively).
Either we do that, or we can try no-cache
as a hack.
Looks like 302s is not a good option in terms of SEO.
p.s we don't HAVE to use 302 Found
, there's also 303 See Other
, and 307 Temporary Redirect
.
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#3xx_Redirection
Full list of current redirects in server.js (from https://github.com/iterative/dvc.org/issues/768#issuecomment-549494662) and their motivation:
Enforce https protocol and remove www from host
- SEOman.dvc.org/{cmd} -> dvc.org/doc/command-reference/{cmd}
- Route design
replace - for / in {cmd} except for /get-url, /import-url
- Backward compatibility{code/data/remote}.dvc.org -> corresponding S3 bucket
- Route designpath /(deb|rpm) -> corresponding S3 bucket
- Route designpath /(help|chat) -> Discord chat invite
- Route designpath /doc/commands-reference... -> /doc/command-reference...
- Backward compatibilitypath /doc/tutorial/... -> /doc/tutorials/deep/...
- Backward compatibilitypath /doc/tutorial -> /doc/tutorials
- Backward compatibilitypath /doc/use-cases/data-and-model-files-versioning -> /doc/use-cases/versioning-data-and-model-files
- Backward compatibility** - Backward compatibility (also removes trailing
/`)p.p.s. also there's a new 308 Permanent Redirect code in HTTP 1.1 to replace 301, and is now supposedly supported by most major browsers that have significant market share. https://tools.ietf.org/html/rfc7538#page-3
I would say we can use 308 for "structural" redirects - man.dvc.org to dvc.org/man - pages that were never indexed in the first place.
For real pages (like tutorial or something) we should probably keep using 301 to be safe, to properly transfer ranking from one page to another. And actually never change the redirect itself, introduce a new one if needed.
Again, this is from the top of my head - I haven't had time to google and confirm all these points. Point is we should think about SEO, about caching, etc carefully.
Another option is to use DNS for this, not the app.
We never discussed this option. Wouldn't it that be easier for several of the redirects (listed in https://github.com/iterative/dvc.org/issues/757#issuecomment-549611289 above).
@jorgeorpinel what DNS options do you specifically mean?
Hmm I guess DNS records won't help us much, now that I think about it. There are some DNS-based/related services e.g. http://redirect.center/ and http://redirect.name/ which could help us implement some basic redirects:
{code/data/remote}.dvc.org -> corresponding S3 bucket
But maybe not the ones based on app routes (URL paths). So maybe this is not worth it, just for 2 cases. Plus we may not want to rely on some external service.
Basically... Never mind about DNS options.
Another option to actual HTTP redirects is conditional rendering logic in the server side components combined with client-side shallow routing as explained in https://sergiodxa.com/articles/redirects-in-next-the-good-way/ We'd need to see how difficult the refactoring is, and make sure the code doesn't result confusing, but this has the advantage that no HTTP redirects would ever be issued. Maybe we would even stop needing a custom server script (which disables automatic static optimization).
HTTP redirects would ever be issued
I think this is not the goal. we should issue 301 for the proper search engines handling and when you do use tools like curl
to fetch the page. Client-side rendering alone is not enough.
OK, well, as Next.js doesn't yet feature HTTP redirects the only way to do it is with a custom server script like we already have (/server.js) (which disables automatic static optimization unfortunately).
What I think we can do to improve this is to extract all the redirect logic to a separate module that loads the regex and targets from a JSON in order, and tests them all. That way we only ever change the JSON file in the future, which is kind of configuration instead of code (it could even live outside the Git repo).
extract all the redirect logic to a separate module that loads the regex and targets from a JSON in order, and tests them all
yes! that was the idea of this ticket
OK got it, great. Actually though, yet another option I'd like to leave here for the record is to write and deploy a completely separate proxy server that exclusively does redirects, and that fetches our regular node app when there's no special redirect.
I'd still vote for the separate module at this point.
p.s we don't HAVE to use 302 Found, there's also 303 See Other, and 307 Temporary Redirect.
On this, I vote for 303 See Other as discussed in https://github.com/iterative/dvc.org/pull/875#pullrequestreview-341328648.
yep. 303 should be fine, not ideal as well but fine.
Every time we move a page to a new URL we need to create a proper redirect.
The only way to this now is to create a 4-5 lines entry in our custom
server.js
. This is def. not scalable. We need to find an easier way to handle this.The first solution that comes to my mind is a separate file with a list of simple expressions, may be converter functions.