godotengine / godot-docs

Godot Engine official documentation
https://docs.godotengine.org

Auto-redirect to latest docs version doesn't work for some legacy branch tutorials which were renamed or removed #5591

Open tavurth opened 2 years ago

tavurth commented 2 years ago

Your Godot version: 3.4

Issue description: When reaching the docs through Google, the result is often a page for an outdated version, which shows a notice linking to the newer version. That is fine, but clicking the link does not take the browser to the correct page:

[Screenshots: 2022-02-09 at 10:03:26 and 2022-02-09 at 10:03:39]

URL to the documentation page: https://docs.godotengine.org/en/2.1/learning/features/lighting/shadow_mapping.html

Should lead to

https://docs.godotengine.org/en/3.4/tutorials/3d/lights_and_shadows.html#shadow-mapping

Calinou commented 2 years ago

@mhilbrunner Are we still missing some redirects? I haven't been following this closely.

mhilbrunner commented 2 years ago

@Calinou I'll investigate. In theory, all/99% of old redirects should still work correctly as they were ported over. This may either be an edge case or something is wrong. I will look into it :)

tavurth commented 2 years ago

Since the broken target correctly returns a 404 status, we could also run a script over every page to check whether it has a redirect.

If it does, the script would fetch the redirect target and report every page whose target does not return a 200 status.

I see that the redirect is loaded after the page, which means it is probably not rendered server-side but injected by JavaScript at runtime. In that case the crawl would need a JS-capable scraper (either Selenium or one of the headless browsers from this list).
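A minimal sketch of the check described above, in Python. It sidesteps the JS rendering by assuming the notice link is simply the same path with the version segment swapped, which matches the broken example in this issue but is not confirmed for every page. The names `swap_version`, `http_status` and `find_broken_targets` are hypothetical helpers, not part of any existing tooling.

```python
from typing import Callable, Iterable, List, Tuple
from urllib.error import HTTPError, URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen


def swap_version(url: str, new_version: str) -> str:
    """Replace the version in /<lang>/<version>/<path> with new_version.

    Assumption: the notice on legacy pages links to the same path under the
    newer version. This is a guess based on the example in this issue.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # "/en/2.1/learning/..." splits to ["", "en", "2.1", "learning", ...]
    if len(segments) < 4:
        raise ValueError(f"unexpected docs URL: {url}")
    segments[2] = new_version
    return urlunsplit(parts._replace(path="/".join(segments)))


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a HEAD request, or 0 on a network failure."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:  # raised for 4xx/5xx responses
        return error.code
    except URLError:  # DNS failure, refused connection, timeout, ...
        return 0


def find_broken_targets(
    urls: Iterable[str],
    new_version: str,
    get_status: Callable[[str], int] = http_status,
) -> List[Tuple[str, str, int]]:
    """Return (legacy_url, target_url, status) for every non-200 target."""
    broken = []
    for url in urls:
        target = swap_version(url, new_version)
        status = get_status(target)
        if status != 200:
            broken.append((url, target, status))
    return broken
```

Calling `find_broken_targets(legacy_urls, "3.4")` with a list of legacy page URLs would return the pages whose assumed target does not answer with 200. Passing `get_status` explicitly keeps the logic testable without network access.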

tavurth commented 2 years ago

I was thinking about the easiest way to perform the above crawl, but when fetching https://docs.godotengine.org/sitemap.xml I also see that the sitemap is missing a lot of links.

The entries that are there do not have an associated sitemap of their own either, e.g. https://docs.godotengine.org/en/3.4/sitemap.xml returns 404.

Perhaps by filling out the sitemap more fully we could allow Google to better index the docs? Maybe I'm missing something and you have a more complete sitemap somewhere else that Google is set up to index?
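To make the sitemap gap concrete, here is a rough Python sketch that lists the `<loc>` entries of a sitemap document and reports which known URLs are absent from it. It assumes the file follows the standard sitemaps.org 0.9 namespace. `sitemap_locations` and `missing_from_sitemap` are hypothetical helper names, and the XML text is expected to have been downloaded separately.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

# Namespace used by both <urlset> and <sitemapindex> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locations(xml_text: str) -> List[str]:
    """Return every <loc> URL found in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def missing_from_sitemap(known_urls: Iterable[str], xml_text: str) -> List[str]:
    """Return the known URLs that the sitemap does not list, in input order."""
    listed = set(sitemap_locations(xml_text))
    return [url for url in known_urls if url not in listed]
```

Feeding it the text of https://docs.godotengine.org/sitemap.xml together with a list of pages gathered some other way (for example from the docs source tree) would show which pages the sitemap leaves out.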

YuriSizov commented 2 years ago

Google indexes the docs fine enough. The problem is that the outdated docs are not outdated for the version they are indexed for. We can't really tell Google to completely forget about them, because they are still valid for those who look for those specific versions. Sitemap tries to indicate which sub-pages should be considered with a priority, but there is only so much we can tell it for SEO.

Current problems are mostly because we did a big migration of the docs and are trying to fill the gaps with redirects, something RTD is not well suited for. Like @mhilbrunner mentioned, there is probably just a rogue link that was missed when we were setting up redirects.

tavurth commented 2 years ago

I recently found a couple of other pages, mostly related to shaders; I will post them here as I find them. That is why I suggest a more thorough approach with a script.

tavurth commented 2 years ago

Here's another one:

https://docs.godotengine.org/en/3.0/tutorials/3d/3d_performance_and_limitations.html

Calinou commented 2 years ago

> Here's another one:
>
> docs.godotengine.org/en/3.0/tutorials/3d/3d_performance_and_limitations.html

Indeed, it's replaced by the 3D rendering limitations page now: https://github.com/godotengine/godot-docs/pull/3482

tavurth commented 2 years ago

@Calinou In master, yes, but the link to the 3.4.x docs seems to be broken.

[Screenshot: 2022-03-08 at 20:32:16]

akien-mga commented 2 years ago

We've added redirects for pages that were moved between 3.2/3.3 and 3.4/4.0, but not for much older versions. Some of those tutorials have been completely removed or rewritten from scratch, and there's not necessarily a meaningful redirect to add. https://docs.godotengine.org/en/2.1/learning/features/lighting/shadow_mapping.html is from 2.1, which predates even the 3.0 refactoring of the docs that also changed most URLs.

We can try to add redirects for every single legacy version but that's going to be a significant amount of work to figure out all the changes between e.g. 2.1 and current 3.4.

Additionally, this "Note" is something added automatically by Read the Docs on all legacy branches. It's not added manually by us with a broken URL, it's auto-generated. We can disable it though if it's more harm than good.

tavurth commented 2 years ago

I think this notice is generally useful, since I want a quick redirect. However, Google still links to out-of-date versions so often that the notice link breaks for me maybe 50% of the time.

> It's not added manually by us with a broken URL, it's auto-generated. We can disable it though if it's more harm than good.

Perhaps the JS script could perform a prefetch to get the status of the page it's attempting to link to.

In that way if the page returns 404 we could show a message something like:

"This version of the docs you are viewing is outdated, and unfortunately there is no matching page for the newer documentation, however here are the search results for the stable branch which match this page"
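The fallback logic could look roughly like the sketch below. The real check would run as JavaScript in the browser; Python is used here only to model the decision. The search URL (`search.html` with a `q` parameter on the stable branch) and the idea of deriving the query from the page file name are assumptions, and `search_query_from_url` and `notice_link` are hypothetical names.

```python
from pathlib import PurePosixPath
from urllib.parse import urlencode, urlsplit

# Assumption: the stable docs expose the usual Sphinx search page.
SEARCH_BASE = "https://docs.godotengine.org/en/stable/search.html"


def search_query_from_url(url: str) -> str:
    """Turn the page file name into search terms, e.g. "shadow_mapping.html"."""
    stem = PurePosixPath(urlsplit(url).path).stem
    return stem.replace("_", " ").replace("-", " ")


def notice_link(legacy_url: str, target_url: str, target_status: int) -> str:
    """Pick the link the notice should show.

    target_status is the result of prefetching target_url. A 200 keeps the
    direct link; anything else falls back to a search on the stable branch.
    """
    if target_status == 200:
        return target_url
    query = urlencode({"q": search_query_from_url(legacy_url)})
    return f"{SEARCH_BASE}?{query}"
```

For the shadow mapping page from the original report, a 404 on the prefetch would turn the notice link into a stable-branch search for "shadow mapping" instead of a dead page.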