readthedocs / readthedocs.org

The source code that powers readthedocs.org
https://readthedocs.org/
MIT License

Ability to have canonical root domain without redirect. #8250

Open Daltz333 opened 3 years ago

Daltz333 commented 3 years ago

Details

As of the deployment of #3231, the "Canonical" option under "Domains" no longer just sets the canonical metadata in non-root webpages, but also redirects the other pages to the "Root".

This comment from #3231 helps detail our use case:

> It's definitely the case that canonical=True was poorly named or poorly applied when we originally wrote the implementation. Ideally canonical=True would function the way you're describing. From our current perspective, canonical=True only affects how the resolver determines the domain for the project and pieces like search and auth on the commercial side -- it's effectively primary=True, not canonical=True.

We have a "primary" domain that we would prefer all search engines and SEO to use. This is https://docs.wpilib.org/. However, we also have a secondary domain, https://frcdocs.wpi.edu/, that we direct users toward when https://docs.wpilib.org/ is blocked. The domain is often blocked by over-aggressive school internet filters (schools are our target demographic, as we write docs for a high-school robotics competition).

Currently, https://frcdocs.wpi.edu/ redirects to https://docs.wpilib.org/. We'd like the ability to distinguish between "primary" domains and "have all sites redirect here" domains.
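The distinction requested here could be modelled roughly as follows. This is a minimal sketch and not Read the Docs code: the `Domain` dataclass, its `primary` and `redirect_to_primary` flags, and the `resolve` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Domain:
    """Hypothetical model of a custom domain; not the Read the Docs schema."""
    host: str
    primary: bool = False              # preferred for SEO / canonical links
    redirect_to_primary: bool = False  # "have all sites redirect here" behavior


def resolve(request_host: str, path: str, domains: List[Domain]) -> Tuple[str, str]:
    """Return ("redirect", url) or ("serve", canonical_url) for a request."""
    by_host = {d.host: d for d in domains}
    if request_host not in by_host:
        raise ValueError(f"unknown domain: {request_host}")
    primaries = [d for d in domains if d.primary]
    if len(primaries) != 1:
        raise ValueError("exactly one primary domain is expected")
    primary = primaries[0]
    canonical_url = f"https://{primary.host}{path}"
    current = by_host[request_host]
    # Only redirect when the secondary domain explicitly opts in.
    if current is not primary and current.redirect_to_primary:
        return ("redirect", canonical_url)
    # Otherwise serve the page in place and advertise the primary as canonical.
    return ("serve", canonical_url)


domains = [
    Domain("docs.wpilib.org", primary=True),
    Domain("frcdocs.wpi.edu"),  # secondary: served in place, no redirect
]
print(resolve("frcdocs.wpi.edu", "/en/stable/", domains))
```

With the two flags kept separate, a secondary domain stays reachable while the canonical URL still points at the primary one; setting `redirect_to_primary=True` on the secondary domain would reproduce the redirect behavior described above.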

stsewd commented 3 years ago

Hmm, I think if you uncheck the canonical option on the other domain, the pages will be served from that domain (not sure). If that works, you can manually set the canonical meta tags with this option: https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-html_baseurl
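For reference, setting that option in a Sphinx `conf.py` might look like the sketch below. The `html_baseurl` option is documented by Sphinx; the exact URL is an assumption for illustration (the preferred domain from this issue plus an assumed `/en/stable/` path), and whether a canonical link is emitted depends on the theme in use.

```python
# conf.py (Sphinx configuration file)

# Assumption: docs.wpilib.org is the preferred domain and the docs live
# under /en/stable/. Adjust both to match the actual project layout.
# Sphinx uses html_baseurl as the root when building canonical links
# in themes that support it.
html_baseurl = "https://docs.wpilib.org/en/stable/"
```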

Daltz333 commented 3 years ago

I have unchecked the option, but it still redirects. It may take a bit for things to resolve (guessing). If it works, I'll follow that route to set the canonical meta tags.

stsewd commented 3 years ago

Looks like now it redirects to the rtd.io domain :D

$ curl -I https://docs.wpilib.org
HTTP/2 302 
date: Wed, 09 Jun 2021 15:59:52 GMT
content-type: text/html; charset=utf-8
content-length: 0
location: https://frc-docs.readthedocs.io/en/stable/
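The same check can be scripted when comparing several domains. The sketch below only interprets a status code and `Location` header that have already been fetched (for example with `curl -I` as above); `describe_redirect` is a hypothetical helper name, not part of any Read the Docs tooling.

```python
from typing import Optional
from urllib.parse import urlsplit

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def describe_redirect(requested_url: str, status: int, location: Optional[str]) -> str:
    """Summarize whether a response keeps the visitor on the requested domain."""
    requested_host = urlsplit(requested_url).hostname
    if status not in REDIRECT_STATUSES or not location:
        return f"{requested_host}: served in place (HTTP {status})"
    # A relative Location has no hostname, so it stays on the requested host.
    target_host = urlsplit(location).hostname or requested_host
    if target_host == requested_host:
        return f"{requested_host}: same-domain redirect to {location}"
    return f"{requested_host}: cross-domain redirect to {target_host}"


# Values taken from the curl output above.
print(describe_redirect(
    "https://docs.wpilib.org",
    302,
    "https://frc-docs.readthedocs.io/en/stable/",
))
```

For the response above this reports a cross-domain redirect to `frc-docs.readthedocs.io`, which matches what the headers show.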

Daltz333 commented 3 years ago

Redirecting to the RTD domain isn't the correct behavior either. We want all the domains to be equally accessible, with one domain "preferred".

Daltz333 commented 3 years ago

Hm. It looks like support for multiple domains is completely broken. This should probably be relabeled as "bug". It's either all domains redirect to readthedocs, or redirect to the canonical root.

stsewd commented 3 years ago

> It's either all domains redirect to readthedocs, or redirect to the canonical root.

Yeah, that's the current behavior; it was implemented that way.

Daltz333 commented 3 years ago

What is the point in adding multiple domains besides redirecting?

stsewd commented 3 years ago

I agree this shouldn't be the expected result; that's why I labeled this as a design decision. I think this is mostly related to our resolver code always pointing to one domain.

Daltz333 commented 3 years ago

Gotcha. Thanks.