facebook / docusaurus

Easy to maintain open source documentation websites.
https://docusaurus.io
MIT License
56.32k stars, 8.45k forks

Conflicting canonical meta tag and hreflang meta tags #6011

Closed mem212 closed 2 years ago

mem212 commented 2 years ago

Description

Generating both canonical and hreflang meta tags may be considered a conflict in terms of SEO when they refer to the same URL. This will make Google ignore the page. Please read about the subject (see https://www.searchviu.com/en/hreflang-canonical/). Docusaurus should provide a way to control what is generated into the head/meta tags, or make it possible to disable some of them.

Steps to reproduce

  1. configure a docs-only website (this may not be necessary to reproduce)
  2. run the app
  3. check the meta tags

Expected behavior

Provide the possibility to disable the generation of some meta tags, or to have control over what is generated.

Actual behavior

The canonical and hreflang meta tags are both generated and refer to the same URL.
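
The condition described here can be checked mechanically. The following is a minimal TypeScript sketch, not part of Docusaurus: the `HeadLink` type, the `findSelfReferencingAlternates` helper, and the example.com URLs are all hypothetical. It takes the `<link>` tags found in a page's head and returns the hreflang alternates whose URL equals the canonical URL. It only detects the pattern; it says nothing about whether search engines treat that pattern as harmful.

```typescript
// Hypothetical shape for a <link> tag found in a page's <head>.
interface HeadLink {
  rel: string;
  href: string;
  hreflang?: string;
}

// Returns the hreflang alternates that point at the same URL as the canonical link.
// Returns an empty array when the page has no canonical link at all.
function findSelfReferencingAlternates(links: HeadLink[]): HeadLink[] {
  const canonical = links.find((link) => link.rel === "canonical");
  if (!canonical) {
    return [];
  }
  return links.filter(
    (link) =>
      link.rel === "alternate" &&
      link.hreflang !== undefined &&
      link.href === canonical.href
  );
}

// Illustrative input only; example.com and the paths are placeholders.
const exampleHead: HeadLink[] = [
  { rel: "canonical", href: "https://example.com/docs/intro" },
  { rel: "alternate", href: "https://example.com/docs/intro", hreflang: "en" },
  { rel: "alternate", href: "https://example.com/fr/docs/intro", hreflang: "fr" },
];

const flagged = findSelfReferencingAlternates(exampleHead);
console.log(flagged.map((link) => link.hreflang));
```

With the input above, only the `en` alternate shares its URL with the canonical link, so it is the only entry returned.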

Josh-Cena commented 2 years ago

From my understanding of canonical URLs, we should remove hreflang entirely from translated pages and just keep them on the default locale.

mem212 commented 2 years ago

My suggestion is to not implement/force a certain way of doing SEO but give the developer/user more flexibility. It is totally possible to add new meta tags in different ways. Examples :

However, maybe I'm missing something but there is no way to disable specific meta tags, while overriding is possible.

Josh-Cena commented 2 years ago

It's certainly possible (and not hard) to add metadata tags, but disabling existing ones is hard. I do think we should fix this on our side, as we promised to have good SEO defaults.

slorber commented 2 years ago

This is what Apple has in its metas:

   <head>
      <meta charset="utf-8" />
      <link rel="canonical" href="https://www.apple.com/fr/" />
      <link rel="alternate" href="https://www.apple.com/fr/" hreflang="fr-FR" />
      <link rel="alternate" href="https://www.apple.com/ge/" hreflang="en-GE" />
      <link rel="alternate" href="https://www.apple.com/gn/" hreflang="fr-GN" />

They are not alone, and I guess those sites know what they are doing 😅

I also studied this deeply when I implemented it, based on Google documentation.

Please show me some documentation from a high-authority source explaining that what we do is not good practice. The linked resource "searchVIU" does not look like an authority in this domain to me, and could definitely be wrong.


> Provide the possibility to disable the generation of some meta tags, or to have control over what is generated.

Docusaurus is an opinionated tool that tries to do the right thing by default without overwhelming users with a lot of config options.

If you have stronger opinions than we do, please tell us why and be ready to back your claims.

Otherwise you can still swizzle our theme components and replace the logic we use to set headers.

We have a <LayoutHead> component that you can swizzle to change the hreflang meta. Just be aware that you are overriding some Docusaurus internal code that might break when upgrading. We'll still allow you to override those metas, it may just be in a different place.

mem212 commented 2 years ago

If you have a single-language website (e.g. EN), why add alternates?

In my case I have no problem with the way you implemented it, my issue is with customization. Since my website is provided in one and only one language, I need only canonical and no alternate pages.

When it is recommended to add alternate pages (hreflang)

Some example scenarios where indicating alternate pages is recommended:

  • If you keep the main content in a single language and translate only the template, such as the navigation and footer. Pages that feature user-generated content, like forums, typically do this.
  • If your content has small regional variations with similar content, in a single language. For example, you might have English-language content targeted to the US, GB, and Ireland.
  • If your site content is fully translated into multiple languages. For example, you have both German and English versions of each page.

Source: https://developers.google.com/search/docs/advanced/crawling/localized-versions?hl=en

None of the three situations above match my case.

I prefer Google documentation or the most famous SEO tools over a website like Apple. The issue I mentioned (the conflict) was flagged by the Semrush and Moz auditing tools.

About opinionated frameworks

An opinionated framework doesn't mean you should have to stick to the default configuration. You should be able to customize. The example that comes to mind is Spring Boot, one of the most famous opinionated frameworks: you can customize every single detail in Spring Boot (dependencies, configuration).

slorber commented 2 years ago

> If you have a single-language website (e.g. EN), why add alternates?

It may be overkill, but it is slightly simpler for our codebase to handle a 1-language website in the same way we handle n-language websites.

Does it actually cause any concrete problem?

> In my case I have no problem with the way you implemented it, my issue is with customization. Since my website is provided in one and only one language, I need only canonical and no alternate pages.

Why do you need this? Can you explain exactly which problem you have, and how customizing the metas would solve it?

We don't implement new customization options unless we clearly understand the use case behind them, so please explain.

> When it is recommended to add alternate pages (hreflang)
>
> Some example scenarios where indicating alternate pages is recommended:
>
>   • If you keep the main content in a single language and translate only the template, such as the navigation and footer. Pages that feature user-generated content, like forums, typically do this.
>   • If your content has small regional variations with similar content, in a single language. For example, you might have English-language content targeted to the US, GB, and Ireland.
>   • If your site content is fully translated into multiple languages. For example, you have both German and English versions of each page.
>
> Source: developers.google.com/search/docs/advanced/crawling/localized-versions?hl=en
>
> None of the three situations above match my case.

Agreed: in your case, Google's doc implies that adding the hreflang header is not really useful.

But that does not mean it is harmful to use hreflang when it's not useful either 🤷‍♂️

> I prefer Google documentation

Me too, but at the same time the Google doc may not always be clear, and we have to read between the lines, sometimes reading comments from Google SEO experts to understand more complex edge cases.

> or the most famous SEO tools over a website like Apple.

Engineers at Apple and big e-commerce companies probably optimize SEO very carefully, and have read the sparse, weak-signal comments of Google SEO experts on obscure SEO forums.

In my opinion I followed the Google doc carefully for the Docusaurus i18n, and tools like Ahrefs still report many errors that seem related to hreflang on our own website (which IMHO we implemented correctly, and we are using multiple locales this time).

image

Their support team wasn't really able to tell me exactly why their tool reports errors, and the tool does not really say how to fix them either. I tried for a long time to solve these problems, but the number of errors varies quite randomly from one crawl to another, and the support team basically says things like "could you please try to disable JS on the crawler".

I'm not saying these tools are not valuable, but in my experience they are not 100% reliable and may report random failures. And each tool may report something slightly different. After all, they implement the rules in the Google doc above, which IMHO is not exhaustive, and try to handle edge cases using the obscure SEO forums where Google experts have written some comments.

Note that the Google doc page recommends a tool at the end to validate hreflang: https://technicalseo.com/tools/hreflang/

It would be interesting to test this on your 1-language site and see if it reports anything.

We have some minor warnings on the Docusaurus site, but overall it seems to be a valid implementation:

image

(a page that Ahrefs might flag as invalid can be reported valid with this tool)

> The issue I mentioned (the conflict) was flagged by the Semrush and Moz auditing tools.

That is the interesting part for me.

We'll need exhaustive details and screenshots of the errors reported by both these tools to understand what they say exactly.

> An opinionated framework doesn't mean you should have to stick to the default configuration. You should be able to customize. The example that comes to mind is Spring Boot, one of the most famous opinionated frameworks: you can customize every single detail in Spring Boot (dependencies, configuration).

The more options we add, the more complex a tool becomes, and the more doc our users need to read/understand.

The goal of an opinionated tool is not necessarily to make everything configurable (and there are many examples out there, like Create-React-App). We'll only add a new option if it feels really necessary and is backed by a really strong use case.

And you can actually already customize those meta headers with docusaurus swizzle. Somehow we already provide that option, it's just not a very convenient API to handle a very niche use-case.

The real question is: "is it actually harmful to add hreflang to a 1-locale site?"

I believe it's not.

If it is proven that it is, then Docusaurus should simply fix an SEO bug for 1-locale websites instead of providing a new option to customize those metas.

Introducing a new option (which would be more like a "shortcut", because we already have swizzle anyway) looks useless to me if we fix that bug.

Josh-Cena commented 2 years ago

Closing this as working as intended. If people find it problematic, they can always swizzle SiteMetadata and implement their own set of head tags.

atapas commented 11 months ago

> Closing this as working as intended. If people find it problematic, they can always swizzle SiteMetadata and implement their own set of head tags.

We are facing this same issue. Ours is a one-locale site, i.e. en. The canonical and alternates point to the same URL, resulting in Google making the page non-indexable. Could you please suggest a workaround to fix it? You mentioned SiteMetadata, but how will that help? Please let us know.

slorber commented 11 months ago

> We are facing this same issue. Ours is a one-locale site, i.e. en. The canonical and alternates point to the same URL, resulting in Google making the page non-indexable.

@atapas we use this behavior on the Docusaurus site, as do hundreds of Docusaurus i18n sites, and afaik it doesn't prevent indexing. The problem must be something else, unless you can prove otherwise.

> Could you please suggest a workaround to fix it? You mentioned SiteMetadata, but how will that help? Please let us know.

You can swizzle @theme/SiteMetadata and override the behavior we implemented with your own logic. If you don't know what swizzle means, our doc explains it all.

We don't have a workaround to suggest because we think the current logic is fine. If you want to diverge from it you are now on your own to implement whatever you want.
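
As a rough illustration of the kind of custom logic a swizzled component could apply, the sketch below builds the canonical and alternate link descriptors for a page and omits the hreflang alternates when the site has a single locale. Everything in it is hypothetical and is not the Docusaurus implementation: the `LinkTag` and `SiteLocales` types, the `buildHeadLinks` and `localizedUrl` helpers, the example.com URLs, and the assumption that non-default locales are served under a `/<locale>` prefix. A swizzled `@theme/SiteMetadata` would still need to render these descriptors as `<link>` elements.

```typescript
// Hypothetical descriptor for a <link> element to render in <head>.
interface LinkTag {
  rel: "canonical" | "alternate";
  href: string;
  hreflang?: string;
}

// Hypothetical minimal view of the site's i18n settings.
interface SiteLocales {
  siteUrl: string; // assumed to have no trailing slash
  defaultLocale: string;
  locales: string[];
}

// Assumption: the default locale lives at the root, other locales under /<locale>.
function localizedUrl(site: SiteLocales, locale: string, path: string): string {
  const prefix = locale === site.defaultLocale ? "" : `/${locale}`;
  return `${site.siteUrl}${prefix}${path}`;
}

// Always emits the canonical link; emits hreflang alternates only for multi-locale sites.
function buildHeadLinks(site: SiteLocales, currentLocale: string, path: string): LinkTag[] {
  const links: LinkTag[] = [
    { rel: "canonical", href: localizedUrl(site, currentLocale, path) },
  ];
  if (site.locales.length > 1) {
    for (const locale of site.locales) {
      links.push({
        rel: "alternate",
        href: localizedUrl(site, locale, path),
        hreflang: locale,
      });
    }
  }
  return links;
}

const singleLocaleLinks = buildHeadLinks(
  { siteUrl: "https://example.com", defaultLocale: "en", locales: ["en"] },
  "en",
  "/docs/intro"
);
const multiLocaleLinks = buildHeadLinks(
  { siteUrl: "https://example.com", defaultLocale: "en", locales: ["en", "fr"] },
  "fr",
  "/docs/intro"
);
console.log(singleLocaleLinks.length, multiLocaleLinks.length);
```

For the single-locale input only the canonical link is produced, while the two-locale input yields the canonical link plus one alternate per locale. Whether dropping the alternates is desirable is exactly the open question in this thread, so this is only one possible policy.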