nystudio107 / craft-seomatic

SEOmatic facilitates modern SEO best practices & implementation for Craft CMS 3. It is a turnkey SEO system that is comprehensive, powerful, and flexible.
https://nystudio107.com/plugins/seomatic

How to set up the hreflang tags properly? #881

Closed Stalex89 closed 3 years ago

Stalex89 commented 3 years ago

Question

I have a question regarding the canonical/alternative/default links setup. The structure of the website is following:

https://example.com - Default (US)
https://example.com/uk - English (GB)
https://example.com/nl - Dutch (NL)
https://example.com/de - German (DE)
https://example.com/es - Spanish (ES)

So I'm expecting something like this:

<link rel="alternate" href="https://example.com" hreflang="x-default" />
<link rel="alternate" href="https://example.com/de" hreflang="de" />
<link rel="alternate" href="https://example.com/nl" hreflang="nl" />
<link rel="alternate" href="https://example.com/uk" hreflang="en-gb" />
<link rel="alternate" href="https://example.com/es" hreflang="es" />

However the plugin generates the default site also as an alternative language:

Screenshot 2021-05-03 at 09 22 12 Screenshot 2021-05-03 at 09 22 43 Screenshot 2021-05-03 at 09 22 53

I tried to set up the tags manually in code, but the site becomes unindexable afterwards:

Screenshot 2021-05-03 at 09 36 08 Screenshot 2021-05-03 at 09 38 02

Can you please give me a hint on how to turn off the US language as an alternate? Thank you.

Craft version: Craft Pro 3.5.16
SEOmatic version: 3.3.18

khalwat commented 3 years ago

This is expected behavior and, I'm pretty sure, the correct way to handle the hreflang tags.

The hreflang="x-default" is the default page that should be used if no language is specified... and after that, every available translation should be listed (including the page we are currently on).

Ref: https://developers.google.com/search/docs/advanced/crawling/localized-versions#all-method-guidelines
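To illustrate the rule described above (one x-default entry, then every available translation, including the default site), here is a minimal Python sketch. It is not SEOmatic's code: the `SITES` mapping and the `hreflang_links` helper are hypothetical names, and the `en-us` value for the default site is an assumption based on the "Default (US)" label in the question.

```python
from html import escape

# Hypothetical site map built from the example URLs in this thread:
# hreflang value -> site URL. "en-us" for the default site is an assumption.
SITES = {
    "en-us": "https://example.com",
    "en-gb": "https://example.com/uk",
    "nl": "https://example.com/nl",
    "de": "https://example.com/de",
    "es": "https://example.com/es",
}
DEFAULT_HREFLANG = "en-us"


def hreflang_links(sites, default_hreflang):
    """Return <link> tags: x-default first, then every site (default included)."""
    entries = [("x-default", sites[default_hreflang])]
    entries += list(sites.items())
    return [
        f'<link rel="alternate" href="{escape(url, quote=True)}" hreflang="{lang}" />'
        for lang, url in entries
    ]


for line in hreflang_links(SITES, DEFAULT_HREFLANG):
    print(line)
```

The default site's URL appears twice in the output, once as `x-default` and once under its own language value, which matches the behavior described as expected in this comment.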

Stalex89 commented 3 years ago

@khalwat Ok, I see, thank you for the tip.

Can I also ask:

1) How do I specify some languages as region-independent (for example, use 'es' as the hreflang value instead of 'es-es') while keeping 'en-gb' region-specific? Can this be set up somewhere in the plugin, or only manually in code?

2) Does the order in which the links are generated matter (default link first, then all alternate links, or vice versa)?

khalwat commented 3 years ago

The ordering of the links does not matter.

As for specifying the language, you should be able to do that in the Sites settings in the Craft CMS CP.

Stalex89 commented 3 years ago

@khalwat OK, it seems to work for me, thank you.

Can I ask one more question: how does the plugin deal with query strings and pagination? What should the proper hreflang and canonical links be for those pages?

Currently I have issues with conflicting hreflang and rel=canonical, as well as missing self-referencing hreflang.

Examples of query strings on my website (including localisation): www.example.com?type=test, www.example.com/de?type=test

Example of pagination on my website: www.example.com/p2 (where p2 is page 2)

khalwat commented 3 years ago

@Stalex89 I'm not clear on what problem you're experiencing here. Can you show me:

1) What it's doing now

2) What you want it to be doing

Stalex89 commented 3 years ago

@khalwat sorry for the late response, let me try to explain the issue

What it is doing now: We have a website with multiple languages (default language is EN) with the following url structure:

EN (Default): https://ctouch.eu
DE: https://ctouch.eu/de
ES: https://ctouch.eu/es
UK: https://ctouch.eu/uk
...

We have audited the website for SEO performance and we are having several issues regarding "Conflicting hreflang and rel=canonical" and "Missing self-reference" of urls. We assume that most of the issues are caused by the following:

1) The website uses the [GeoMate](https://plugins.craftcms.com/geomate) plugin for language redirection by user location. GeoMate uses the query string '__geom=✪' to detect a user's redirection from one language to another (source)

Example of the link: https://ctouch.eu/?__geom=✪
Canonical link: <link href="https://ctouch.eu" rel="canonical">
Home link: <link href="https://ctouch.eu" rel="home">
Alternative links:

<link href="https://ctouch.eu/de" rel="alternate" hreflang="de">
<link href="https://ctouch.eu/es" rel="alternate" hreflang="es">
...

X-default: <link href="https://ctouch.eu" rel="alternate" hreflang="x-default">

Screenshot from site audit report:

Screenshot 2021-06-02 at 11 29 28

2) On some pages we are using the standard Craft pagination (source), which appends /p{number} to the URL of each paginated page

Example of the link: https://ctouch.eu/academy/p2
Canonical link: <link href="https://ctouch.eu/academy" rel="canonical">
Home link: <link href="https://ctouch.eu" rel="home">
Alternative links:

<link href="https://ctouch.eu/de/akademie" rel="alternate" hreflang="de">
<link href="https://ctouch.eu/es/academy" rel="alternate" hreflang="es">
...

X-default: <link href="https://ctouch.eu/academy" rel="alternate" hreflang="x-default">

Screenshot from site audit report:

Screenshot 2021-06-02 at 11 41 50

3) Some URLs with tokens are detected by the crawler:

Screenshot 2021-06-02 at 11 43 33

4) Conflicts occur when a page has query parameters:

Example of the link: https://ctouch.eu/academy?type=video
Canonical link: <link href="https://ctouch.eu/academy" rel="canonical">
Home link: <link href="https://ctouch.eu" rel="home">
Alternative links:

<link href="https://ctouch.eu/de/akademie" rel="alternate" hreflang="de">
<link href="https://ctouch.eu/es/academy" rel="alternate" hreflang="es">
...

X-default: <link href="https://ctouch.eu/academy" rel="alternate" hreflang="x-default">

Screenshot from site audit report:

Screenshot 2021-06-02 at 11 57 47

What you want it to be doing: We would like to get rid of the hreflang and canonical conflicts to improve the SEO rating.

From what I read here, I assume that pages that have a query string in their URL and are non-canonical (meaning Google sees the page as a duplicate of the canonical URL) should not have hreflang links specified (please correct me if I'm wrong, I'm not an SEO specialist).

Can you please give a hint on how we can achieve this with the SEOmatic plugin?

Hope that the issue is clear now, please ask me if something is still unclear. Best regards

Stalex89 commented 3 years ago

@khalwat Hello, can you please give any hints about how to get rid of the hreflang and canonical conflicts in the plugin?

khalwat commented 3 years ago

I will have a look

khalwat commented 3 years ago

We have audited the website for SEO performance and we are having several issues regarding "Conflicting hreflang and rel=canonical" and "Missing self-reference" of urls. We assume that most of the issues are caused by the following:

Okay, this is getting pretty confusing, so let's take these one by one.

First of all, what tool are you relying on here for the auditing?

  1. Website uses plugin [GeoMate](https://plugins.craftcms.com/geomate) for language redirection by user location. Plugin GeoMate uses query string '__geom=✪' to detect user's redirection from one language to another (source)

Example of the link: https://ctouch.eu/?__geom=✪
Canonical link: <link href="https://ctouch.eu" rel="canonical">
Home link: <link href="https://ctouch.eu" rel="home">
Alternative links:

<link href="https://ctouch.eu/de" rel="alternate" hreflang="de">
<link href="https://ctouch.eu/es" rel="alternate" hreflang="es">
...

X-default: <link href="https://ctouch.eu" rel="alternate" hreflang="x-default">

Screenshot from site audit report:

Screenshot 2021-06-02 at 11 29 28

I'm not sure this makes sense to me?

From what you've stated:

Canonical link: <link href="https://ctouch.eu" rel="canonical">
X-default: <link href="https://ctouch.eu" rel="alternate" hreflang="x-default">

These URLs are identical, yet the report says "Conflicting hreflang and rel=canonical", which is wrong. It also says "No self-referencing hreflang", which is wrong as well: there is indeed an x-default.

Are you saying that all of this is because the GeoMate plugin adds a query string?

AFAIK, the query string shouldn't be taken into account by this auditing tool -- so what exactly is the auditing tool? Have you cross-checked it with another auditing tool?

What you want it to be doing We would like to get rid of hreflang and canonical conflicts to improve the SEO rating.

What I read from here, I assume that all the pages that have query string in their urls and are non-canonical (which means this page is visible by Google as a duplicate of canonical link) should not have hreflang link specified (please correct me if I'm wrong, I'm not a specialist in SEO.)

This link is talking about errors in Google Search Console -- are you seeing errors in your Google Search Console?

Also this link is talking about something different, where they have a canonical URL that conflicts with the hreflang URL, which your pages do not.

Can you please give a hint how can we achieve it with the SEOmatic plugin ?

Hope that the issue is clear now, please ask me if something is still unclear. Best regards

Have you tried contacting the people who make your SEO auditing tool to determine what they are expecting these results to be?

From the information provided, I'm not seeing anything wrong.
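The comparison made in this comment can be written down as a small check. The Python sketch below ignores the query string (as assumed above) and tests whether the requested URL and the canonical URL both appear among the hreflang targets. The `strip_query` and `audit` helpers are hypothetical, and this is only one plausible reading of what such a check involves, not a description of how Semrush or Google Search Console actually evaluates pages.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Drop the query string and fragment, plus any trailing slash on the path."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def audit(requested_url, canonical, alternates):
    """Return (has_self_reference, canonical_is_listed) for one page.

    `alternates` maps hreflang values to href URLs.
    """
    hrefs = {strip_query(href) for href in alternates.values()}
    has_self_reference = strip_query(requested_url) in hrefs
    canonical_is_listed = strip_query(canonical) in hrefs
    return has_self_reference, canonical_is_listed


# Tags quoted in this thread for https://ctouch.eu/?__geom=✪
alternates = {
    "x-default": "https://ctouch.eu",
    "de": "https://ctouch.eu/de",
    "es": "https://ctouch.eu/es",
}
print(audit("https://ctouch.eu/?__geom=✪", "https://ctouch.eu", alternates))
# (True, True)
```

Under that assumption both checks pass for the quoted tags. A tool that compares the full requested URL, query string included, would not find it among the hreflang targets, which is one possible explanation for the report.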

khalwat commented 3 years ago

Will re-open if more information is provided.

Stalex89 commented 3 years ago

@khalwat Hello, sorry for the long silence, the issue was "solved".

Google Search Console was showing no errors, but Semrush (the audit tool we're using) was reporting hreflang errors on URLs with query parameters.

So the "solution" for us was to inject the meta links manually, and only for canonical URLs without query parameters. Reference

Screenshot 2021-07-21 at 11 04 29 Screenshot 2021-07-21 at 11 04 38 Screenshot 2021-07-21 at 11 05 32

Thank you for your help.
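The workaround described in this comment (emit hreflang links only when the requested URL has no query parameters) could look roughly like the Python sketch below. The actual implementation is in the screenshots and is not reproduced here, so `should_emit_hreflang` and `head_links` are hypothetical helpers, and the sketch does not cover path-based pagination such as /p2.

```python
from urllib.parse import urlsplit


def should_emit_hreflang(requested_url):
    """True only when the requested URL carries no query string."""
    return urlsplit(requested_url).query == ""


def head_links(requested_url, canonical, alternates):
    """Build <link> tags for a page; hreflang is skipped on query-string URLs."""
    tags = [f'<link href="{canonical}" rel="canonical">']
    if should_emit_hreflang(requested_url):
        tags += [
            f'<link href="{href}" rel="alternate" hreflang="{lang}">'
            for lang, href in alternates.items()
        ]
    return tags


# Alternate links quoted earlier in this thread for the academy page.
alternates = {
    "de": "https://ctouch.eu/de/akademie",
    "es": "https://ctouch.eu/es/academy",
    "x-default": "https://ctouch.eu/academy",
}
canonical = "https://ctouch.eu/academy"

print(len(head_links("https://ctouch.eu/academy", canonical, alternates)))             # 4
print(len(head_links("https://ctouch.eu/academy?type=video", canonical, alternates)))  # 1
```

The canonical link is always emitted, so a filtered URL such as ?type=video still points back at the canonical page while carrying no hreflang tags of its own.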

khalwat commented 3 years ago

If Google Search Console wasn't showing any errors, I would ask Semrush's support what the expected values should be. Something doesn't sound right.

GaronRoss commented 7 months ago

@khalwat We are having similar issues and wondered if this may be a bug. The query string parameter is not getting added to the alternate hreflang.

E.g., for https://www.murgitroyd.com/insights/design?page=2 the canonical is https://www.murgitroyd.com/insights/design, but the alternate hreflang should be https://www.murgitroyd.com/us/insights/design?page=2, not https://www.murgitroyd.com/us/insights/design

I.e., the alternate hreflang is referring to the canonical URL, which won't necessarily be correct.

Please let us know if we've missed something.

khalwat commented 7 months ago

@GaronRoss so hreflang for paginated pages can be tricky. You generally don't want the hreflang to point to other paginated pages unless the content is exactly the same for each paginated page (which it usually is not), so by default the hreflang intentionally points to the first page of the translated paginated pages.

Another option is that you can also choose to omit hreflang tags entirely on paginated pages, which is a valid option as well.
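The two options above can be sketched in Python as follows. This is an illustration only, not SEOmatic's implementation: `paginated_alternates` is a hypothetical helper, pagination is detected through the `page` query parameter from the example URL, and the `en-us` hreflang value for the /us site is an assumption.

```python
from urllib.parse import parse_qs, urlsplit


def paginated_alternates(requested_url, first_page_alternates, omit_on_paginated=False):
    """Pick hreflang targets for a page that may be paginated.

    `first_page_alternates` maps hreflang values to the first page of each
    translation. Pagination is detected via a `page` query parameter.
    """
    query = parse_qs(urlsplit(requested_url).query)
    is_paginated = "page" in query
    if is_paginated and omit_on_paginated:
        return {}  # option 2: no hreflang tags on paginated pages
    return dict(first_page_alternates)  # option 1: point at page one


# "en-us" is an assumed hreflang value; the URL is taken from the comment above.
alternates = {"en-us": "https://www.murgitroyd.com/us/insights/design"}
url = "https://www.murgitroyd.com/insights/design?page=2"

print(paginated_alternates(url, alternates))
print(paginated_alternates(url, alternates, omit_on_paginated=True))
```

In both cases page 2 never advertises a translated page 2, because there is no guarantee that the second page of one translation lists the same entries as the second page of another.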