fisharebest / webtrees

Online genealogy
https://webtrees.net
GNU General Public License v3.0

SEO: add <link rel=canonical> to default.phtml #4873

Closed FrankWarius closed 11 months ago

FrankWarius commented 1 year ago

Since September 2022 I have been trying to improve the Google index coverage, for almost a year without noticeable success. In July 2023 I added a canonical link to default.phtml, and I could see a significant improvement in Google crawl rate and index coverage.


<!-- Set canonical link -->
<?php if (isset($record)): ?>
    <link rel="canonical" href="<?= e($record->url()) ?>">
<?php endif ?>

https://github.com/FrankWarius/webtrees/blob/93535fd89c43a4676da50568e087cdffab6751c8/resources/views/layouts/default.phtml#L47

I'm not absolutely sure yet if it will help permanently, but it doesn't seem to do any harm.
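
For readers who want to try the idea outside a webtrees template, the conditional canonical tag can be approximated in plain PHP. This is only a sketch under assumptions: `canonicalLinkTag()` is a hypothetical helper, not part of webtrees; `htmlspecialchars()` stands in for the `e()` escaping helper used in the template; and the example.com URL is made up.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of webtrees): build a canonical <link> tag
 * for a record URL, or return an empty string when the page has no record.
 */
function canonicalLinkTag(?string $recordUrl): string
{
    if ($recordUrl === null) {
        return '';
    }

    // Escape the URL for safe use inside an HTML attribute.
    $href = htmlspecialchars($recordUrl, ENT_QUOTES, 'UTF-8');

    return '<link rel="canonical" href="' . $href . '">';
}

// Made-up URL, for illustration only.
echo canonicalLinkTag('https://example.com/tree/demo/individual/X1/Jane-Doe'), PHP_EOL;
```

Returning an empty string for pages without a record mirrors the `isset($record)` check in the template, so list and search pages get no canonical tag.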

fisharebest commented 1 year ago

Google's documentation says that <link rel="canonical" ...> is only needed when the same page can be reached by more than one URL.

https://developers.google.com/search/docs/crawling-indexing/consolidate-duplicate-urls

FrankWarius commented 1 year ago

Yes, that is correct. But I switched to WT 2.0.12 sometime between 19 May 2020 and 14 March 2021 (according to the log, site preference "LATEST_WT_VERSION") and changed to pretty URLs in the process.

Currently, 16% of the URLs Google crawls on my site are redirects (HTTP 301); see the excerpt below.

Since rel="canonical" was set, the redirect rate has dropped and the index coverage is slowly increasing.

This seems to be primarily a problem on Google's side.

Time URL
20.09.23, 03:45 https://wbt.warius.info/individual.php?pid=I734&ged=Warius
20.09.23, 03:00 https://wbt.warius.info/individual.php?pid=I1301&ged=Warius
20.09.23, 02:15 https://wbt.warius.info/individual.php?pid=I2018&ged=Warius
20.09.23, 01:33 https://wbt.warius.info/individual.php?pid=I3&ged=Warius
20.09.23, 00:46 https://wbt.warius.info/source.php?sid=S149&ged=Warius
19.09.23, 19:39 https://wbt.warius.info/individual.php?pid=I2358&ged=Warius
19.09.23, 15:02 https://wbt.warius.info/individual.php?pid=I9836&ged=Warius
19.09.23, 10:52 https://wbt.warius.info/individual.php?pid=I1248&ged=Warius
19.09.23, 06:30 https://wbt.warius.info/individual.php?pid=I836&ged=Warius
19.09.23, 04:15 https://wbt.warius.info/individual.php?pid=I1283&ged=Warius
19.09.23, 04:08 https://wbt.warius.info/tree/Warius/family/X2161/Johann-Peter-Alexander-Berger-von-Lengercke-Auguste-Louise-Burmeister
18.09.23, 18:53 https://wbt.warius.info/individual.php?pid=I5373&ged=Warius
18.09.23, 12:43 https://wbt.warius.info/individual.php?pid=I5571&ged=Warius
18.09.23, 12:24 https://wbt.warius.info/individual.php?pid=I11530&ged=Warius
18.09.23, 11:39 https://wbt.warius.info/individual.php?pid=I1301&ged=Warius
18.09.23, 11:22 https://wbt.warius.info/family.php?famid=F4449&ged=Warius
18.09.23, 01:07 https://wbt.warius.info/individual.php?pid=I11530&ged=Warius
18.09.23, 00:58 https://wbt.warius.info/individual.php?pid=I4439&ged=Warius
18.09.23, 00:41 https://wbt.warius.info/
18.09.23, 00:01 https://wbt.warius.info/family.php?famid=F4432&ged=Warius
17.09.23, 12:01 https://wbt.warius.info/tree/Warius/source/X1674/Mannheim-ev-Baden-Landeskirchliches-Archiv-Karlsruhe
17.09.23, 02:49 https://wbt.warius.info/individual.php?pid=I7902&ged=Warius
16.09.23, 20:50 https://wbt.warius.info/index.php?route=/tree/Warius/individual/I10313/Carl-Andreas-Christian-Peters
16.09.23, 16:14 https://wbt.warius.info/tree/Warius/source/S307/Hamburg-Sterberegister-1876-1950
16.09.23, 15:29 https://wbt.warius.info/individual.php?pid=I4033&ged=Warius
16.09.23, 01:44 https://wbt.warius.info/source.php?sid=S307&ged=Warius
15.09.23, 15:39 https://wbt.warius.info/individual.php?pid=I1354
15.09.23, 06:58 https://wbt.warius.info/tree/Warius/source/X1992/Bodenteich-Kr-Uelzen-ev
15.09.23, 03:23 https://wbt.warius.info/family.php?famid=F360&ged=Warius
15.09.23, 02:06 https://wbt.warius.info/tree/Warius/family/F360/Christian-Joachim-Kobabe-Maria-Sophia-Caroline-Spangenberg

FrankWarius commented 1 year ago

Google's documentation says that <link rel="canonical" ...> is only needed when the same page can be reached by more than one URL.

And Chrome for Developers says (https://developer.chrome.com/docs/lighthouse/seo/canonical/?utm_source=lighthouse&utm_medium=lr):

Using canonical links has many advantages:

  • It helps search engines consolidate multiple URLs into a single, preferred URL. For example, if other sites put query parameters on the ends of links to your page, search engines consolidate those URLs to your preferred version.
  • It simplifies tracking methods. Tracking one URL is easier than tracking many.
  • It improves the page ranking of syndicated content by consolidating the syndicated links to your original content back to your preferred URL.

fisharebest commented 11 months ago

Google says there are three ways to indicate a canonical URL.

Use redirects to tell Googlebot that a redirected URL is a better version than a given URL.

If a URL is not canonical, then webtrees sends a permanent redirect (301 "Moved Permanently" or 308 "Permanent Redirect"). Google understands both types.

For example,

Specify your canonical pages in a sitemap.

webtrees provides this.

Add a <link rel="canonical"> element in the code for all duplicate pages, pointing to the canonical page.

Note that this says to add the <link> element to the DUPLICATE pages, not to the canonical page.

webtrees does not generate pages at duplicate URLs (it sends redirects), so there are no pages where we would add <link>.
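
To illustrate the redirect approach described above, here is a minimal sketch in plain PHP. It is not webtrees' actual routing code: `LEGACY_ROUTES` and `legacyRedirectTarget()` are hypothetical names, the URL shapes are taken from the crawl excerpt earlier in this thread, and the name slug at the end of the pretty URL is left out because it would need a database lookup. Legacy URLs without a `ged` parameter are simply ignored here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical mapping from legacy script names to the record type
 * and the query parameter that carries the record ID.
 */
const LEGACY_ROUTES = [
    'individual.php' => ['individual', 'pid'],
    'family.php'     => ['family', 'famid'],
    'source.php'     => ['source', 'sid'],
];

/**
 * Return the pretty URL path for a legacy URL, or null when the URL is
 * not a recognised legacy URL and therefore needs no redirect.
 */
function legacyRedirectTarget(string $url): ?string
{
    $parts  = parse_url($url);
    $script = basename($parts['path'] ?? '');

    if (!isset(LEGACY_ROUTES[$script])) {
        return null;
    }

    [$type, $idParam] = LEGACY_ROUTES[$script];
    parse_str($parts['query'] ?? '', $query);

    $id   = $query[$idParam] ?? null;
    $tree = $query['ged'] ?? null;

    // Simplification: require both the record ID and the tree name.
    if (!is_string($id) || !is_string($tree)) {
        return null;
    }

    return '/tree/' . rawurlencode($tree) . '/' . $type . '/' . rawurlencode($id);
}

$target = legacyRedirectTarget('https://wbt.warius.info/individual.php?pid=I734&ged=Warius');

if ($target !== null) {
    // A real request handler would send these as response headers.
    echo 'HTTP/1.1 301 Moved Permanently', PHP_EOL;
    echo 'Location: ', $target, PHP_EOL;
}
```

Because the legacy URL answers with a redirect instead of a page body, there is no duplicate HTML document into which a `<link>` element could be placed.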

FrankWarius commented 11 months ago

Greg, I can't say it enough: you do really great work, but this is really absurd.

You quote Google's documentation correctly. That's why I hesitated for about 4 months before implementing the fix on a test basis in June. You are right, webtrees is doing everything correctly, but I think Google has a bug and we need a workaround. That's why index coverage is significantly better on all other sites than on Google. (I have statistics on selected pages of the example sites.)

I know, and I never claimed that webtrees creates duplicate pages.

As I said when opening the issue, "canonical" does no harm, and it helps, as the statistics clearly show.

Screenshot 2023-10-29 222109

In the forum's open discussion, many users share the concern that their webtrees pages are not sufficiently present on Google.

I found the solution for my site; you should provide it to everyone.

With all due respect and thanks,
Frank