historical-data / schema

Microdata schema for historical data.
historical-data.org

How to handle multiple (international) pages about the same person? #40

Open coret opened 11 years ago

coret commented 11 years ago

Hi,

I've implemented part of the historical-data schema (mainly Person) on my website, Genealogie Online.

So now you have, for example on http://www.genealogieonline.nl/genealogie_mostert/I15215.php the following markup:

`<meta itemprop="name" content="Jan Sonneveld"/>`

But this site is multilingual: besides the Dutch version there's also an English version (and German and French). So the English version of the same person is http://www.genealogieonline.nl/en/genealogie_mostert/I15215.php

`<meta itemprop="name" content="Jan Sonneveld"/>`

On the non-Dutch pages I have used a canonical link. In the above example this is

<link rel="canonical" href="http://www.genealogieonline.nl/genealogie_mostert/I15215.php" />

Would crawlers understand that these two people are the same? Or should I use the "Dutch" URLs only?

fleep commented 11 years ago

How a crawler might interpret these schemas is, I think, outside the scope of the schema itself. I would think using different URLs is appropriate.

I'm not sure your use of a canonical URL is in line with how a search engine like Google intends you to use it:

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139066

When you canonicalize a URL, you're basically saying "in the future, when you link to this page, use this canonical URL". If folks are searching for an English-language version of a page about a particular person, it's not necessarily to your or the searcher's advantage to tell Google to send them to the Dutch record of that person.

What is it you're trying to accomplish? In any case, I don't think the Schema itself offers a way to uniquely identify people or point to other-language records of that person.

coret commented 11 years ago

I was using the canonical URL to avoid Google giving me a duplicate content penalty for the translated versions of the page. But browsing the support pages, I now see I have to use rel="alternate" hreflang="x" (http://support.google.com/webmasters/bin/answer.py?hl=en&answer=189077). So thx for that!
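For reference, a rough sketch of what that markup might look like in the `<head>` of each language version of the page above. The `/fr/` and `/de/` paths are only assumed by analogy with the `/en/` URL; they are not confirmed anywhere in this thread. Every version lists all of the alternates, including itself:

```html
<!-- Dutch version (the default, unprefixed URL) -->
<link rel="alternate" hreflang="nl" href="http://www.genealogieonline.nl/genealogie_mostert/I15215.php" />
<!-- English version -->
<link rel="alternate" hreflang="en" href="http://www.genealogieonline.nl/en/genealogie_mostert/I15215.php" />
<!-- French and German versions: these paths are assumptions, not taken from the thread -->
<link rel="alternate" hreflang="fr" href="http://www.genealogieonline.nl/fr/genealogie_mostert/I15215.php" />
<link rel="alternate" hreflang="de" href="http://www.genealogieonline.nl/de/genealogie_mostert/I15215.php" />
```

This would replace the `rel="canonical"` link on the translated pages, so that each translation is presented as an alternate rather than as a duplicate of the Dutch page.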

On the other matter: a crawler could now find four "Jan Sonneveld" persons that are in fact one and the same person (Dutch/English/French/German). I think a short-term solution would be to add the microdata to only one language version of the site...