culturecreates / artsdata-data-model

Overview of how data is modelled in Artsdata.ca.
https://culturecreates.github.io/artsdata-data-model/
Creative Commons Zero v1.0 Universal

Update README.md #71

Closed fjjulien closed 1 year ago

fjjulien commented 1 year ago

I added the "@id" and "sameAs" properties to the French section of our documentation page. My instructions for these properties are much more elaborate than those for the other properties. This seemed necessary given their great importance and complexity. Once these changes to the French version have been approved, I will make the equivalent changes to the English version.

fjjulien commented 1 year ago

Christian Roy has tested this method with Google and has found it to have a positive impact on ranking. Until we can offer a means of quickly minting Artsdata URIs for events, I suggest we propose this as a good-enough DIY method (along with edits to indicate that no ID is better than an ID that isn't unique). Then, when the minting API is ready, we can replace these instructions with new ones indicating how to obtain an Artsdata ID for an event. Would this work with you?

saumier commented 1 year ago

> Christian Roy has tested this method with Google and has found it to have a positive impact on ranking. Until we can offer a means of quickly minting Artsdata URIs for events, I suggest we propose this as a good-enough DIY method (along with edits to indicate that no ID is better than an ID that isn't unique). Then, when the minting API is ready, we can replace these instructions with new ones indicating how to obtain an Artsdata ID for an event. Would this work with you?

@fjjulien One point I'd like to push further... about the authority of @ids. My opinion has evolved over the years to recommend that an @id (subject position) should only use URIs on the domain of the webpage. This means that Artsdata IDs and Wikidata IDs should be used only in the object position. This can be combined with a reconciliation process with Artsdata that adds sameAs IDs for Events, People, Places, Orgs.

I am reluctant to have a website use Artsdata IDs in the subject position because there are naturally a lot of errors in the data. When there is an error with an Artsdata URI in the subject position, it is very difficult to sort out because it makes a claim directly about an entity. When an error with an Artsdata URI happens in the sameAs position, the error is easier to detect and filter out. So my recommendation is to add Wikidata URIs or Artsdata URIs only as objects, and the subject (@id) should be either a blank node or a URI generated with the website's domain.
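To make the recommendation concrete, here is a minimal Python sketch that builds the JSON-LD for an event this way. Everything specific in it is an assumption for illustration: the `example.com` page URL, the `#event` fragment convention, the placeholder Artsdata URI, and the helper names `build_event` and `id_is_on_page_domain`.

```python
import json
from urllib.parse import urlparse

# Hypothetical page URL and placeholder Artsdata URI, for illustration only.
PAGE_URL = "https://example.com/events/spring-concert"
PLACEHOLDER_ARTSDATA_URI = "http://kg.artsdata.ca/resource/K0-000"


def build_event(page_url, name, external_uris):
    """Return a schema.org Event whose @id lives on the page's own domain."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        # Subject position: a URI on the website's own domain.
        # The "#event" fragment is an assumed convention, not a requirement.
        "@id": page_url + "#event",
        "name": name,
        "url": page_url,
        # Object position: Artsdata or Wikidata URIs go here.
        "sameAs": list(external_uris),
    }


def id_is_on_page_domain(node, page_url):
    """Check that a node is a blank node or uses an @id on the page's host."""
    node_id = node.get("@id")
    if node_id is None:
        return True  # no @id means a blank node, which is acceptable
    return urlparse(node_id).netloc == urlparse(page_url).netloc


event = build_event(PAGE_URL, "Spring Concert", [PLACEHOLDER_ARTSDATA_URI])
print(json.dumps(event, indent=2))
```

A check like `id_is_on_page_domain` could be run at crawl time to flag pages that put an external URI in the subject position.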

You may wonder how to handle blank nodes. This is indeed a challenge. The technical piece of the puzzle that I am finding works quite well is this: when a page gets crawled, link its structured data with blank nodes (without @ids as recommended by Google) to the webpage that contains it. This gives a consistent way to navigate blank nodes by starting from a guaranteed URI for the WebPage entity. I am using schema:mentions to link the crawled page to the structured data on the webpage. We can discuss this by phone if you like.
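As a rough illustration of that linking step, the sketch below wraps the nodes extracted from a crawled page in a WebPage entity that points to them with `mentions`. It is not the actual crawler code: the function name `link_page_to_blank_nodes` and the sample data are hypothetical, and it assumes the extracted nodes already use the schema.org context.

```python
import json


def link_page_to_blank_nodes(page_url, extracted_nodes):
    """Wrap crawled structured data in a WebPage that mentions each node.

    The WebPage gets the crawled URL as its @id, so every blank node
    becomes reachable from a guaranteed URI.
    """
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "@id": page_url,
        "mentions": [
            # Drop the nested @context (assumed to be schema.org already);
            # nodes without an @id stay blank nodes when nested here.
            {key: value for key, value in node.items() if key != "@context"}
            for node in extracted_nodes
        ],
    }


# Hypothetical structured data found on a crawled page, with no @id.
crawled = [
    {"@context": "https://schema.org", "@type": "Event", "name": "Spring Concert"}
]
page = link_page_to_blank_nodes(
    "https://example.com/events/spring-concert", crawled
)
print(json.dumps(page, indent=2))
```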

fjjulien commented 1 year ago

> My opinion has evolved over the years to recommend that an @id (subject position) should only use URIs on the domain of the webpage.

@saumier I have also read many specs, comments and insights that suggest the same thing. I alluded to this in this sentence of the French documentation: "Any character string will do as long as it constitutes a unique anchor within the event's web page."

Let me take another stab at editing the @id instructions to reflect your comments. If you're still uncomfortable with it, we'll remove this section and implement only the other contents about sameAs and bridge identifiers.

fjjulien commented 1 year ago

@saumier I just completed my last commit, which I believe should alleviate all your concerns. My suggestion regarding the use of anchors was moved to a footnote and I made it clear that the @id property should not be used if a URI is not available.
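The rule that "@id" should be left out when no URI is available can be sketched in a few lines of Python. The helper name `with_optional_id` and the sample permalink are hypothetical, chosen only for illustration.

```python
def with_optional_id(node, permalink=None):
    """Return a copy of node with @id set only when a stable permalink exists."""
    # Start from a copy without any @id, so the input is never modified.
    result = {key: value for key, value in node.items() if key != "@id"}
    if permalink:
        result["@id"] = permalink
    # With no permalink, the node stays a blank node instead of
    # receiving an identifier that may not be unique.
    return result


event = {"@type": "Event", "name": "Spring Concert"}
with_uri = with_optional_id(event, "https://example.com/events/spring-concert")
without_uri = with_optional_id(event)
```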

In the footnote about URIs, in addition to the permalink, I also mentioned the PURL as a method of generating URIs. Even though PURLs are not within the domain of the site, would you deem them to be an acceptable value for the @id property? (I am aware that no performing arts organization currently has the capacity to implement PURLs, but I'm a gardener: I plant seeds whenever I have the opportunity...)