culturecreates / artsdata-data-model

Overview of how data is modelled in Artsdata.ca.
https://culturecreates.github.io/artsdata-data-model/
Creative Commons Zero v1.0 Universal

Place: city, province, and country entities for places in Artsdata #61

Open saumier opened 1 year ago

saumier commented 1 year ago

As a user of the Artsdata API, I want to receive the city, country and province entities of places in Canada.

saumier commented 9 months ago

Proposing a different solution. Each place should be assigned the most precise Wikidata administrative region, perhaps using schema:containedInPlace or a new property such as ado:broaderWikidataPlace. This property would then enable the tree of administrative regions from Wikidata to be used. In Wikidata, the administrative regions include villages and cities but also "arrondissements", greater city areas (e.g. Greater Montreal), and provinces. Finally, they are linked to countries. This tree of administrative regions is crowdsourced and can contain errors, but it appears to be the most accurate tool for open data. OpenStreetMap is another candidate. The two are complementary. This proposal is to start with Wikidata and automatically link the Wikidata city as a starting point. The algorithm would reconcile the city, province, and country, but link only the city's Wikidata ID.
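To make the proposal concrete, here is a rough sketch of what the link could look like as JSON-LD, assuming schema:containedInPlace is the chosen property. The function name `link_place_to_city` and the example.org place URI are hypothetical and only for illustration; the Wikidata entity URI pattern is the standard one.

```python
import json

# Standard URI prefix for Wikidata entities.
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


def link_place_to_city(place_uri: str, city_qid: str) -> dict:
    """Build a JSON-LD node linking a place to its Wikidata city.

    Only the city is linked; province and country are expected to be
    reachable by following the Wikidata hierarchy from the city.
    """
    return {
        "@context": {"schema": "http://schema.org/"},
        "@id": place_uri,
        "@type": "schema:Place",
        "schema:containedInPlace": {"@id": WIKIDATA_ENTITY + city_qid},
    }


# Hypothetical place URI; Q340 is Montreal on Wikidata.
doc = link_place_to_city("http://example.org/place/123", "Q340")
print(json.dumps(doc, indent=2))
```

If ado:broaderWikidataPlace were preferred, only the property key would change.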

fjjulien commented 8 months ago

The WikiProject Cultural venues' documentation is very clear with regard to location information: "For located in the administrative territorial entity (P131) statements, it is good practice to use the smallest administrative territorial entity (neighbourhood, district, ward, etc.) in which the building is located. Larger administrative territorial entities can be inferred from smaller ones." In practice, however, because neighbourhood/district information is not readily available, the most common values under P131 are municipalities, which is what you are looking for.

This being said, since wdt:P131 isn't used in a consistent fashion, and since users may populate inaccurate municipality values, it's ultimately not reliable enough to be served "as is" as a [schema:addressLocality](https://schema.org/addressLocality) value.

I can think of a few solutions to improve the consistency of city information pulled from Wikidata:

  1. You could implement data validation to check whether the wdt:P131 value is an instance of municipality (or of one of its subclasses);
  2. If validation 1 is FALSE, you could crawl the location hierarchy up to the first instance of a municipality;
  3. Steps 1 and 2 would still leave several venues without a wdt:P131 value that is an instance of a municipality. As a complementary or alternative process, you could retrieve the postal code (wdt:P281) and perform a postal code search on the GeoNames API.
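Steps 1 and 2 could be sketched roughly as follows. This is a minimal in-memory illustration, not a tested implementation: the dictionaries stand in for wdt:P131, wdt:P31 and wdt:P279 statements already fetched from Wikidata, each entity is assumed to have a single P131 value, and the place identifiers (VENUE, BOROUGH, CITY, ...) are placeholders rather than real Wikidata IDs. The function names are my own.

```python
from typing import Dict, List, Optional

MUNICIPALITY = "Q15284"  # municipality class on Wikidata


def is_municipality_class(cls: str, subclass_of: Dict[str, List[str]]) -> bool:
    """Step 1 helper: True if cls is municipality or a transitive subclass (P279)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current == MUNICIPALITY:
            return True
        if current in seen:
            continue  # guard against cycles in the class hierarchy
        seen.add(current)
        stack.extend(subclass_of.get(current, []))
    return False


def first_municipality(
    place: str,
    located_in: Dict[str, str],        # entity -> its P131 value (simplified to one)
    instance_of: Dict[str, List[str]],  # entity -> its P31 classes
    subclass_of: Dict[str, List[str]],  # class -> its P279 superclasses
) -> Optional[str]:
    """Steps 1 and 2: validate the P131 value, else climb the hierarchy."""
    visited = set()
    current = located_in.get(place)
    while current is not None and current not in visited:
        visited.add(current)
        classes = instance_of.get(current, [])
        if any(is_municipality_class(c, subclass_of) for c in classes):
            return current
        current = located_in.get(current)
    return None  # no municipality found: fall back to step 3 (postal code)


# Placeholder data: a venue located in a borough, which is located in a city.
located_in = {"VENUE": "BOROUGH", "BOROUGH": "CITY", "CITY": "PROVINCE"}
instance_of = {"BOROUGH": ["BOROUGH_CLASS"], "CITY": ["Q3327873"]}
subclass_of = {"Q3327873": ["Q3788231"], "Q3788231": ["Q15284"]}

print(first_municipality("VENUE", located_in, instance_of, subclass_of))
```

A `None` result marks the venues that would need the postal code lookup from step 3.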

I'm not sure OpenStreetMap will be any more useful than Wikidata. Venue information in OSM is too incomplete and unreliable to perform venue queries by name. Several Wikimedians are contributing to OSM to improve data completeness and mapping with Wikidata, though.

BTW: If/when you stumble upon place hierarchy issues, drop a note in the Slack LODEPA WG6 and tag Dessa and me. One of us will fix it.

fjjulien commented 8 months ago

Of note, the WikiProject Cities and towns recommends that local administrative territorial entities (cities or towns) be an instance of (P31) their government designation.

For the province of Quebec, that would be the class local municipality of Quebec (Q3327873).

This class is a subclass of municipal government in Canada (Q3788231), which is a subclass of municipality (Q15284).
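Because of this subclass chain, a SPARQL property path can test for "instance of municipality or any subclass" in one pattern. Here is a hedged sketch of a query builder, assuming the query is sent to the Wikidata Query Service (where the `wd:` and `wdt:` prefixes are predefined). The function name `municipality_query` is hypothetical, and the query returns every municipality found up the P131 chain, not only the closest one.

```python
import re


def municipality_query(place_qid: str) -> str:
    """Build a SPARQL query for the municipalities a place is located in.

    wdt:P131+ follows "located in" one or more levels up;
    wdt:P31/wdt:P279* matches instances of municipality (Q15284)
    or of any of its subclasses, such as Q3327873.
    """
    # Reject anything that is not a plain Wikidata item ID.
    if not re.fullmatch(r"Q[1-9]\d*", place_qid):
        raise ValueError(f"not a Wikidata item ID: {place_qid!r}")
    return (
        "SELECT DISTINCT ?municipality WHERE {\n"
        f"  wd:{place_qid} wdt:P131+ ?municipality .\n"
        "  ?municipality wdt:P31/wdt:P279* wd:Q15284 .\n"
        "}"
    )
```

Calling `municipality_query` with a venue's item ID yields a query string that can be pasted into the query service for a quick check.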