ad-freiburg / osm2rdf

Convert OpenStreetMap (OSM) data to RDF Turtle, including the object geometries and predicates geo:sfContains and geo:sfIntersects. Weekly updated downloads for the whole planet (~ 40 billion triples) and per country.
https://osm2rdf.cs.uni-freiburg.de
GNU General Public License v3.0

Convert OpenStreetMap wikimedia_commons tags to IRIs #74

Open 1ec5 opened 5 months ago

1ec5 commented 5 months ago

In OpenStreetMap and OpenHistoricalMap, the wikimedia_commons key can be set to a page name on Wikimedia Commons. The page is typically a file description page in the File: namespace, but sometimes it’s a category in the Category: namespace or a gallery in the main namespace instead. It would be convenient to have osm:wikimedia_commons triples that point to sdc: entities for files. Categories and galleries are usually linked to Wikidata items in the same manner as Wikipedia articles, so I suppose there would be schema:about triples for those.

One use case is to associate map features with Commons images for a better sense of context. OSM-based applications like OsmAnd can fetch the image and associated licensing information using direct MediaWiki API calls because they’re working with a single map feature at a time. OpenHistoricalMap/issues#581 tracks something similar that would be built into the OHM website. But there’s also some value in being able to query for linked images en masse. For example, a query could return map features whose wikimedia_commons tag:

hannahbast commented 5 months ago

@1ec5 Thanks for the suggestion. For example, here are two triples of the predicate osmkey:wikimedia_commons in the current dataset, one with an object that starts with "File:" and one with an object that starts with "Category:" (for osm-germany, there are only 31 triples where it's neither):

osmnode:773640801 osmkey:wikimedia_commons "File: Berlin-Mitte 10-2012 View from Panorama Point img01.jpg"
osmway:23492927 osmkey:wikimedia_commons "Category: Feuersteinfelder"

So what should the triples you are envisioning look like, and what do (or should) the triples look like that connect these to https://qlever.cs.uni-freiburg.de/wikidata or https://qlever.cs.uni-freiburg.de/wikimedia-commons ?

1ec5 commented 5 months ago

Node 773640801 and way 23492927 are fairly unusual in that they contain a space after the namespace (which MediaWiki normalizes anyways). Nevertheless, it would be nice to be able to write something like:

osmnode:773640801 osm:wikimedia_commons ?file . # sdc:M22277555
SERVICE <https://qlever.cs.uni-freiburg.de/api/wikimedia-commons> {
  ?file wdt:P1259 ?geometry .
}
osmway:23492927 osm:wikimedia_commons ?category . # wd:Q1409723
SERVICE <https://qlever.cs.uni-freiburg.de/api/wikidata> {
  ?category wdt:P625 ?geometry .
}

However, I don’t know whether it’s a good idea or not to use the same predicate for both files on SDC and items on Wikidata.
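
The whitespace normalization mentioned above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name `normalize_commons_title` and the `KNOWN_NAMESPACES` table are hypothetical, it is not how osm2rdf or MediaWiki actually implement this, and it only covers the File: and Category: prefixes discussed in this thread. Resolving the normalized title to an sdc: or wd: IRI would still need a separate lookup, which is not shown.

```python
import re

# Hypothetical table: only the two namespaces discussed in this thread.
KNOWN_NAMESPACES = {"file": "File", "category": "Category"}


def normalize_commons_title(value):
    """Split a wikimedia_commons tag value into (namespace, title).

    The namespace is "" when no known prefix is found, which is how this
    sketch treats gallery pages in the main namespace.
    """
    prefix, sep, rest = value.partition(":")
    namespace = KNOWN_NAMESPACES.get(prefix.strip().casefold()) if sep else None
    if namespace is None:
        # No recognized prefix: keep the whole value as a main-namespace title.
        namespace, rest = "", value
    # Treat underscores like spaces, collapse runs, and trim the ends.
    title = re.sub(r"[\s_]+", " ", rest).strip()
    # Upper-case the first character of the title, leaving the rest untouched.
    title = title[:1].upper() + title[1:]
    return namespace, title


if __name__ == "__main__":
    for raw in (
        "File: Berlin-Mitte 10-2012 View from Panorama Point img01.jpg",
        "Category: Feuersteinfelder",
    ):
        print(normalize_commons_title(raw))
```

With the two tag values quoted earlier, this yields the namespaces `File` and `Category` with the stray space after the colon removed, so both spellings of a tag would map to the same page title before any IRI lookup.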

1ec5 commented 3 months ago

> Refers to an image that depicts something completely different than the map feature

For example, a mapper wanted to query for coordinate mismatches between the two datasets, but I was unable to come up with anything that works.
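
The comparison step behind such a mismatch check could look roughly like the following Python sketch. It assumes the two coordinate pairs have already been retrieved from the two datasets by some other means, so it does not solve the query problem itself. The names `haversine_m` and `is_mismatch` and the 1 km default threshold are hypothetical and chosen only for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def is_mismatch(osm_latlon, other_latlon, threshold_m=1000.0):
    """Flag a pair of (lat, lon) tuples lying further apart than threshold_m."""
    return haversine_m(*osm_latlon, *other_latlon) > threshold_m
```

The threshold would need tuning: a photo's point of view can legitimately be some distance away from the feature it depicts.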