PeterBrodersen / osmetymology


OSM Etymology

Etymology map based on OpenStreetMap and Wikidata. This is geared for Danish content.

OpenStreetMap has references to Wikidata for a bunch of Danish streets. This project aims to make the information searchable.

Overview

OpenStreetMap is a freely available map resource. Wikidata is a freely available structured data resource.

OpenStreetMap uses tags such as name:etymology:wikidata to link to Wikidata items. Using these items it is possible to show maps based on different topics such as country, gender, profession and so on. Check out an example from Open Etymology Map showing a map of Odense grouped by occupation.
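As a rough illustration of how such a tag can be read, the sketch below pulls Wikidata item IDs out of a tag set. The function name `etymology_qids` and the example tags are hypothetical and not part of this project; the only assumptions are that the tag value holds one or more item IDs separated by semicolons, as is customary for multi-valued OpenStreetMap tags, and that an item ID is the letter Q followed by digits.

```python
import re

# A Wikidata item ID is "Q" followed by a positive number, e.g. Q5.
QID_PATTERN = re.compile(r"^Q[1-9]\d*$")


def etymology_qids(tags: dict) -> list:
    """Return the Wikidata item IDs found in name:etymology:wikidata.

    Hypothetical helper for illustration; not taken from the project code.
    """
    raw = tags.get("name:etymology:wikidata", "")
    # Multi-valued OpenStreetMap tags are conventionally semicolon-separated.
    candidates = [part.strip() for part in raw.split(";")]
    # Drop empty or malformed values instead of failing on them.
    return [qid for qid in candidates if QID_PATTERN.match(qid)]


# Made-up tag set; the street name and item IDs are placeholders.
example_tags = {
    "highway": "residential",
    "name": "Eksempelvej",
    "name:etymology:wikidata": "Q12345; Q67890",
}
print(etymology_qids(example_tags))
```

Once the item IDs are known, properties of the linked Wikidata items (occupation, gender, country and so on) can be used to group or colour the streets.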

Install

Requirements

Running the import script generates the aggregated GIS table as well as a supporting FlatGeobuf file (for web usage) and a CSV file (for a simple overview).

To retrieve updated data, simply run the import script again. GeoFabrik usually updates its extracts about once a day.

For web usage:

  1. Copy config/db.example.php to config/db.php and update the variables with your database credentials.
  2. Point your web server to the www folder.

All done!

Code

The web project is based on Leaflet with PostgreSQL as DB backend. No OpenStreetMap editing feature is planned.

The FlatGeobuf map file contains all the data that is shown when clicking the map.

A search option allows users to search for street names.

Import process

The import script works as follows:

  1. Download a copy of the OpenStreetMap data for Denmark from GeoFabrik
  2. Download the geometry of the Danish municipalities from DAWA
  3. Import into PostgreSQL using osm2pgsql with Flex output, storing keys in a JSON field
  4. Import the Danish municipality boundaries
  5. Create an aggregated table of the imported data, grouped by name and etymology, so a street is not stored as several individual road segments
  6. Fetch every Wikidata item referenced in the OpenStreetMap data as well as their "Instance of" items
  7. Save the geometry table as a FlatGeobuf file for the web service as well as a CSV file
  8. Profit!
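To make the Wikidata step more concrete, here is a minimal sketch of one way to look up items and their "Instance of" values. It is an assumption, not the project's actual import code, which may use a different method. It relies on the public Wikidata `wbgetentities` API, which accepts up to 50 IDs per request for regular users, and on "Instance of" being Wikidata property P31. The helper names `entity_request_urls` and `instance_of` are made up for this example.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"
BATCH_SIZE = 50  # wbgetentities limit per request for regular (non-bot) users


def entity_request_urls(qids):
    """Build one wbgetentities URL per batch of up to 50 unique item IDs."""
    unique = sorted(set(qids))
    urls = []
    for start in range(0, len(unique), BATCH_SIZE):
        batch = unique[start:start + BATCH_SIZE]
        query = urlencode({
            "action": "wbgetentities",
            "ids": "|".join(batch),
            "props": "labels|claims",
            "languages": "da|en",
            "format": "json",
        })
        urls.append(f"{API_URL}?{query}")
    return urls


def instance_of(entity):
    """Return the item IDs of all "Instance of" (P31) values of one entity."""
    ids = []
    for claim in entity.get("claims", {}).get("P31", []):
        snak = claim.get("mainsnak", {})
        # Skip "somevalue"/"novalue" statements, which carry no item ID.
        if snak.get("snaktype") == "value":
            ids.append(snak["datavalue"]["value"]["id"])
    return ids


# Trimmed-down sample of the JSON structure returned for a single entity.
sample_entity = {
    "claims": {
        "P31": [
            {"mainsnak": {"snaktype": "value",
                          "datavalue": {"value": {"id": "Q5"}}}}
        ]
    }
}
print(instance_of(sample_entity))
```

Each URL can be fetched with any HTTP client; the response contains an `entities` object keyed by item ID, and each entry can be passed to `instance_of`. The returned "Instance of" IDs can then be looked up in a second round to get their labels.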

The municipality split is based on the idea that any named conceptual road should exist only once in a municipality. Every road segment of a street with a specific name should be considered part of the same conceptual road. OpenStreetMap neither groups roads with the same name in the same area nor splits roads at municipality boundaries, and roads do not carry the official Danish municipality and street codes (3+4 digits).

Performing the grouping and split makes it easier to answer conceptual questions such as:

In these cases it makes no sense to tally up every road segment with the same name or Wikidata item. This would result in an arbitrary count, as even a straight road might consist of several individual segments with different speed limits, lane counts, surface materials, one-way rules, and so on.
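The difference between the two ways of counting can be shown with a small sketch. The data is invented for illustration (placeholder street names and item IDs), and the project itself does this aggregation in PostgreSQL rather than in Python.

```python
from collections import Counter

# Made-up road segments: (municipality, street name, etymology item ID).
segments = [
    ("Odense", "Eksempelvej", "Q100"),
    ("Odense", "Eksempelvej", "Q100"),  # same street, different speed limit
    ("Odense", "Eksempelvej", "Q100"),  # same street, different surface
    ("Odense", "Prøvegade", "Q200"),
    ("Aarhus", "Eksempelvej", "Q100"),  # same name, another municipality
]

# Naive tally: every segment counts, so the result depends on how the
# road happens to be split up in OpenStreetMap.
per_segment = Counter(qid for _, _, qid in segments)

# Conceptual tally: each (municipality, name, etymology) counts once.
per_street = Counter(qid for _, _, qid in set(segments))

print("segments:", per_segment["Q100"])
print("streets:", per_street["Q100"])
```

Here the item Q100 is counted four times per segment but only twice per conceptual street, once for each municipality.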

Updating the map

The service does not provide any edit feature, but several editors and other services can help you. Check out e.g. MapComplete Etymology Map.

Check out the editing article for more information about caveats and issues.

Editors and data sources

OpenStreetMap and Wikidata can be edited by anyone. One of the most used editors for adding etymology data to streets and other objects is the MapComplete Etymology Map. Advanced users can of course use other editors such as JOSM as well.

Some of the more active users for adding etymological data in Denmark are Søren Johannessen and Peter Brodersen.

Adding data

There are multiple options for figuring out the origin of a street name, such as:

Caveats

Some items might be deceptive, not unlike the linguistic topic of false friends. These are cases where the answer seems obvious but where the devil is in the detail.

A couple of examples:

Resources

Check the source list for different locations in Denmark.

Bugs

Probably several (check Issues). Currently the most important:

Ways outside municipality boundaries

Currently all ways are aggregated based on name and etymology and afterwards split based on intersecting municipality boundaries.

However, as Danish municipality boundaries by definition do not extend beyond the coastline, some items are lost, such as Christian D. IV's Bro, which spans a canal that is connected to the sea.

More info: Issue #8

Other resources

Similar projects exist, such as Open Etymology Map (GitHub).