EdwardBetts / osm-wikidata

Match OSM entities with Wikidata items
http://osm.wikidata.link/
GNU General Public License v3.0

Match all Wikipedia pages of a certain category #530

Open simon04 opened 4 years ago

simon04 commented 4 years ago

Sometimes I'd prefer working on a specific field of interest (such as mountain peaks). Therefore I'd like to fetch all Wikipedia pages of a certain category (such as Category:Mountains of the Alps) and find matches for those. Would you consider adding support for this orthogonal approach to linking OSM and Wikipedia/Wikidata?
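To make "fetch all Wikipedia pages of a certain category" concrete, here is a minimal sketch using the MediaWiki API's `list=categorymembers` query. It is an illustration only, not part of osm-wikidata. It assumes English Wikipedia, lists only direct members (no subcategory recursion), and the function names and User-Agent string are made up for the example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"  # assumption: English Wikipedia


def category_query_params(category, cmcontinue=None):
    """Build MediaWiki API parameters for one page of category members."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": "0",  # articles only; skips subcategories and files
        "cmlimit": "500",
        "format": "json",
    }
    if cmcontinue:
        params["cmcontinue"] = cmcontinue
    return params


def _http_get_json(params):
    """Default fetcher: one GET request against the API, decoded as JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent; a real client should identify itself properly.
    req = urllib.request.Request(url, headers={"User-Agent": "category-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def fetch_category_titles(category, fetch_json=None):
    """Return titles of all articles directly in the category, following continuation."""
    fetch_json = fetch_json or _http_get_json
    titles = []
    cmcontinue = None
    while True:
        reply = fetch_json(category_query_params(category, cmcontinue))
        titles.extend(m["title"] for m in reply["query"]["categorymembers"])
        cmcontinue = reply.get("continue", {}).get("cmcontinue")
        if not cmcontinue:
            return titles
```

Calling `fetch_category_titles("Category:Mountains of the Alps")` would issue live requests. The `fetch_json` parameter exists so the paging logic can be exercised without network access.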

EdwardBetts commented 4 years ago

My first experiments in matching OSM and Wikidata started with categories. The feedback from the community was that they preferred changes grouped together geographically. That's why the system is built around places.

The candidate page now shows a list of the most common types of items, with the option to filter on the type. This is an example of mountains in Haute-Savoie, France:

https://osm.wikidata.link/candidates/relation/7407?isa=Q8502
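The link above suggests a simple pattern: the OSM element type and ID go in the path, and the Wikidata item type goes in the `isa` query parameter (`Q8502` is the Wikidata item for "mountain"). A small sketch that builds such links follows. The pattern is inferred only from the URLs quoted in this thread, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Base path as it appears in the URLs quoted in this thread.
BASE = "https://osm.wikidata.link/candidates"


def candidates_url(osm_type, osm_id, isa=None):
    """Build a candidates-page URL, optionally filtered by a Wikidata item type."""
    url = f"{BASE}/{osm_type}/{osm_id}"
    if isa:
        url += "?" + urlencode({"isa": isa})
    return url


print(candidates_url("relation", 7407, isa="Q8502"))
```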

I have an alternative system for filtering areas by type during the matching process. This can handle much larger areas than would normally be possible, because the matching process only needs to check a certain class of objects. This is currently only available to me; I've not made a web user interface for it yet.

OSM has a relation that represents the Alps. I was able to run the matcher looking only for mountains within the Alps. The result is available here: https://osm.wikidata.link/candidates/relation/2698607

I'm not sure why it only finds 16 candidate matches.

Getting back to your feature request: it should be possible to use a Wikipedia category to search for items to match. The matcher will still need a geographical location in OSM to match with. The Wikipedia category will be used as a filter.
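The idea of using a category as a filter on top of a geographic search can be sketched as a set-membership check. This is an assumption about how such a filter might look, not the matcher's actual code. The candidate structure (a dict with `qid` and `enwiki` keys) and the placeholder IDs are invented for illustration.

```python
def normalise(title):
    """Treat underscores and spaces in page titles as equivalent."""
    return title.replace("_", " ").strip()


def filter_by_category(candidates, category_titles):
    """Keep candidates whose English Wikipedia article title is in the category.

    candidates: items already found inside the OSM area (hypothetical shape).
    category_titles: article titles that belong to the chosen category.
    """
    wanted = {normalise(t) for t in category_titles}
    return [
        c for c in candidates
        if c.get("enwiki") and normalise(c["enwiki"]) in wanted
    ]


# Placeholder data; the IDs are not real Wikidata identifiers.
candidates = [
    {"qid": "Q-example-1", "enwiki": "Mont_Blanc"},
    {"qid": "Q-example-2", "enwiki": "Chamonix"},
    {"qid": "Q-example-3", "enwiki": None},  # item without an English article
]
in_category = ["Mont Blanc", "Matterhorn"]

for item in filter_by_category(candidates, in_category):
    print(item["qid"], item["enwiki"])
```

In this sketch the geographic step decides which items are considered at all, and the category only narrows that list, which mirrors the order described above.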

I'm working on changing the system so that following a link on the search results page will display a confirmation page, instead of triggering the matcher to run immediately. This would display some information about the matcher job that's about to be triggered, maybe with a map. There will be options to start the match or return to the search results.

I will add filtering to the confirmation page. This means it'll be possible to filter the match run by Wikidata item type or by Wikipedia category.

EdwardBetts commented 4 years ago

Here are mountains within different Alpine mountain ranges: