osm-search / Nominatim

Open Source search based on OpenStreetMap data
https://nominatim.org
GNU General Public License v3.0

Example of Wikipedia article not being read for weighting? #1508

Closed skylerbunny closed 4 years ago

skylerbunny commented 5 years ago

There are several rivers named 'Wolf River' out there. First, two whose Wikipedia rankings do work:

place_id=198643705 (This has had a Wikipedia article for a long time. It works.)
place_id=198895205 (I attached this Wikipedia article to the waterway relation just recently. It was also interpreted properly and changed the rank of the object.)

Here's one that doesn't: place_id=198733351. This item should have the article 'en:Wolf River (Fox River tributary)' attached to it, but Nominatim never actually interprets it. I don't know whether Nominatim fails to recognize the string as a valid article in the first place, or whether it parses the string but then can't make sense of the details inside it. If I put a completely different article on this place ID (I tried 'en:Copper Country State Forest' just for fun), Nominatim correctly interpreted and ranked THAT article. So it is specifically the pairing of place ID 198733351 with the article 'en:Wolf River (Fox River tributary)' that Nominatim can't read.
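For anyone trying to reproduce this, here is a rough sketch of how the tag value could be split into language and title, and how a details lookup URL for the place could be built. This is not Nominatim's actual parser. The helper names (parse_wikipedia_tag, details_url) are made up for illustration, and the base URL and the place_id/format query parameters are assumptions about the public instance's details endpoint.

```python
from urllib.parse import urlencode


def parse_wikipedia_tag(value, default_lang="en"):
    """Split an OSM wikipedia tag value such as 'en:Some Title'
    into (language, title). Hypothetical helper, not Nominatim code."""
    lang, sep, title = value.partition(":")
    if not sep:
        # No 'lang:' prefix at all; fall back to an assumed default language.
        return default_lang, value.strip().replace("_", " ")
    # Note: a title that itself contains ':' but has no language prefix
    # would be mis-split here; this sketch does not handle that case.
    return lang.strip().lower(), title.strip().replace("_", " ")


def details_url(place_id, base="https://nominatim.openstreetmap.org/details"):
    """Build a details lookup URL for a place_id (assumed endpoint and parameters)."""
    return f"{base}?{urlencode({'place_id': place_id, 'format': 'json'})}"


if __name__ == "__main__":
    print(parse_wikipedia_tag("en:Wolf River (Fox River tributary)"))
    print(details_url(198733351))
```

The parentheses in the title survive the split unchanged, so at least in this naive reading the string itself looks like a well-formed tag value. Keep in mind that place IDs are specific to one database and can change after a re-import.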

Perhaps this is an indication of a wider problem, where other place_ids have Wikipedia articles attached that also don't work. If so, once this is fixed, Wikipedia rankings should probably be rescanned in full to fix other examples proactively.

mtmail commented 5 years ago

> Wikipedia rankings should probably be rescanned in full

Thanks a lot. I suspect the Wikipedia/Wikidata data is missing from the Nominatim database. We have new scripts (https://github.com/openstreetmap/Nominatim/pull/1475) and also import scripts for Wikidata, but we haven't applied them yet.

skylerbunny commented 5 years ago

I'd love to see what the result of applying those is: specifically, whether it picks up place 198733351 as 'Interesting', or whether something else in the parsing is the underlying cause.

lonvia commented 4 years ago

The new Wikipedia importances are there. The good news is that your Wolf River now has the Wikipedia link. The bad news is that its ranking is so low that it is still below unlinked rivers. We might have to do something about that, but let's first see how the new importances do in general.

In the meantime you could add Wikipedia tags to the other Wolf Rivers, if possible. That will likely downrank those.

Closing as the missing Wikipedia link is solved.