osmlab / name-suggestion-index

Canonical common brand names, operators, transit and flags for OpenStreetMap.
https://nsi.guide
BSD 3-Clause "New" or "Revised" License

Wikidata sync script may need modifying to accommodate Wikidata's new multilingual label feature #10112

Open Snowysauce opened 3 weeks ago

Snowysauce commented 3 weeks ago

At some point this year, Wikidata implemented a multilingual label feature intended to cut down on the number of identical labels and aliases for items across different languages. I ran the Wikidata sync script earlier, and it came across an item where a user utilized the multilingual label feature and subsequently blanked the labels of most languages, including English. The Wikidata sync script ignored the multilingual label and added a default English label as it does for Wikidata items without multilingual or English labels. (Given how new the feature is, I admit that this was the expected behavior.)

I wrote "may need modifying" in the issue title because, in this specific case, the English label the script chose differed from the value set for the multilingual label. We may want to skip writing an English label when the value the script would set is identical to the multilingual label, since that would only duplicate it. However, there may also be situations where the multilingual label is flawed and an English override would be appropriate.
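To make the trade-off concrete, here is a minimal sketch of the decision described above. It is not the sync script's actual code. The function names and the returned action strings are hypothetical, the language code `mul` for the multilingual label is an assumption, and `labels` is assumed to have the shape of the `labels` object in Wikidata entity JSON (language code mapped to an object with a `value` field).

```python
from typing import Optional

# Assumption: "mul" is the language code Wikidata uses for the multilingual label.
MUL = "mul"


def label_value(labels: dict, lang: str) -> Optional[str]:
    """Return the non-empty label text for `lang`, or None if missing or blank."""
    entry = labels.get(lang)
    if not entry:
        return None
    value = entry.get("value", "").strip()
    return value or None


def decide_english_label(labels: dict, candidate: str) -> str:
    """Decide what to do with the English label the script would like to set.

    Returns one of four hypothetical actions:
      "keep"   - an English label already exists, leave it alone
      "write"  - no English or multilingual label, add the default English label
      "skip"   - the multilingual label already equals the candidate
      "review" - the multilingual label differs, so an override may be warranted
    """
    if label_value(labels, "en") is not None:
        return "keep"
    mul = label_value(labels, MUL)
    if mul is None:
        return "write"
    if mul == candidate:
        return "skip"
    return "review"
```

In this sketch, the "review" case is deliberately not resolved automatically, since whether the multilingual label or the script's English value is correct seems like a judgment call for a human.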

1ec5 commented 2 weeks ago

Wow, thanks for spotting this change – it could impact a lot of OSM data consumers that rely on labels for translations.

1ec5 commented 2 weeks ago

> I ran the Wikidata sync script earlier, and it came across an item where a user utilized the multilingual label feature and subsequently blanked the labels of most languages, including English.

Do you recall which item this was? I’m curious what’s being affected by multilingual labels at this point.

Snowysauce commented 1 week ago

The item was Q549181.