osmandapp / OsmAnd

[Feature request] Add precomputed name:* tags from wikidata in map preparation step #7253

Open stalker314314 opened 5 years ago

stalker314314 commented 5 years ago

The goal is to localize as many entities as possible into as many languages as possible by leveraging the wikidata tag.

How: during the map preparation step, for each entity that has a wikidata tag, iterate over the labels of its Wikidata entry in all languages and add any missing name:* tag to that entity. Each added tag needs to be marked as "computed", so that it is ignored when showing "Edit POI" in OsmAnd.
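
A minimal sketch of that merge step, to make the idea concrete. It assumes the Wikidata labels for the entity have already been fetched into a map from language code to label. The class and method names are hypothetical and are not part of OsmAndMapCreator. Computed names are returned separately from the original tags, which is one possible way to keep them marked as "computed".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not an existing OsmAnd class.
final class ComputedNameMerger {

    /**
     * Returns the name:<lang> tags that could be added to an entity.
     * Only languages without an existing name:<lang> tag are included,
     * so names mapped in OSM always take precedence over Wikidata labels.
     *
     * @param osmTags        tags of the OSM entity as mapped
     * @param wikidataLabels language code -> label (assumed to be prefetched)
     */
    static Map<String, String> computeMissingNames(Map<String, String> osmTags,
                                                   Map<String, String> wikidataLabels) {
        Map<String, String> computed = new LinkedHashMap<>();
        // Entities without a wikidata tag are left untouched.
        if (!osmTags.containsKey("wikidata")) {
            return computed;
        }
        for (Map.Entry<String, String> label : wikidataLabels.entrySet()) {
            String key = "name:" + label.getKey();
            if (!osmTags.containsKey(key)) {
                computed.put(key, label.getValue());
            }
        }
        return computed;
    }
}
```

Keeping the computed names in their own map means the original OSM tags are never modified, so anything that writes back to OSM can simply ignore the second map.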

OSM is very clear that unneeded transliteration should be avoided (https://wiki.openstreetmap.org/wiki/Names#Avoid_transliteration), and gives the example of data consumers decorating map data. This is an idea for how OsmAnd can do it too. Imagine a world base map in your native language, for all entities :)

Note that this is not a substitute for transliteration, since transliteration can cover many more entities, but Wikidata can provide 1) more correct transliterations and 2) names that are independent of script/language pairs. IMHO, both approaches are needed.

Considerations:

Two steps to pull this off, IMHO:

  1. add support for (generic) computed tags
  2. add name:* tags in OsmAndMapCreator
    a. Maybe only to the world basemap as a start?

vshcherb commented 5 years ago

The world basemap doesn't need it, because mostly only the names of large cities have special translations, and they are already in the map.

stalker314314 commented 5 years ago

While I admit I am a layman in OsmAnd, I have been looking my way around this a bit. So, if my reasoning is correct, this could be done in the following steps. Some questions are in bold.

  1. Add support for adding name:* to MapData:stringNames during OBF creation (for non-world-basemap maps only). Generation would be guarded by a new field in IndexCreatorSettings (propagated to batch.xml). These computed tags/names would be treated the same as any other name. Since they cannot be used by the "Edit POI" functionality, they should be fine to add. This is under the assumption that entities in MapData can only be rendered, never edited? This alone will greatly improve map localization, I think.
    • If this proves fine, maybe add it to the world basemap too
  2. This part I am not quite sure about. Other entities that can render text come from POIs and buildings (maybe something else?), but it will require adding new flags to signify computed columns. To give an example with POIs, it would mean adding a new field to the OBF format, namely:
    message OsmAndPoiBoxDataAtom {
        ...
        repeated uint32 computedTextCategories = n;
    }

    This is the only idea that came to me for adding computed columns in a way that is a) backward-compatible, b) performant and c) does not require changing a lot of stuff.

    • OBF generation would be changed so that computed names can easily be passed back and forth during serialization. Probably there would be another constant besides private static final char SPECIAL_CHAR = ((char) 0x60000); to signify a computed name when it is encoded. Also, MapRulType would gain isComputed (better than creating new rules with some fake tags).
    • On the OsmAnd side, text rendering would not change. "Edit POI" would need to be changed to exclude computed names when showing the edit dialog and when submitting it to OSM. I am not sure what other parts of OsmAnd need to be adapted to these computed columns.
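
To illustrate the "Edit POI" change described in the last bullet, here is a rough sketch of filtering out computed names before the edit dialog is filled or a change is submitted. It assumes the reader side can recover the set of computed tag keys (for example from something like the proposed computedTextCategories field). The class and method names are hypothetical and do not exist in OsmAnd.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not an existing OsmAnd class.
final class EditPoiTagFilter {

    /**
     * Returns only the tags that came from OSM itself, dropping every tag
     * whose key is listed as computed. The input map is not modified.
     *
     * @param allTags      all tags read from the OBF for one POI
     * @param computedKeys keys of tags generated at map-preparation time (assumed available)
     */
    static Map<String, String> editableTags(Map<String, String> allTags,
                                            Set<String> computedKeys) {
        Map<String, String> editable = new LinkedHashMap<>();
        for (Map.Entry<String, String> tag : allTags.entrySet()) {
            if (!computedKeys.contains(tag.getKey())) {
                editable.put(tag.getKey(), tag.getValue());
            }
        }
        return editable;
    }
}
```

Applying the same filter both when opening the dialog and when building the upload would keep computed names from ever reaching OSM, while rendering can keep using the full tag map.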

If the assumptions I made are correct, this would also be the order in which to pull this off. Parts 1 and 2 are independent. Please answer my questions and correct my thinking :)