Closed westnordost closed 5 years ago
> the data is already in a pretty good format and that it would be easier if the library directly consumed the iD data.
Great - reuse of existing code/systems is better IMHO, so fully supportive of this.
> Are you still up to this, @atomoil?
Yes I’m up for it. I’ve no experience with gradle so would prefer Node or Python. If you prefer Python I’ll use that. I’ll do the work in a PR and once it’s ready we can discuss the details.
I have now written a Groovy script (for the Gradle task):
    import groovy.json.JsonOutput
    import groovy.json.JsonSlurper

    def targetDir = "src/main/assets/osmnames"
    def presetsUrl = new URL("https://raw.githubusercontent.com/openstreetmap/iD/master/data/presets/presets.json")
    def contentsUrl = new URL("https://api.github.com/repos/openstreetmap/iD/contents/dist/locales")

    // Make sure the target directory exists, then save presets.json as-is
    new File(targetDir).mkdirs()
    new File("$targetDir/presets.json").withOutputStream { it << presetsUrl.openStream() }

    // List the files in dist/locales and keep only the preset strings of each
    def slurper = new JsonSlurper()
    slurper.parse(contentsUrl, "UTF-8").each {
        if (it.type == "file") {
            def content = slurper.parse(new URL(it.download_url), "UTF-8")
            // Take the first top-level entry of the locale file and pick presets.presets from it
            def presets = content.values()[0]?.presets?.presets
            if (presets) {
                def json = JsonOutput.prettyPrint(JsonOutput.toJson([presets: presets]))
                new File("$targetDir/${it.name}").write(json, "UTF-8")
            }
        }
    }
It saves presets.json to the target dir, queries the files in the locales dir and iterates through them, saving only what this library needs to the target dir.
If there is both a python script and the groovy script (for gradle), users of this library have the choice to either use the python or the groovy script.
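For the Python option, the same steps could look roughly like the sketch below. It only mirrors the Groovy script above and is not a finished tool: the function names (strip_locale, fetch_json, update_presets) are made up for illustration, and it assumes, like the Groovy script does, that the first top-level entry of each locale file holds the strings under presets.presets.

```python
import json
import urllib.request
from pathlib import Path

PRESETS_URL = "https://raw.githubusercontent.com/openstreetmap/iD/master/data/presets/presets.json"
CONTENTS_URL = "https://api.github.com/repos/openstreetmap/iD/contents/dist/locales"


def strip_locale(content):
    """Return {"presets": ...} holding only the preset strings, or None if there are none."""
    # Assumption: the first top-level entry is the locale itself (e.g. "de": {...}).
    first = next(iter(content.values()), None)
    if not isinstance(first, dict):
        return None
    presets = (first.get("presets") or {}).get("presets")
    return {"presets": presets} if presets else None


def fetch_json(url):
    """Download a URL and parse the response as JSON."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def update_presets(target_dir="src/main/assets/osmnames"):
    """Save presets.json as-is and a stripped copy of every locale file."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    with urllib.request.urlopen(PRESETS_URL) as response:
        (target / "presets.json").write_bytes(response.read())

    # The contents listing has one entry per file in dist/locales.
    for entry in fetch_json(CONTENTS_URL):
        if entry.get("type") != "file":
            continue
        stripped = strip_locale(fetch_json(entry["download_url"]))
        if stripped:
            text = json.dumps(stripped, indent=2, ensure_ascii=False)
            (target / entry["name"]).write_text(text, encoding="utf-8")
```

Calling update_presets() from the project root would then do the download; strip_locale can be tried on its own with any already downloaded locale file.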
@atomoil Sorry for the changing requirements.
tl;dr - See last paragraph
My first idea was simply to have a JSON file that, per locale, brings together tags, names and keywords, and you were already almost done with it.

But then I had a deeper look into the iD preset system, the source of the data, and decided that the data is already in a pretty good format and that it would be easier if the library directly consumed the iD data. The data in its original form, presets.json, contains quite a bit of additional information that is useful for this purpose. I'll just enumerate these pieces of information and why they are useful. For each preset:
- geometry: with this, the results of a lookup can be narrowed down further.
- searchable: deprecated tags are not searchable by search word, but can still be found when looking up the feature name (tags -> name).
- matchScore: a hardcoded value to help sort results in a reasonable order.
- suggestion: the data from the name-suggestion-index, the brand names, are already merged into the presets.json with this flag set to true. That's convenient!
- addTags: for brand names, these are the tags that should be added when looking up the tags for a given name (name -> tags), but that are not necessary to find the entry when looking up the other way round (tags -> name).
- countryCodes: set for features that only exist in certain countries (usually brand names). Also very useful to further narrow down the results of a lookup.
- icon id and imageURL: maybe also useful.
- aliases: aliases next to the primary feature name.

Also, I gave up the idea of creating a separate osmnames as a JavaScript module, because the iD module is already doing a very good job, and with it the idea of maybe migrating the data to another data source at a later point in time. If something should be done about iD and presets, it is to move the iD presets parsing and handling into a separate module so that other apps can use it, not to write something of our own.

So, this is why I think that, in the end, this library should (almost) consume the original iD data, that is, the presets.json as the unlocalized base data source and then the relevant strings from the locales directory. Now, the only catch is that the JSON files in the locales directory contain all the strings for iD, while we only need the localizations of what's in presets.

So, I think the best and easiest way now would be to simply define a Gradle task that downloads the current presets.json as-is and also all files in the /dist/locales directory from the iD repository and puts them in /app/src/main/assets/osmnames. For the latter, though, the script should strip the JSON files of everything but the localizations under the presets key. This could be a simple (preferably) Python script I can execute by hand or, better, a Gradle script defined purely in the build.gradle. I'm not sure how to do the latter; the Gradle docs have some basic examples. Are you still up to this, @atomoil?
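To make the use of the extra preset fields listed above a bit more concrete, here is a rough Python sketch of a name lookup that filters by geometry, searchable and countryCodes and sorts by matchScore. It is only an illustration, not how the library will work: find_presets and its parameters are hypothetical, names is assumed to be a plain mapping from preset id to localized name, and the defaults used for missing fields (searchable true, matchScore 1.0) are assumptions as well.

```python
def find_presets(presets, names, term, geometry=None, country=None):
    """Return ids of presets whose name starts with term, best match first.

    presets: preset id -> preset dict, as loaded from presets.json
    names:   preset id -> localized name (assumed shape, for illustration)
    """
    term = term.lower()
    hits = []
    for preset_id, preset in presets.items():
        # Presets flagged as not searchable are skipped for search by word.
        if not preset.get("searchable", True):
            continue
        # Narrow down by geometry if the caller asked for one.
        if geometry and geometry not in preset.get("geometry", []):
            continue
        # Skip country-specific presets that do not apply to the given country.
        codes = [code.lower() for code in preset.get("countryCodes", [])]
        if codes and country and country.lower() not in codes:
            continue
        if names.get(preset_id, "").lower().startswith(term):
            hits.append((preset.get("matchScore", 1.0), preset_id))
    # Highest matchScore first; the sort is stable, so ties keep their order.
    hits.sort(key=lambda hit: -hit[0])
    return [preset_id for _, preset_id in hits]
```

A call such as find_presets(presets, names, "bak", geometry="point", country="de") would then return only the matching point features that apply in Germany.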