pelias / whosonfirst

Importer for Who's on First gazetteer
MIT License

display concordances in addendum #527

Closed missinglink closed 2 years ago

missinglink commented 2 years ago

Display concordances in the addendum where available.

resolves https://github.com/pelias/whosonfirst/issues/526

missinglink commented 2 years ago

Looks good:

[Image: Screenshot 2021-08-17 at 09 34 27]

Other addendum fields are prefixed with the source, but I felt that didn't really make sense this time, since concordances represent the interoperability of disparate sources 🤷

Of course this means we're adopting the whosonfirst key schema, which I think is fine.
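A minimal sketch of the shape described above, assuming the record's properties are available as a plain object. `buildAddendum` is a hypothetical helper, not the importer's actual code, and the short keys `gn:id` and `wd:id` are assumed examples of the WOF key schema:

```javascript
// Hypothetical helper, for illustration only.
// `properties` stands in for the properties of a Who's on First record.
function buildAddendum(properties) {
  const concordances = properties['wof:concordances'];

  // Only emit the key when the record actually carries concordances.
  if (!concordances || Object.keys(concordances).length === 0) {
    return {};
  }

  // The top-level key is deliberately not prefixed with the source name,
  // and the inner keys keep the WOF key schema unchanged.
  return { concordances: { ...concordances } };
}

// Example keys are assumptions; the values are taken from this thread.
const addendum = buildAddendum({
  'wof:concordances': { 'gn:id': 6547384, 'wd:id': 'Q64' }
});
console.log(JSON.stringify(addendum));
```

Records without concordances produce an empty object, so nothing is added to the addendum for them.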

[Image: Screenshot 2021-08-17 at 09 34 31]

missinglink commented 2 years ago

I've added a couple more commits to map fields outside wof:concordances where it makes sense. This increases the number of IDs available:

[Image: Screenshot 2021-08-17 at 10 19 10]

related: https://github.com/whosonfirst-data/whosonfirst-data/issues/1956

missinglink commented 2 years ago

Fix handling of underscore vs colon delimiters:

[Image: Screenshot 2021-08-17 at 10 46 18]

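
One way the delimiter handling could look is sketched below. This is an assumption rather than the code from the commit: it normalises keys to the colon form, and it treats only the last underscore as the delimiter so that a prefix such as `qs_pg` survives intact. `normalizeKey` and `normalizeConcordances` are hypothetical names.

```javascript
// Hypothetical normaliser: rewrites "gn_id" style keys as "gn:id".
// Assumption: the last underscore separates the prefix from the field name.
function normalizeKey(key) {
  // Keys that already use a colon are left untouched.
  if (key.includes(':')) {
    return key;
  }
  const idx = key.lastIndexOf('_');
  return idx === -1 ? key : `${key.slice(0, idx)}:${key.slice(idx + 1)}`;
}

// Apply the key normalisation to every entry of a concordances object.
function normalizeConcordances(concordances) {
  return Object.fromEntries(
    Object.entries(concordances).map(([key, value]) => [normalizeKey(key), value])
  );
}

console.log(normalizeConcordances({ 'gn_id': 6547384, 'wd:id': 'Q64' }));
```

If two spellings of the same key appear in one record, the later entry wins in this sketch; a real implementation would need to decide how to resolve that.
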
orangejulius commented 2 years ago

Nice, this will actually be very good to have.

If I were to make one change it would be for us to build a mapping of the short name to a longer name, so that it would be something like this instead:

{
  "concordances": {
    "geoplanet_id": 667027,
    "wikidata_id": "Q64",
    "geonames_id": 6547384,
    "quattroshapes_id": 630199
  }
}

Otherwise we will likely have to do some work to document the possible keys everywhere, as I doubt most people will know what they mean. I was actually surprised to see in the WOF schema you pasted that there are only 7 options, so at least this won't be too much work.

I don't think it's too late to make this change if you also agree with it :)
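
A rough sketch of such a mapping follows. The long names are the ones from the JSON above; the short keys on the left (`gp:id`, `wd:id`, `gn:id`, `qs:id`) are assumed to be the corresponding WOF keys, and `LONG_NAMES` and `renameConcordances` are hypothetical names. Keys without a mapping are passed through unchanged rather than dropped.

```javascript
// Assumed short-key to long-name table; only four entries are shown.
const LONG_NAMES = new Map([
  ['gp:id', 'geoplanet_id'],
  ['wd:id', 'wikidata_id'],
  ['gn:id', 'geonames_id'],
  ['qs:id', 'quattroshapes_id']
]);

// Rename known short keys and keep unknown keys as they are.
function renameConcordances(concordances) {
  const renamed = {};
  for (const [key, value] of Object.entries(concordances)) {
    renamed[LONG_NAMES.get(key) || key] = value;
  }
  return renamed;
}

console.log(renameConcordances({
  'gp:id': 667027,
  'wd:id': 'Q64',
  'gn:id': 6547384,
  'qs:id': 630199
}));
```

Using a `Map` avoids accidental matches against inherited object properties when looking up arbitrary keys.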

missinglink commented 2 years ago

Yeah I considered doing that but I didn't want to get into defining naming conventions 😆 Agreed that it's unlikely that users will be able to understand the acronyms.

I'm not really sure I understand which concordances are available myself! Some not listed above are qs_pg:id, which I guess would be quattroshapes_points_gazetteer_id, and wd:page. I'm not sure which other ones there are, so I decided to simply adopt the WOF schema for now.

missinglink commented 2 years ago

Eventually I would like to see all the datasets outputting the same keys:

[Image: Screenshot 2021-08-17 at 14 47 44]

missinglink commented 2 years ago

It's never too late, but I think it would pay to define that schema up front. The addendum isn't a public API per se, so it can evolve over time.