osm-search / Nominatim

Open Source search based on OpenStreetMap data
https://nominatim.org
GNU General Public License v3.0

Subdomain place_id.openstreetmap.org to redirect Nominatim, a PURL of places #757

Closed ppKrauss closed 7 years ago

ppKrauss commented 7 years ago

The OpenStreetMap place_id is open and free, which makes it the most reliable and stable place identifier on the Internet!
Today the URL template is: http://nominatim.openstreetmap.org/details.php?place_id={value}

PROBLEMS: it is not a Cool URI, and it is not very short. It depends on a script name, details.php, which may change in the future. So it is not persistent, and it can't be used in a QR code, CD-ROM, or PDF link that must still resolve 5 or 10 years later.

SOLUTION SUGGESTION: redirect place_id.openstreetmap.org/{value} to nominatim.openstreetmap.org/details.php?place_id={value}, so that the short form becomes a PURL (Persistent URL) of places.
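
A minimal Python sketch of the mapping the suggested redirect would perform. The function name and the validation rule (digits only) are assumptions made for illustration; no such subdomain or redirect exists in Nominatim.

```python
from urllib.parse import urlencode

# Target template taken from the proposal; the redirect itself is hypothetical.
TARGET = "http://nominatim.openstreetmap.org/details.php"


def redirect_target(short_path: str) -> str:
    """Map the path of a short URL (e.g. "/12345") to the long details URL."""
    value = short_path.strip("/")
    # Assumption: a place_id is a plain ASCII decimal number.
    if not (value.isascii() and value.isdigit()):
        raise ValueError(f"not a numeric place_id: {short_path!r}")
    return f"{TARGET}?{urlencode({'place_id': value})}"


if __name__ == "__main__":
    print(redirect_target("/12345"))
```

A web server would answer a request for the short path with an HTTP redirect to the returned URL.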

Examples

lonvia commented 7 years ago

There are two fundamental misconceptions here, I'm afraid.

First, the Place ID returned by Nominatim is an internal identifier used to reference the current objects in the database. It is not necessarily stable over time and certainly not the same between different installations. The closest we have to an identifier that is stable across installations is the combination of osm_type/osm_id/class. However, given that OSM ids are not meant to be stable either, this is still not an identifier that is stable over time.

Second, the details page is meant as a debugging tool for understanding how Nominatim computed its data internally. It is not meant to be linked to by external applications. It is certainly not guaranteed to be stable in any way.
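
To illustrate the osm_type/osm_id/class combination mentioned above, here is a minimal Python sketch of such a composite key. The type name, the "R/12345/boundary" string format, and the sample id are assumptions for illustration, not something Nominatim defines. Such a key is comparable across installations but, as said above, still not stable over time.

```python
from typing import NamedTuple


class OsmKey(NamedTuple):
    osm_type: str   # "N" (node), "W" (way) or "R" (relation)
    osm_id: int     # id of the OSM object
    osm_class: str  # main tag key, e.g. "boundary" or "place"


def parse_key(text: str) -> OsmKey:
    """Parse a hypothetical "type/id/class" string into an OsmKey."""
    osm_type, osm_id, osm_class = text.split("/")
    if osm_type not in ("N", "W", "R"):
        raise ValueError(f"unknown OSM type: {osm_type!r}")
    return OsmKey(osm_type, int(osm_id), osm_class)


def format_key(key: OsmKey) -> str:
    """Inverse of parse_key."""
    return f"{key.osm_type}/{key.osm_id}/{key.osm_class}"


if __name__ == "__main__":
    print(parse_key("R/12345/boundary"))
```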

ppKrauss commented 7 years ago

Hi @lonvia, thanks for the reply, and sorry for my misconceptions... I was "groping in the dark" (my poor knowledge of Nominatim) and perhaps guided by other place IDs I had in mind... So my next post, #758, was meant to check and to make the suggestion more objectively (and you answered that a stable ID is outside the scope of Nominatim).

Can I try to redo the suggestion? Let's simplify and reduce the scope. Country codes are like place IDs, with the advantage that they are "transparent IDs" (concept) and very stable.

Country codes are at the core of Nominatim (as suggested by country_name.sql and stable maps since 2010), so we could implement a kind of experimental parameter, such as place_urn for details.php, to get a place by its "transparent ID", e.g. details.php?place_urn=br.

... This is not only a suggestion: I would like to collaborate as a PostgreSQL and PHP programmer.

So, if this first, small and less ambitious suggestion works fine... I can move on to the next step, the ISO 3166-2 subdivisions (e.g. DE 16 states, BR 27 subdivisions, etc.). Example: details.php?place_urn=br:am to retrieve the Amazonas state of BR.
PS: hierarchy.php may help me understand how to get it.
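
A minimal Python sketch of how a place_urn value such as br or br:am could be parsed. The syntax (two-letter country code, optional ":" plus a subdivision code of one to three alphanumeric characters) and all names are assumptions for illustration; place_urn is only a proposal, not an existing Nominatim parameter.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical syntax: "cc" or "cc:sub", case-insensitive.
_PLACE_URN = re.compile(r"([a-z]{2})(?::([a-z0-9]{1,3}))?")


class PlaceUrn(NamedTuple):
    country: str                # lower-case ISO 3166-1 alpha-2 code
    subdivision: Optional[str]  # lower-case subdivision part, if any

    def iso_3166_2(self) -> Optional[str]:
        """Return the ISO 3166-2 style code (e.g. "BR-AM"), or None."""
        if self.subdivision is None:
            return None
        return f"{self.country.upper()}-{self.subdivision.upper()}"


def parse_place_urn(value: str) -> PlaceUrn:
    match = _PLACE_URN.fullmatch(value.strip().lower())
    if match is None:
        raise ValueError(f"invalid place_urn: {value!r}")
    return PlaceUrn(match.group(1), match.group(2))


if __name__ == "__main__":
    print(parse_place_urn("br"))
    print(parse_place_urn("br:am").iso_3166_2())
```

The sketch only validates the shape of the value; checking that the code actually exists would need a lookup against a country and subdivision table.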

freyfogle commented 7 years ago

Hi,

I've done some work involving ISO 3166-2 codes and OSM data. It's harder than you would think. Last year France, for example, reorganized its internal divisions, merging some of them. The problem is that ISO codes update on a different schedule, so there was a period where the ISO codes were out of sync with the legal reality. That may still be the case; I haven't checked in a while.

You also have cases where subdivisions have an ambiguous status, for example Puerto Rico. It has its own ISO 3166-2 code "PR" despite being part of the United States "US". The world is full of these historical anomalies, each different from the others. See for example the "Realm of New Zealand" or the various Dutch sub-countries, etc.

My advice is that your time and technical abilities could be better invested in other, more urgent projects.

ppKrauss commented 7 years ago

Hi @freyfogle, sorry for my English; let me try to explain.

> I've done some work involving ISO 3166-2 codes and OSM data. It's harder than you would think. Last year France, for example, reorganized its internal divisions, merging some of them. The problem is that ISO codes update on a different schedule, so there was a period where the ISO codes were out of sync with the legal reality. That may still be the case; I haven't checked in a while.

Yes, we know the problems... But we do not need to wait years for ISO committees: we are the community of curators who decide on our "canonical names", e.g. at https://github.com/datasets/country-codes. We are a kind of fact-checking curators at https://github.com/datasets

The concept of a canonical name is not strictly the "ISO name" but something "as ISO as possible" for the community, including the OpenStreetMap community.

Of course, we have no ambition to go far beyond what each country already defines: if there are no ISO 3166-2 codes for a country's subdivisions, that country has no data. But 90% of countries have ISO subdivisions, and that is enough: we do not need to wait, see br-state-codes.csv. See the fields "creation" and "extinction", which ensure disambiguation. This is used, e.g., for jurisdiction disambiguation in a URN LEX.
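
A minimal Python sketch of how "creation" and "extinction" fields can disambiguate a code that was used by different entities at different times. Only those two field names come from the text above; the other columns, the rows, and the function name are made up for illustration, and the real br-state-codes.csv may look different.

```python
import csv
import io
from typing import Optional

# Illustrative rows only (not real data). An empty "extinction" means the
# entry is still valid.
SAMPLE = """code,name,creation,extinction
xa,Old Example State,1900,1950
xa,New Example State,1960,
xb,Other Example State,1900,
"""


def resolve(code: str, year: int, text: str = SAMPLE) -> Optional[str]:
    """Return the name that `code` referred to in `year`, or None."""
    for row in csv.DictReader(io.StringIO(text)):
        if row["code"] != code:
            continue
        created = int(row["creation"])
        extinct = int(row["extinction"]) if row["extinction"] else None
        # Valid from the creation year up to, but excluding, the extinction year.
        if created <= year and (extinct is None or year < extinct):
            return row["name"]
    return None


if __name__ == "__main__":
    print(resolve("xa", 1920))
    print(resolve("xa", 2000))
```

Treating the extinction year as exclusive is a design choice of this sketch; a real dataset would have to state which convention it uses.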

... I think we, as the Open Knowledge community, have full control over our canonical names. And "we" means OSM, LexML, and any other community that needs its own canonical names.

> You also have cases where subdivisions have an ambiguous status, for example Puerto Rico. It has its own ISO 3166-2 code "PR" despite being part of the United States "US". The world is full of these historical anomalies, each different from the others. See for example the "Realm of New Zealand" or the various Dutch sub-countries, etc.

The anomalies can be handled with RDF, which works like fact-checking to reinforce the canonical name... Wikidata is a good reference for stable IDs, e.g. Puerto Rico is Q1183 at Wikidata. The canonical name Puerto Rico has the canonical abbreviation PR, and both are reinforced by their association with the canonical ID of the concept, Q1183.

> My advice is that your time and technical abilities could be better invested in other, more urgent projects.

I like Nominatim. My urgent need is to link canonical names (such as the jurisdiction in a URN LEX) with maps ;-)

PS: I imagine that, by supporting Nominatim today, OpenStreetMap may one day become an organization that assigns URNs... see the fresh RFC 8141 of 2017.

mtmail commented 7 years ago

I understand and support the idea. Canonical names, starting with countries and eventually covering every place, are something to aim for. Wikipedia does it with Wikidata ids (which are often linked from OSM places).

> The OpenStreetMap place_id is open and free, which makes it the most reliable and stable place identifier on the Internet!

That place_id more or less starts at 1, and whenever we recreate the database the ids change. The ids aren't even the same between servers (the nominatim.osm.org service has two servers for redundancy). It's not permanent, and we warn users against treating it as such.

So I don't see Nominatim as a good fit for your project. A new process/logic/system should create/assign/verify/enforce/life-cycle-manage permanent ids. Nominatim can of course later be used to search by that permanent id (if it's part of the OSM data).

mtmail commented 7 years ago

I suggest discussing the idea further on the https://lists.openstreetmap.org/listinfo/talk or https://lists.openstreetmap.org/listinfo/dev mailing list.

ppKrauss commented 6 years ago

Hi @mtmail, you say "That place_id more or less starts at 1 and whenever we recreate the database the ids change (...)", but now I see that we can assume the ID is stable (!): see the relation IDs used at Wikidata, in use perhaps since 2014 or before (it seems 2013) without loss: https://www.wikidata.org/w/index.php?title=MediaWiki:Gadget-AuthorityControl.js&oldid=179329592

It is used as an "eternal ID" at https://www.wikidata.org/wiki/Property:P402
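
For reference, a minimal Python sketch of how an ID stored in P402 can be turned into links. Note that P402 holds an OpenStreetMap relation ID (the osm_id of a relation), which is a different number from the Nominatim place_id discussed earlier in this thread. The function names and the sample id are assumptions for illustration; the sketch only builds URLs and makes no request.

```python
def osm_relation_url(relation_id: int) -> str:
    """URL of the relation page on openstreetmap.org."""
    if relation_id <= 0:
        raise ValueError("relation id must be positive")
    return f"https://www.openstreetmap.org/relation/{relation_id}"


def nominatim_lookup_url(relation_id: int) -> str:
    """URL of a Nominatim lookup by OSM id; "R" marks a relation."""
    if relation_id <= 0:
        raise ValueError("relation id must be positive")
    return f"https://nominatim.openstreetmap.org/lookup?osm_ids=R{relation_id}"


if __name__ == "__main__":
    # 12345 is a placeholder id, not a reference to a specific place.
    print(osm_relation_url(12345))
    print(nominatim_lookup_url(12345))
```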