OSGeo / PROJ

PROJ - Cartographic Projections and Coordinate Transformations Library
https://proj.org

Add Wikidata concept identifier at pj_list #3069

Open ppKrauss opened 2 years ago

ppKrauss commented 2 years ago

The pj_list is the official "Full list of current projections", and connects each PROJ label (an internal identifier) to its name. The proposal here is to also connect the label to a public identifier, to improve semantic discovery, projection taxonomy, etc. Example:

PROJ_HEAD(aea, "Albers Equal Area", "Q1587118")
PROJ_HEAD(aeqd, "Azimuthal Equidistant", "Q1753196")

This can be used to generate documentation automatically (e.g. by templating), for example this table with links to Wikidata:

PROJ label   Name (link to Wikidata)
aea          Albers Equal Area
aeqd         Azimuthal Equidistant
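
A minimal sketch of how such a table could be generated from the list itself, assuming the quoted three-argument format proposed above. `DEMO_PROJ_LIST`, `ProjDocRow` and `render_doc_table` are hypothetical names for illustration, not part of PROJ, and the two entries stand in for the real pj_list.h:

```cpp
#include <sstream>
#include <string>

// Hypothetical two-entry stand-in for pj_list.h, using the proposed
// three-argument form (the real file currently has two arguments per entry).
#define DEMO_PROJ_LIST \
    PROJ_HEAD(aea, "Albers Equal Area", "Q1587118") \
    PROJ_HEAD(aeqd, "Azimuthal Equidistant", "Q1753196")

// One documentation row per projection.
struct ProjDocRow {
    const char *label;
    const char *name;
    const char *wikidata_id;
};

// Expand every PROJ_HEAD entry into an initializer for one row.
// #id turns the bare label token into a string literal.
#define PROJ_HEAD(id, name, wikidata_id) {#id, name, wikidata_id},
static const ProjDocRow demo_rows[] = {DEMO_PROJ_LIST};
#undef PROJ_HEAD

// Render a Markdown table whose name column links to the Wikidata item.
std::string render_doc_table() {
    std::ostringstream out;
    out << "| PROJ label | Name |\n|---|---|\n";
    for (const ProjDocRow &row : demo_rows) {
        out << "| " << row.label << " | [" << row.name
            << "](https://www.wikidata.org/wiki/" << row.wikidata_id << ") |\n";
    }
    return out.str();
}
```

Calling `render_doc_table()` returns the table as a string, which a documentation build step could write to a file.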

It's easy to look up a Wikidata ID. There are several ways:

  1. Find correct projection name or concept at the List of map projections. At the projection's page, check "Tools/Wikidata Item" menu at the vertical side bar.
  2. Use a SPARQL query. (click the "PLAY" button)
  3. ... use your favorite search engine.

If it does not exist on Wikidata, or needs a "concept specialization", you can edit Wikidata (collaborate!)...
Or say so here in the issue (e.g. I can help).

busstoptaktik commented 2 years ago

That's not a bad idea, and I'm sure a pull request will be much appreciated. The PROJ_HEAD macro, however, is an (elegant, but opaque) hack, originally devised by Jerry Evenden to facilitate customized builds with reduced projection support. Changing it consistently is therefore a very intrusive operation: you need to edit every single coordinate operation implementation file, as the material in pj_list.h is really just an assemblage of the material from the individual implementations.

But you may be able to get started and get some value from just adding stuff in the pj_list.h file: the material will not be passed on to the executables, but at least it will be available in the code base. That makes it a non-intrusive start, which does not change the observable behaviour of the system but adds the metadata to the source.

I may have missed some aspects (@kbevers, @rouault?), but it appears to me that you may simply change this line in list.cpp: https://github.com/OSGeo/PROJ/blob/e3d7e18f988230973ced5163fa2581b6671c8755/src/list.cpp#L12 to

#define PROJ_HEAD(id, name, extern_id) extern "C" struct PJconsts *pj_##id(struct PJconsts*);

telling the macro to be aware of the extern_id, but also to ignore it for now.

Then edit all the lines in pj_list.h from this format:

PROJ_HEAD(aea, "Albers Equal Area")
PROJ_HEAD(aeqd, "Azimuthal Equidistant")

to (almost) your suggested format. Note that I have left out the quotation marks from your external identifiers: when a string is needed we can use the C preprocessor "stringification" operator, whereas there is no "unstringification" operator taking us the other way:

PROJ_HEAD(aea, "Albers Equal Area", Q1587118)
PROJ_HEAD(aeqd, "Azimuthal Equidistant", Q1753196)

At least a step toward added discoverability, and a good start for later experimentation wrt. how the external ids should be exposed at run time.
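
A minimal sketch of the two expansions described above, assuming the unquoted three-argument format. The first expansion declares the entry points and ignores `extern_id`; the second uses the preprocessor `#` operator to recover it as a string. `DEMO_PROJ_LIST_UNQUOTED`, `ProjExternId` and `wikidata_id_for` are hypothetical names for illustration, and the two-entry list stands in for the real pj_list.h:

```cpp
#include <cstring>

struct PJconsts;  // opaque here; the real definition lives in PROJ's internal headers

// Hypothetical two-entry excerpt of pj_list.h in the unquoted format.
#define DEMO_PROJ_LIST_UNQUOTED \
    PROJ_HEAD(aea, "Albers Equal Area", Q1587118) \
    PROJ_HEAD(aeqd, "Azimuthal Equidistant", Q1753196)

// Expansion 1: declare pj_aea, pj_aeqd, ... and silently drop extern_id.
#define PROJ_HEAD(id, name, extern_id) \
    extern "C" struct PJconsts *pj_##id(struct PJconsts *);
DEMO_PROJ_LIST_UNQUOTED
#undef PROJ_HEAD

// Expansion 2: same list, but stringify the label and the external id.
struct ProjExternId {
    const char *label;
    const char *name;
    const char *wikidata_id;
};

#define PROJ_HEAD(id, name, extern_id) {#id, name, #extern_id},
static const ProjExternId demo_ids[] = {DEMO_PROJ_LIST_UNQUOTED};
#undef PROJ_HEAD

// Look up the external id for a PROJ label; returns nullptr if not listed.
const char *wikidata_id_for(const char *label) {
    for (const ProjExternId &entry : demo_ids) {
        if (std::strcmp(entry.label, label) == 0) {
            return entry.wikidata_id;
        }
    }
    return nullptr;
}
```

Because the identifiers are bare tokens, both expansions work from one list; had they been quoted, only the string form would be available.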

rouault commented 2 years ago

My main question is why Wikidata would be more authoritative (I had no idea it existed before) than other sources, including PROJ's own short id, or EPSG method ids, or DOI of papers presenting a projection, or ... ?

kbevers commented 2 years ago

I'm skeptical too. Especially since we seem to have more projections in our inventory than Wikipedia does (at least when compared to the SPARQL query).

busstoptaktik commented 2 years ago

> My main question is why Wikidata would be more authoritative

Wikidata is not the crux here, and it would not necessarily be "more" authoritative. Actually, it would not be authoritative at all. EPSG method ids, once they are properly mapped to entries in the ISO Geodetic Registry, are probably the only kind of id that could in any reasonable way be called authoritative.

But that really doesn't matter much here: The compiled version of the material is all governed by the PROJ_HEAD macro, and we could define it to handle any number of external identifiers which anyone would volunteer to add:

#define PROJ_HEAD(id, name, wikidata_id, epsg_id, isogr_id) extern "C" struct PJconsts *pj_##id(struct PJconsts*);

would support PROJ, Wikidata, EPSG, and ISO-GR ids alike, without doing any kind of damage to the compiled code, while providing the means to construct a curated list of identified equivalences between the registries (which is useful by itself).

It doesn't do any harm, and it may be useful, even when seen just as a list of human readable text, so if anyone wants to volunteer providing it, I do not think we should stand in their way. It is even an opportunity for non-coders to contribute, since it mostly involves the foot-work of looking up and checking material from external resources.

busstoptaktik commented 2 years ago

> I'm skeptical too. Especially since we seem to have more projections in our inventory than Wikipedia does

There definitely will be a number of UNKNOWN indicators - which, on the other hand, may inspire others to actually implement the corresponding Wikidata entries. It works both ways...
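
A minimal sketch of such a curated equivalence list, assuming the five-argument PROJ_HEAD form and an `UNKNOWN` placeholder token for ids nobody has looked up yet. Only the Wikidata ids come from this thread; the EPSG and ISO-GR columns are deliberately left as `UNKNOWN`. `DEMO_PROJ_LIST_MULTI`, `ProjEquivalence` and `count_unknown_epsg` are hypothetical names for illustration:

```cpp
#include <cstring>

// Hypothetical two-entry list in the five-argument form.
// UNKNOWN marks an identifier that still has to be researched.
#define DEMO_PROJ_LIST_MULTI \
    PROJ_HEAD(aea, "Albers Equal Area", Q1587118, UNKNOWN, UNKNOWN) \
    PROJ_HEAD(aeqd, "Azimuthal Equidistant", Q1753196, UNKNOWN, UNKNOWN)

// One row of equivalences between the registries.
struct ProjEquivalence {
    const char *proj_id;
    const char *name;
    const char *wikidata_id;
    const char *epsg_id;
    const char *isogr_id;
};

// Stringify every bare identifier so the list becomes a data table.
#define PROJ_HEAD(id, name, wikidata_id, epsg_id, isogr_id) \
    {#id, name, #wikidata_id, #epsg_id, #isogr_id},
static const ProjEquivalence demo_equivalences[] = {DEMO_PROJ_LIST_MULTI};
#undef PROJ_HEAD

// Count entries whose EPSG id is still a placeholder, e.g. to report
// how much lookup work remains for volunteers.
int count_unknown_epsg() {
    int missing = 0;
    for (const ProjEquivalence &entry : demo_equivalences) {
        if (std::strcmp(entry.epsg_id, "UNKNOWN") == 0) {
            ++missing;
        }
    }
    return missing;
}
```

A report built this way would show which registries lack a mapping for a given projection, in either direction.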

rouault commented 2 years ago

Actually thinking that the EPSG ID can be tricky, as we have several submodes for a given PROJ method that correspond to several EPSG IDs (like "+proj=lcc", which can map to LCC_1SP, LCC_2SP, LCC_2SP_Belgium, LCC_2SP_Michigan, etc.)

busstoptaktik commented 2 years ago

> Actually thinking that the EPSG ID can be tricky

Definitely - but I do not think it will lead to anything that cannot (as a first measure) be solved by adding a null-macro (PROJ_TAIL?), to register-but-not-use additional registrations.
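
A minimal sketch of what a PROJ_TAIL null-macro could look like, assuming a two-argument form that attaches one extra registration to a PROJ label. PROJ_TAIL does not exist in PROJ; its signature here, the description string for lcc, and the names `DEMO_LCC_ENTRY`, `ProjTail` and `count_tails_for` are all assumptions for illustration. The extra ids are the method names mentioned above, used as bare tokens:

```cpp
#include <cstring>

struct PJconsts;  // opaque here; the real definition lives in PROJ's internal headers

// Hypothetical entry: one PROJ_HEAD followed by several PROJ_TAIL lines,
// one per additional external registration for the same PROJ label.
#define DEMO_LCC_ENTRY \
    PROJ_HEAD(lcc, "Lambert Conformal Conic") \
    PROJ_TAIL(lcc, LCC_1SP) \
    PROJ_TAIL(lcc, LCC_2SP) \
    PROJ_TAIL(lcc, LCC_2SP_Belgium) \
    PROJ_TAIL(lcc, LCC_2SP_Michigan)

// Expansion 1 (the compiled library): PROJ_TAIL expands to nothing,
// so the extra registrations are recorded in the source but not used.
#define PROJ_HEAD(id, name) extern "C" struct PJconsts *pj_##id(struct PJconsts *);
#define PROJ_TAIL(id, extra_id)
DEMO_LCC_ENTRY
#undef PROJ_HEAD
#undef PROJ_TAIL

// Expansion 2 (tooling or documentation): ignore the heads, collect the tails.
struct ProjTail {
    const char *proj_id;
    const char *extra_id;
};

#define PROJ_HEAD(id, name)
#define PROJ_TAIL(id, extra_id) {#id, #extra_id},
static const ProjTail demo_tails[] = {DEMO_LCC_ENTRY};
#undef PROJ_HEAD
#undef PROJ_TAIL

// Count how many extra registrations are attached to a PROJ label.
int count_tails_for(const char *proj_id) {
    int count = 0;
    for (const ProjTail &tail : demo_tails) {
        if (std::strcmp(tail.proj_id, proj_id) == 0) {
            ++count;
        }
    }
    return count;
}
```

This keeps the one-to-many mapping next to the projection it belongs to, without changing what the first expansion compiles to.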

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.