clarify how eventid query param relates to quakeml event publicID

FDSN / fdsnws-event

The FDSN's fdsnws-event web service specification


The eventid query param would seem to be the right way to re-get or update a quakeml event using the publicID. However, there is little to no consistency between datacenters now on how this is done, meaning a client has to guess at how to translate the publicID into the corresponding query param.

For example, the publicID-to-eventid mappings for several datacenters are shown below. Most have overly long publicIDs, and even where it is obvious how to translate the publicID to the eventid, it is much harder than it should be.

From a client perspective, it would be most beneficial if the output publicID could be directly put into the eventid query parameter and have it work. Absent that, a consistent approach to translating publicIDs to eventid query parameters should be part of the fdsn event web service.
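As a rough sketch of the first option, this is what a client would do if a service accepted the publicID verbatim as the eventid. The service root `https://example.org/...` and the function name are hypothetical, and no existing datacenter is claimed to accept such a request; the only real point is that the publicID must be percent-encoded.

```python
from urllib.parse import urlencode

# Hypothetical service root; no existing datacenter is claimed to accept this.
BASE_URL = "https://example.org/fdsnws/event/1/query"


def event_url_from_public_id(public_id: str, base_url: str = BASE_URL) -> str:
    """Build a query URL that passes the publicID verbatim as eventid."""
    # urlencode percent-escapes ':', '/', '=', '?' and '&' inside the value,
    # so a publicID that itself contains a query string stays one parameter.
    return base_url + "?" + urlencode({"eventid": public_id})


if __name__ == "__main__":
    print(event_url_from_public_id("smi:ISC/evid=600516598"))
```

With this approach the client needs no per-datacenter knowledge at all, which is the main attraction of the proposal.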

USGS:

An event from a query has this as its publicID:

publicID="quakeml:earthquake.usgs.gov/fdsnws/event/1/query?eventid=usc000lvb5&format=quakeml"

which suggests the event id should be usc000lvb5 and that a query could be formed by replacing quakeml: with https://, like:

https://earthquake.usgs.gov/fdsnws/event/1/query?eventid=usc000lvb5&format=quakeml

but the returned event now has a different publicID:

publicID="quakeml:us.anss.org/event/c000lvb5"

which is then not parsable in the same way. Here we have one datacenter with two different publicID to eventid mappings.

IRIS:

A query returns an event with publicID:

publicID="smi:service.iris.edu/fdsnws/event/1/query?eventid=3275979"

which suggests the eventid should be 3275979 and that the query can be formed by replacing smi: with http://, which does work.

ISC:

A query returns an event with publicID:

publicID="smi:ISC/evid=600516598"

which suggests taking the value after evid= and using it as the eventid:

http://isc-mirror.iris.washington.edu/fdsnws/event/1/query?eventid=600516598

KNMI (and likely other SeisComP3-based services):

A query returns an event with publicID:

publicID="smi:scs/0.7/knmi2019cgfn"

and perhaps the eventid is knmi2019cgfn:

http://rdsa.knmi.nl/fdsnws/event/1/query?eventid=knmi2019cgfn&nodata=404
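The guessing a client currently has to do can be sketched as follows. This is only a heuristic built from the four examples above, not a documented rule of any datacenter, and the function name `guess_eventid` is made up for illustration.

```python
from urllib.parse import parse_qs


def guess_eventid(public_id: str) -> str:
    """Guess the eventid query value for an event publicID (heuristic only)."""
    # Drop the "smi:" or "quakeml:" scheme prefix.
    _, _, rest = public_id.partition(":")

    # Case 1: the publicID embeds a query string with eventid=... (USGS, IRIS).
    if "?" in rest:
        params = parse_qs(rest.split("?", 1)[1])
        if "eventid" in params:
            return params["eventid"][0]

    # Case 2: the last path segment is key=value (ISC: "ISC/evid=600516598").
    last_segment = rest.rsplit("/", 1)[-1]
    if "=" in last_segment:
        return last_segment.split("=", 1)[1]

    # Case 3: fall back to the bare last path segment (KNMI, second USGS form).
    return last_segment


if __name__ == "__main__":
    for pid in (
        "quakeml:earthquake.usgs.gov/fdsnws/event/1/query?eventid=usc000lvb5&format=quakeml",
        "quakeml:us.anss.org/event/c000lvb5",
        "smi:service.iris.edu/fdsnws/event/1/query?eventid=3275979",
        "smi:ISC/evid=600516598",
        "smi:scs/0.7/knmi2019cgfn",
    ):
        print(pid, "->", guess_eventid(pid))
```

Note that for the second USGS form the heuristic yields c000lvb5 rather than usc000lvb5, so even this three-case guess does not recover the eventid that was used in the query.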

According to the QuakeML 1.2 standard, on which fdsnws-event is based, a publicID is built as follows:

[smi|quakeml]:authority-id/resource-id[#local-id]

The authority-id is assumed to be unique per issuer of QuakeML publicIDs, while the resource-id part should provide uniqueness among all QuakeML objects (i.e. not only events, but also origins, picks, ...) with IDs from this issuer.
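A minimal sketch of splitting a publicID into these parts is shown below. The regular expression is a simplification written for this illustration and is not the pattern from the QuakeML schema; the names `PublicID` and `parse_public_id` are likewise assumptions.

```python
import re
from typing import NamedTuple, Optional


class PublicID(NamedTuple):
    scheme: str
    authority_id: str
    resource_id: str
    local_id: Optional[str]


# Simplified pattern: scheme, then everything up to the first "/" as the
# authority-id, then the resource-id, then an optional "#local-id" suffix.
_PUBLIC_ID = re.compile(r"^(smi|quakeml):([^/]+)/([^#]+)(?:#(.+))?$")


def parse_public_id(public_id: str) -> PublicID:
    """Split a publicID into scheme, authority-id, resource-id and local-id."""
    match = _PUBLIC_ID.match(public_id)
    if match is None:
        raise ValueError(f"not a recognisable publicID: {public_id!r}")
    return PublicID(*match.groups())


if __name__ == "__main__":
    print(parse_public_id("smi:org.myearthquakeservice/eqcats/Smith2008#abcd1234"))
    print(parse_public_id("smi:scs/0.7/knmi2019cgfn"))
```

Applied to the KNMI example, this gives "scs" as the authority-id and "0.7/knmi2019cgfn" as the resource-id, which makes it easy to see which part an issuer has actually filled in.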

Under the assumption that, these days, the organisation and domain names of issuers are quite stable (and do not overlap), the QuakeML standard recommends the following scheme for the authority-id:

<top-level domain>.<organisation/institution>[.sub-unit of organisation]

Because this is inverted relative to the organisation's address, e.g. on web or mail servers, it should prevent confusion of this identifier with an addressable technical resource (URL).
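As an illustration of the inversion, a mechanical reversal of a domain name could look like the sketch below. Treating the authority-id as a simple label reversal of a hostname is an assumption made for this example; the recommendation itself is phrased in terms of organisation and sub-unit, not hostnames, and the function name is hypothetical.

```python
def authority_id_from_domain(domain: str) -> str:
    """Reverse the dot-separated labels of a domain name.

    Assumption: the issuer's domain maps one-to-one onto
    <top-level domain>.<organisation>[.sub-unit].
    """
    labels = [label for label in domain.strip(".").lower().split(".") if label]
    return ".".join(reversed(labels))


if __name__ == "__main__":
    print(authority_id_from_domain("myearthquakeservice.org"))
```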

The resource-id part is the responsibility of the organisation, so the QuakeML standard does not provide recommendations. However, you may consider the following points:

As the publicID is meant to be unique among all types of present (and future) QuakeML objects, you may have it contain a reference to the object type as well as some numeric or alphanumeric enumerator, so that the enumeration scheme does not collide with enumerator values of other object types.
You may want to give some more context, such as a catalog or creation environment, to anticipate that at some point in the future you may want to enumerate objects of the same type independently, and potentially with overlapping values, in a different catalog or creation environment.
The publicID should identify a real-world object (here: an earthquake) for all time, without needing to change, and without real or apparent reference to a technical representation of the information. A change in the technical representation of the object then neither creates a reason to modify the ID (which would signal to the user that it is a different earthquake) nor raises wrong expectations about how and where to find the object.

Based on that, an event publicID may e.g. (but not exclusively) look as follows: smi:org.myearthquakeservice/earthquake/rt-autoloc/abcd1234, or smi:org.myearthquakeservice/eqcats/Smith2008#abcd1234

Looking at the real world examples:

quakeml:us.anss.org/event/c000lvb5 is closest, except that the organisation part does not exactly follow the recommendation and the ID part provides no obvious option for potential future separate event sets.

smi:service.iris.edu/fdsnws/event/1/query?eventid=3275979 mixes resource identification with resource location: once you move to data distribution via fdsn/event/2/, should one change the ID of the event (with the risk that users misinterpret it as a new event), or should one risk users looking for more information on this event at a place that no longer exists?

smi:scs/0.7/knmi2019cgfn misses the purpose of the agency part of the ID. Granted, it implies that "knmi" at the beginning of the resource ID marks the ID as belonging to them, but somebody using just an alphanumeric counter as resource-id will also hit the combination of k, n, m, and i at some point, without implying that the earthquake is Dutch.

Now, all of this does not exactly answer Philippe's question, and I actually think that there is no unique answer (and none can be enforced):

From the point of view of QuakeML, the publicID is the primary, globally unique identifier.
Pre-existing "local" (agency- or catalog-specific) IDs were generated with a different scope and may cover different subsets of the tasks of the global ID, which suggests different mapping patterns.
