MetPX / wmo_mesh

minimal sample to demonstrate mesh network with a pub/sub message passing protocol (mqtt in this case.)
GNU General Public License v2.0

consideration of addressable links #16

Open tomkralidis opened 4 years ago

tomkralidis commented 4 years ago

cc @efucile @alexandreleroux

In the context of WIS discovery metadata, distribution links provide valuable means to guide users to interacting with data.

Consider advertising a WMS map link within a dataset's discovery metadata:

{
    "rel" : "items",
    "type" : "image/png",
    "title" : "OGC:WMS map of this dataset",
    "href" : "https://example.org/wms?service=WMS&version=1.3.0&request=GetMap&layer={layer}&bbox={miny},{minx},{maxy},{maxx}&format={format}&crs={crs}&width={width}&height={height}",
    "templated" : "true"
}

Can we accomplish something similar with AMQP or MQTT? That is, provide a URI which allows a user to click/subscribe (albeit in a known way). It looks like there is a convention for both RabbitMQ and MQTT, which go as far as connection, but there may be value in further specifying this like:

{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "AMQPS feed of this dataset",
    "href" : "amqps://anonymous@example.org/my-vhost/my-exchange/my-topic/my-subtopic"
}

Of course, non-trivial cases would go through other arrangements, but the above may be helpful as a low-barrier entry point to PubSub advertising in WIS metadata.
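To make the idea concrete, here is a minimal Python sketch of what a client would have to do with such an `href`. The path layout (`/vhost/exchange/topic/...`) is only the convention floated in this comment, not an established standard, and `parse_feed_uri` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, unquote


def parse_feed_uri(uri):
    """Split scheme://user@host/vhost/exchange/topic/... into its parts.

    Assumes the path layout proposed in this thread; no broker defines it.
    """
    parts = urlsplit(uri)
    # Drop empty segments (leading slash) and undo any percent-encoding.
    segments = [unquote(s) for s in parts.path.split("/") if s]
    if len(segments) < 3:
        raise ValueError("expected /vhost/exchange/topic[/subtopic...]")
    vhost, exchange, *topic = segments
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,  # None when the URI carries no port
        "vhost": vhost,
        "exchange": exchange,
        "topic": topic,
    }


feed = parse_feed_uri(
    "amqps://anonymous@example.org/my-vhost/my-exchange/my-topic/my-subtopic"
)
print(feed["exchange"], feed["topic"])
```

One caveat with this approach: a `#` wildcard in the path would be read as the start of a URL fragment by a standard parser unless it is percent-encoded.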

petersilva commented 4 years ago

as discussed, some relevant background:

Summary of the above, drawn from only two brokers, though others follow the same "trend" (really a lack of one):

Each broker implements different and conflicting URI/URL conventions.

So... if we want to define a feed URI we kind of have full freedom, because there is nothing to adopt:

the rest... well...

so ideally, the rest of the url would be something like:

exchange/topic/topic/filename or in MQTT just the topic hierarchy.

My first guess would be to use MQTT syntax as-is for the topic hierarchy. An important aspect of any pub-sub mechanism is wildcarding. The current proposals for topic hierarchies include dates, and wildcards are needed and used frequently, so they must be allowed for. MQTT uses '/' as a topic separator (instead of the dot (.) in AMQP), + as a single-topic wildcard (this is * in AMQP), and the hash sign (#) as a "match the rest of the tree" wildcard (same as AMQP).
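As an illustration of those wildcard rules, here is a small Python sketch of MQTT filter matching. `mqtt_matches` is a hypothetical helper, the sample topic (including its date level and file name) is invented for the example, and the special handling MQTT gives to topics starting with `$` is deliberately left out.

```python
def mqtt_matches(topic_filter, topic):
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    if "#" in filter_levels[:-1]:
        raise ValueError("'#' is only valid as the last level of a filter")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Matches the rest of the tree, including no further levels.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            # '+' accepts exactly one level; anything else must be equal.
            return False
    return len(filter_levels) == len(topic_levels)


# Invented topic with a date level, to show why '+' is needed.
topic = "xpublic/v03/post/20200315/DWD/its/true/form/file.grib"
print(mqtt_matches("xpublic/v03/post/+/DWD/#", topic))
print(mqtt_matches("xpublic/v03/post/+/CMC/#", topic))
```

The first filter matches because `+` absorbs the date level and `#` absorbs everything after `DWD`; the second does not, because `CMC` differs from `DWD`.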

That might be enough for simple cases... but I am not sure. I need to play around a bit to see what the above means in practice. Things I am wondering about:

that's all for now...

petersilva commented 4 years ago

@josusky of interest.

petersilva commented 4 years ago

thinking about it... can ignore baseURL, because it will come from the messages, after connection.

josusky commented 4 years ago

Hi Tom, Hi Peter,

Before I dive into technical details I would like to understand the big picture. Are you suggesting having special feeds (brokers + topics ...) to publish some kind of "metadata" (i.e. notifications about new data sources (producers/publishers) that have appeared, and where one can subscribe to start receiving the new data)? And perhaps you expect to periodically re-publish the "metadata" of all existing (known) data sources?

And is your heresy going so far as to suggest that this "metadata" will not be in the form of XML that conforms to a set of sophisticated ISO standards, which even the experts do not know how to validate correctly, but will instead take the form of simplistic JSON? How dare you? If we were in a British film you would certainly be an Australian :-) It is Monday morning, so I might be reading it all wrong, but I like it. Just please confirm (you know: "me native speaker be not").

Now, when it comes to URLs: if we have full control over their elements, then no encoding is needed. For example, I do not think that we need wildcards (+, #, ...) when publishing information about a new data source. I expect the source to publish each type of data under a very concrete topic. Wildcards are usually used by consumers to simplify subscription to multiple types of data. The only problematic part is the date, and that could be addressed by the templating that Tom suggested.

If we do not have full control over some elements, then we need to resort to encoding. In such a case we need to state which part is encoded. For example, the path element that contains the topic is URL-encoded (let's say, because it may contain slashes). The "consumer" then needs to first split the URL into its components and decode the part(s) that need to be decoded.

Now, having re-read what I wrote and remembered that the whole thing is sent as JSON, I think that we could forget about URL encoding completely if we structure it like this:

{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "AMQP feed of a super cool dataset",
    "broker": "amqps://anonymous:nopwd@example.org:12345",
    "vhost": "if-need-at-all",
    "topic": "topic.in.its.true.form/what/ever/it/*/is"
}

I am not sure what the "rel" is, and I can imagine splitting the "broker" even further, but my main point here is that instead of constructing a complex (and possibly invalid) URL that the consumer needs to parse before actual use, we can provide each element of information separately.

tomkralidis commented 4 years ago

@josusky thanks for the feedback. The big picture would be to have PubSub links available from WIS discovery metadata in as actionable a form as possible (publish/find/bind). Example:

On WIS discovery metadata: the current offering is WMCP (XML), however things are slowly evolving into JSON. In theory, given ISO 19115 (which WMCP is a profile of) is an abstract specification, one could define a JSON encoding for same (like ISO 19139 does for XML). The next generation OGC Catalogue Standards (OGC CSW basically becoming OGC API - Records), are making way for much simpler APIs as well as JSON as a core representation. Actually, this is happening for numerous OGC API standards (see https://ogcapi.ogc.org for more info).

Back to links, here is the current thinking around representing links in JSON in OGC API standards. For WIS purposes, we could extend as we want but putting in a proper URI would help with broad interoperability. Let's work on something in between to balance complexity and practicality.

The bigger picture (to be set up in another project/thread) is to set up a WIS 2 pilot between a few of us, using the evolving OGC API standard for discovery (OGC API - Records), for WIS metadata using JSON encoding. Having actionable PubSub would be a huge win to demonstrate easy APIs and easy discovery metadata representations that lower the barrier to access for users. I don't see XML or WMCP going away anytime soon for advanced use, but there is a huge opportunity for "the rest of us".

petersilva commented 4 years ago

What @josusky is saying works too. If topic is a separate entry, then yes, it could be interpreted based on the protocol specified in the broker, or even handed off, as-is. One of the main conclusions I have drawn from this project is that being multi-protocol is good.

Using the original mapping used in this project so far, it would look like:


{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "Feed of a super cool dataset",
    "broker": "amqps://anonymous:nopwd@example.org:12345",
    "vhost": "if-need-at-all",
    "exchange": "xpublic", 
    "topic": [ "v03.post.*.DWD.its.true.form.#", "v03.post.*.CMC.its.true.form.#" ]
}

As you can see in the topic header here, the dot separator would make a really ugly URI, so putting it in a separate topic header makes a great deal of sense, and we don't need any URL encoding then.

AMQP requires an exchange to be specified in order to perform a binding. (In our applications we invented a convention of using "xpublic". Vendor implementations vary in their default names, so no portable solution is possible.)

The client connects to the broker, declares a (convention determined) named queue, and then makes a binding between the exchange and the queue, using the topics.
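A rough Python sketch of how a client might derive those bindings from the link above, without actually connecting to anything. `binding_plan` and the `q_<user>_<suffix>` queue naming are hypothetical, chosen only for illustration; each resulting tuple would be handed to whatever queue-bind call the chosen AMQP client library provides.

```python
import json
from urllib.parse import urlsplit

# The link from the example above, as a client would receive it.
LINK = json.loads("""
{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "Feed of a super cool dataset",
    "broker": "amqps://anonymous:nopwd@example.org:12345",
    "vhost": "if-need-at-all",
    "exchange": "xpublic",
    "topic": [ "v03.post.*.DWD.its.true.form.#", "v03.post.*.CMC.its.true.form.#" ]
}
""")


def binding_plan(link, queue_suffix="demo"):
    """Work out the queue and the exchange-to-queue bindings a client needs."""
    broker = urlsplit(link["broker"])
    topics = link["topic"]
    if isinstance(topics, str):
        topics = [topics]  # accept a single topic as well as an array
    # Hypothetical naming convention; a real deployment defines its own.
    queue = f"q_{broker.username}_{queue_suffix}"
    return {
        "host": broker.hostname,
        "port": broker.port,
        "vhost": link.get("vhost"),
        "queue": queue,
        # One binding per topic: the array acts as a logical OR.
        "bindings": [(link["exchange"], queue, key) for key in topics],
    }


plan = binding_plan(LINK)
for exchange, queue, key in plan["bindings"]:
    print(f"bind {queue} to {exchange} with {key}")
```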

Turning the topic into a JSON array implements OR quite elegantly. When processed and re-published by an MQTT broker, using this project's mapping (AMQP exchange -> top of topic hierarchy), it would look like so:


{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "Feed of a super cool dataset",
    "broker": "mqtts://anonymous:nopwd@example.org:5678",
    "vhost": "if-need-at-all",
    "topic": [ "xpublic/v03/post/+/DWD/its/true/form/#", "xpublic/v03/post/+/CMC/its/true/form/#" ]
}

In the MQTT case, the client connects to the broker with a client_id and subscribes to the topics.

If (as is likely) someone wants to add another protocol later, and it needs some other fields, this method leaves our options a lot more open, and it eliminates the need for encoding conventions.
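The AMQP-to-MQTT mapping described here (exchange becomes the top of the topic hierarchy, `.` becomes `/`, `*` becomes `+`) can be sketched in Python as follows. `to_mqtt_link` is a hypothetical helper, and it assumes `#` only appears at the end of a binding key, since MQTT does not allow it elsewhere.

```python
def to_mqtt_link(amqp_link, mqtt_broker):
    """Rewrite an AMQP feed link as the equivalent MQTT feed link."""
    topics = amqp_link["topic"]
    if isinstance(topics, str):
        topics = [topics]
    # Copy everything except the fields that change or disappear.
    mqtt_link = {
        key: value
        for key, value in amqp_link.items()
        if key not in ("broker", "exchange", "topic")
    }
    mqtt_link["broker"] = mqtt_broker
    mqtt_link["topic"] = [
        "/".join(
            [amqp_link["exchange"]]
            + ["+" if word == "*" else word for word in key.split(".")]
        )
        for key in topics
    ]
    return mqtt_link


amqp_link = {
    "rel": "items",
    "type": "application/json;subtype=x-wmo-wis",
    "title": "Feed of a super cool dataset",
    "broker": "amqps://anonymous:nopwd@example.org:12345",
    "vhost": "if-need-at-all",
    "exchange": "xpublic",
    "topic": ["v03.post.*.DWD.its.true.form.#", "v03.post.*.CMC.its.true.form.#"],
}
mqtt_link = to_mqtt_link(amqp_link, "mqtts://anonymous:nopwd@example.org:5678")
print(mqtt_link["topic"])
```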

tomkralidis commented 3 years ago

Here's an example from DWD (thanks @kaiwirt) in https://gisc.dwd.de/wisportal/#SearchPlace:q?pid=sd1065_wmo_test

        <gmd:onLine>
          <gmd:CI_OnlineResource>
            <gmd:linkage>
              <gmd:URL>amqps://oflkd013.dwd.de:5671</gmd:URL>
            </gmd:linkage>
            <gmd:protocol>
              <gco:CharacterString>AMQPS</gco:CharacterString>
            </gmd:protocol>
            <gmd:name>
              <gco:CharacterString>exchange: netcdf_pilot, routing_key: v03/WIS/de/offenbach_met_com_centre/observation/sea/surface/</gco:CharacterString>
            </gmd:name>
            <gmd:description>
              <gco:CharacterString>WMO Information System, pub/sub messaging for new meteorological data, download products/data via link contained in the message (baseUrl+relPath). Please ask GISC Offenbach for registration to get username/password. Topic structure based on https://github.com/wmo-im/GTStoWIS2</gco:CharacterString>
            </gmd:description>
          </gmd:CI_OnlineResource>
        </gmd:onLine>

So perhaps the equivalent in next generation WIS metadata could be:

{
  "rel": "items",
  "type": "OASIS:AMQPS",
  "title": "cool feed",
  "href": "amqps://oflkd013.dwd.de:5671",
  "wmo:exchange": "netcdf_pilot",
  "wmo:routingKey": "v03/WIS/de/offenbach_met_com_centre/observation/sea/surface/"
}

Thoughts?
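For what it's worth, the step from the ISO record to such a link is mostly string handling. A minimal Python sketch, assuming the `gmd:name` text always follows the `exchange: ..., routing_key: ...` pattern seen in the DWD record (`link_from_online_resource` is a hypothetical helper; the property names are the ones proposed above):

```python
def link_from_online_resource(url, name, title):
    """Build a JSON-style link from the gmd:URL and gmd:name values."""
    fields = {}
    for part in name.split(","):
        # Each part looks like "key: value"; split on the first colon only.
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return {
        "rel": "items",
        "type": "OASIS:AMQPS",
        "title": title,
        "href": url,
        "wmo:exchange": fields["exchange"],
        "wmo:routingKey": fields["routing_key"],
    }


link = link_from_online_resource(
    "amqps://oflkd013.dwd.de:5671",
    "exchange: netcdf_pilot, routing_key: "
    "v03/WIS/de/offenbach_met_com_centre/observation/sea/surface/",
    "cool feed",
)
print(link["wmo:exchange"], link["wmo:routingKey"])
```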

petersilva commented 3 years ago

For AMQP there is also the concept of a vhost... some brokers (RabbitMQ) include it after the port in the href (see the link higher in the thread).

tomkralidis commented 3 years ago

Would vhost be better off as 1./ optional property or 2./ up to the provider to add to the (required) href property if needed?
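Both options can coexist on the client side. A small Python sketch, assuming an explicit `vhost` property takes precedence, that a vhost in the `href` follows the RabbitMQ-style convention of a percent-encoded path after the port, and that `/` (RabbitMQ's default vhost name) is used when neither is given. `resolve_vhost` is a hypothetical helper.

```python
from urllib.parse import urlsplit, unquote


def resolve_vhost(link, default="/"):
    """Pick the vhost from the link: explicit property first, then href path."""
    if "vhost" in link:
        return link["vhost"]
    path = urlsplit(link["href"]).path
    if not path:
        # No path at all: the href does not name a vhost.
        return default
    # Strip the leading slash and undo percent-encoding (%2F -> /).
    return unquote(path[1:])


print(resolve_vhost({"href": "amqps://oflkd013.dwd.de:5671"}))
print(resolve_vhost({"href": "amqps://example.org:5671/my-vhost"}))
print(resolve_vhost({"href": "amqps://example.org:5671", "vhost": "pilot"}))
```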

josusky commented 3 years ago

I am not sure about the vhost, but the "routingKey" is more commonly referred to as the topic (see GTStoWIS2).

josusky commented 3 years ago

Thus perhaps it could be, (in accordance with @petersilva 's example):

{
  "rel": "items",
  "type": "OASIS:AMQPS",
  "title": "cool feed",
  "href": "amqps://oflkd013.dwd.de:5671",
  "vhost": "if-need-at-all",
  "exchange": "netcdf_pilot",
  "wmo:topic": "v03.WIS.de.offenbach_met_com_centre.observation.sea.surface",
  "messageFormat" : "application/json; subtype=x-wmo-wis"
}

@tomkralidis Note that I have removed the "wmo" prefix from the exchange, as that is not a WMO-specific but rather an AMQP-specific thing, same as "vhost". The "topic" is a WMO-specific thing, so it may deserve a namespace prefix. And I have changed "/" to ".", as that is the delimiter for AMQP. But we could decide to always use "/" as a WMO standard that needs to be translated for the underlying pub-sub protocol if needed. In Peter's example, the "type" indicated the actual type of messages that the service sends. I think that needs to be preserved somewhere too, as AMQP brokers are used to provide other types (formats) of messages as well. Therefore, I have added it as "messageFormat".

(added a v to the version spec at the start of the topic)

tomkralidis commented 2 years ago

Update: in OGC API specifications, a link type is the actual MIME type. So using @josusky's latest example:

{
  "rel": "items",
  "type": "application/json",
  "title": "cool feed",
  "href": "amqps://oflkd013.dwd.de:5671",
  "vhost": "if-need-at-all",
  "exchange": "netcdf_pilot",
  "wmo:topic": "v03.WIS.de.offenbach_met_com_centre.observation.sea.surface"
}