me-box / databox

Databox container manager and dashboard server
MIT License

Add driver-facing store API for indicating source state #41

Open yousefamar opened 7 years ago

yousefamar commented 7 years ago

Similar to #9, but in the other direction. Thinking about recent manifest discussions over Skype and email, this could be very useful.

Currently, a driver uses a store's catalogue to "advertise" if a datasource is available. IMO, the sole purpose of store catalogues should be to list datasource metadata, not keep track of datasource state. At the very least, tracking state in the catalogue adds complexity (e.g. being able to delete items when datasources disappear and adding them again with all their metadata when they reappear; see https://github.com/me-box/databox-store-blob/issues/42).

Similarly, we do need to be able to differentiate between when a datasource is inactive (e.g. turned off at the driver, or the source) and when it's active but just not streaming any data. I can think of a number of use cases where an app might need to know if a gap in data is due to a sensor being off vs. if there's just no new data.

So ideally there should be a (subscribable) store endpoint that a driver can POST sensor/datasource states to, and apps can GET (or receive subscription events of) these states from.
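As a rough sketch of the semantics proposed above, here is an in-memory stand-in for such an endpoint. Every name here (`SourceStateStore`, `post_state`, `get_state`, `subscribe`, and the `active`/`inactive` values) is hypothetical and not an existing Databox API; a real store would expose these over HTTP rather than as method calls.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

# Hypothetical state values; the issue only distinguishes "inactive"
# from "active but not streaming".
ACTIVE = "active"
INACTIVE = "inactive"


class SourceStateStore:
    """In-memory stand-in for the proposed subscribable state endpoint."""

    def __init__(self) -> None:
        self._states: Dict[str, str] = {}
        self._subscribers: Dict[str, List[Callable[[str, str], None]]] = defaultdict(list)

    def post_state(self, source_id: str, state: str) -> None:
        """Driver-facing: roughly what a POST of a datasource state would do."""
        if state not in (ACTIVE, INACTIVE):
            raise ValueError(f"unknown state: {state}")
        self._states[source_id] = state
        # Notify every app subscribed to this datasource.
        for callback in self._subscribers[source_id]:
            callback(source_id, state)

    def get_state(self, source_id: str) -> Optional[str]:
        """App-facing: roughly what a GET would return (None if never reported)."""
        return self._states.get(source_id)

    def subscribe(self, source_id: str, callback: Callable[[str, str], None]) -> None:
        """App-facing: register a callback for state changes."""
        self._subscribers[source_id].append(callback)


def explain_gap(store: SourceStateStore, source_id: str) -> str:
    """Tell an app why a datasource has produced no recent data."""
    state = store.get_state(source_id)
    if state == INACTIVE:
        return "source off"
    if state == ACTIVE:
        return "no new data"
    return "unknown"
```

With something like this, an app that sees a gap in its data can call `explain_gap` (or listen for subscription events) instead of inferring the source's state from catalogue changes.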


This would also mean that the store catalogue should list all datasources the store can expose, rather than those it is currently exposing, based on the driver. The store catalogue could then be (and I believe should be) immutable. IMO we should remove store POST /cat entirely, and have a store's catalogue populated at install time by the CM/system. Am I correct in thinking that the only reason we had this to begin with was to allow the run-time addition of datasources (and not the run-time addition of metadata on existing datasources)?

Because if so, I believe we should go one step further, and require that drivers (and apps) are packaged with a static catalogue.json. This catalogue essentially does the job of the proposed publishes manifest field — it lists what an app/driver can publish and any metadata about it. We can use catalogue item metadata to directly match consumers with publishers.
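To make the idea concrete, here is a minimal sketch of loading a packaged catalogue.json once and exposing it read-only. It assumes a Hypercat-style layout (`catalogue-metadata`, `items`, `href`, `item-metadata`, `rel`/`val` pairs); the `urn:X-example:` rels, the `/light-sensor` item, and the `load_static_catalogue` helper are made up for illustration.

```python
import json
from types import MappingProxyType

# Hypothetical contents of a driver's packaged catalogue.json.
CATALOGUE_JSON = """
{
  "catalogue-metadata": [
    {"rel": "urn:X-hypercat:rels:hasDescription:en", "val": "Example driver catalogue"}
  ],
  "items": [
    {
      "href": "/light-sensor",
      "item-metadata": [
        {"rel": "urn:X-example:rels:hasType", "val": "light"},
        {"rel": "urn:X-example:rels:hasVersion", "val": "1.2.0"}
      ]
    }
  ]
}
"""


def load_static_catalogue(text):
    """Parse the catalogue once and return a read-only href -> metadata view."""
    raw = json.loads(text)
    items = {}
    for item in raw["items"]:
        # Simplification: a repeated rel on one item keeps only its last val.
        metadata = {pair["rel"]: pair["val"] for pair in item["item-metadata"]}
        items[item["href"]] = MappingProxyType(metadata)
    # MappingProxyType rejects item assignment, so nothing can be added,
    # removed, or edited through this view at run time.
    return MappingProxyType(items)


catalogue = load_static_catalogue(CATALOGUE_JSON)
```

Because the view is read-only, there is no code path equivalent to POST /cat: the only way to change what a driver can publish is to ship a new catalogue.json.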

Every conjunct in the elements of a manifest's consumes array would essentially be a search string (with some special support for semver constraints and path regex) for catalogue item metadata. This means that we can set standard rels (with existing or custom syntax) for things like format and type (like the TAG-VERSION syntax), or leave it open for developers to annotate their sources and sinks with their own metadata, and Databox will automatically be able to match those. I'm essentially repeating what I wrote in last week's email discussion, but IMO this will make the whole publishes-consumes pairing really versatile and future-proof.
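A minimal sketch of that matching, under several assumptions: each consumes element is modelled as a dict whose keys are AND-ed together, `path` is treated as a regex over the item's href, `version` as a single-comparator constraint (a simplification, not full semver range syntax), and any other key as an exact metadata match. The query shape, item shape, and function names are all hypothetical.

```python
import re
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'MAJOR.MINOR.PATCH' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against one comparator such as '>=1.0.0'."""
    match = re.fullmatch(r"(>=|<=|>|<|=)?\s*(\d+\.\d+\.\d+)", constraint.strip())
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint}")
    op = match.group(1) or "="
    actual, wanted = parse_version(version), parse_version(match.group(2))
    return {
        ">=": actual >= wanted,
        "<=": actual <= wanted,
        ">": actual > wanted,
        "<": actual < wanted,
        "=": actual == wanted,
    }[op]


def item_matches(item: Dict, query: Dict[str, str]) -> bool:
    """True only if every key in the query holds for the item."""
    for key, wanted in query.items():
        if key == "path":
            if re.search(wanted, item["href"]) is None:
                return False
        elif key == "version":
            version = item["metadata"].get("version")
            if version is None or not satisfies(version, wanted):
                return False
        elif item["metadata"].get(key) != wanted:
            return False
    return True


def match_consumes(consumes: List[Dict[str, str]], items: List[Dict]) -> List[List[str]]:
    """For each consumes element, list the hrefs of the items that satisfy it."""
    return [[item["href"] for item in items if item_matches(item, query)]
            for query in consumes]
```

Because unknown keys fall through to an exact metadata comparison, developer-defined annotations get matched without any changes to the matcher, which is the open-endedness argued for above.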

Thoughts and opinions on these suggestions much appreciated as always.

P.S. We could, in theory, replace the entire manifest with a catalogue (since "catalogue metadata" is separate from "item metadata" and the manifest can be pretty much encoded into the former), but one argument against that is that a manifest is easier to read than a catalogue.

cgreenhalgh commented 7 years ago

In general I like the idea of the catalogue information being (largely) statically pre-defined and outside the driver code (e.g. so it can be audited more readily). The two exceptions I can think of are:

I would definitely keep the manifest format and the catalogue formats separate for now.

mor1 commented 7 years ago

My two penn'orth: