SignalK / specification

Signal K is a JSON-based format for storing and sharing marine data from different sources (e.g. NMEA 0183, NMEA 2000, SeaTalk, etc.)

RFC 0005: Streaming values over a range of transports #294

Open pod909 opened 7 years ago

pod909 commented 7 years ago

RFC 0005: Streaming values over a range of transports

Authors: Ric Morris, Tim Mathews. Status: DRAFT. Version: 0.6. Date: 09/12/2016

This document contains ideas and content first suggested by Teppo Kurki, Paul Sumpner, Sailoog, Toqduj, Jim Boynes, Rob Huitema, Brian P and others.

Summary

This document describes a consistent approach to communicating updates to individual values, notifications and actions in the Signal K namespace over a range of transports using delta format messages.

The use of existing intermediate transports is proposed wherever a specific challenge exists.

e.g.

If there is a need to restrict the range of messages received then a subscription based protocol such as MQTT should be used.

Preference is given to open standards with free and well supported libraries for a minimum of C/C++ and JavaScript.

Motivation

We have yet to see a single transport with end-to-end support from a low power/capability device all the way to the browser. Browser and app implementations focus on Web Sockets, which are unavailable at the device level unless tied to a proprietary cloud service. Support for the kind of protocol that makes sense on a low power, low capability device would force unnecessary compromise in a more full featured environment.

As the marine device sector is by its nature low volume careful consideration needs to be given to minimising the cost of implementation. In general any barrier to easy adoption is to be avoided.

Because of these issues there is a strong desire from the community to send and receive Signal K values over a range of transports. This is consistent with the transport agnostic approach to developing Signal K to date. There is a danger with this approach that a radically different set of rules may be defined for each transport, making implementation and maintenance more difficult.

Interest has been shown in serial, plain TCP socket, Web Socket and MQTT, and the features of these transports have been considered.

Browsers are limited to Web Sockets in terms of the streaming protocols they support and, in order to maintain the ability to deliver thin clients in the browser, there is a requirement to push UI function down to the server. Additional features are defined when using Web Sockets as the intermediate transport to support this.

Examples

Updates grouped into a single delta message.

{"updates":[{"source":{"device":"achme-thing-1234","type":"signalk-delta"},
    "values":[{"$path":"navigation.courseOverGround","value":14.5},
    {"$path":"navigation.speedOverGround","value":6.7}]}]}

Definitions

Leaf: A value or a notification JSON object.

Message: A delta JSON object used to communicate updates to leaves.

Provider: A network node that sends messages.

Consumer: A network node that receives messages.

Service: A server that acts as a provider or a consumer.

Requirements

GENERAL APPLICATION: The conditions described in the JSON schema apply to the objects they define.

1. Connection

1.1 Discovery

APPLICATION: The following conditions apply when a service is made available over TCP or UDP.

A service SHOULD be discoverable using mDNS/DNS-SD. (SHOULD 1.1)

IF a service is discoverable using mDNS/DNS-SD it MUST use a service name made up of: A “_signalk”; B followed by the intermediate transport, if any, e.g. -mqtt; C followed by “._tcp” or "._udp". (MUST 1.2)

e.g.

_signalk._tcp
_signalk-mqtt._tcp
_signalk-serial._tcp
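As a sketch, MUST 1.2's naming rule can be expressed as a small helper (the function name is ours, not the spec's):

```python
def service_name(transport=None, proto="tcp"):
    """Compose a DNS-SD service name per MUST 1.2:
    "_signalk", an optional "-<transport>", then "._tcp" or "._udp"."""
    if proto not in ("tcp", "udp"):
        raise ValueError("proto must be 'tcp' or 'udp'")
    middle = "-" + transport if transport else ""
    return "_signalk" + middle + "._" + proto

print(service_name())          # _signalk._tcp
print(service_name("mqtt"))    # _signalk-mqtt._tcp
print(service_name("serial"))  # _signalk-serial._tcp
```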

1.2 Default ports

APPLICATION: The following conditions apply when a service is made available over IP.

Unless otherwise specified by an intermediate transport, services SHOULD be made available on port 8375. (SHOULD 1.2.1)

2. Data integrity

2.1 All services

APPLICATION: The following conditions apply to all services except -serial services.

Services MUST be provided over a transport that guarantees data integrity. (MUST 2.1.1)

Network nodes that interact with a service CAN assume that data integrity has been maintained by the underlying transport. (CAN 2.1.2)

2.2 -serial services

APPLICATION: The following conditions apply to -serial services.

Each message MUST be sent without any extraneous white space. (MUST 2.2.1)

A provider MUST follow each message with the decimal string of a byte-wise XOR checksum. (MUST 2.2.2)

e.g.

{ … object … }234\r\n

Each message MUST be no longer than 251 bytes long (256 bytes with checksum and delimiter). (MUST 2.2.3)

The consumer CAN NOT assume that data integrity has been maintained. (CAN NOT 2.2.4)

Before the object is processed the consumer SHOULD perform a byte-wise XOR of the object, including the curly brackets, and confirm it against the checksum. (SHOULD 2.2.5)
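A sketch of the -serial framing rules above (MUST 2.2.1-2.2.3, SHOULD 2.2.5). The spec does not say how the consumer separates the checksum digits from the object, so splitting after the final closing brace is an assumption here:

```python
def xor_checksum(payload):
    """Byte-wise XOR of all bytes in the payload (MUST 2.2.2)."""
    c = 0
    for b in payload:
        c ^= b
    return c

def frame(message):
    """Frame a JSON message: object + decimal checksum string + \\r\\n."""
    payload = message.encode("utf-8")
    if len(payload) > 251:
        raise ValueError("message exceeds 251 bytes (MUST 2.2.3)")
    return payload + str(xor_checksum(payload)).encode("ascii") + b"\r\n"

def verify(framed):
    """Consumer side: re-compute the checksum and return the object (SHOULD 2.2.5)."""
    body = framed.rstrip(b"\r\n")
    end = body.rindex(b"}") + 1   # assumption: checksum digits follow the last brace
    payload, check = body[:end], int(body[end:])
    if xor_checksum(payload) != check:
        raise ValueError("checksum mismatch")
    return payload.decode("utf-8")

msg = '{"updates":[]}'
assert verify(frame(msg)) == msg
```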

2.3 Encoding

APPLICATION: The following conditions apply to all objects.

Objects MUST be sent as valid JSON. (MUST 2.3.1)

Objects MUST be UTF-8 encoded. (MUST 2.3.2)

Messages and hello objects MUST be followed by a carriage return + new line ( char(13) + char(10), \r\n ). (MUST 2.3.3)

3. Hello

APPLICATION: The following conditions apply when a duplex connection is capable of sending JSON objects in both directions as part of a single session.

The provider MUST wait until it receives a hello object from the consumer before it sends any messages. (MUST 3.1)

Hello objects MUST conform to the signalk_hello.json schema. (MUST 3.2)

IF a consumer joins a stream mid-flow it MAY send a hello object to the provider. (MAY 3.3)

IF a provider cannot meet the version requirement in the hello it MUST close the connection. (MUST 3.4)
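A sketch of the provider-side logic for MUST 3.1 and MUST 3.4: don't send until a hello arrives, and close if the requested version can't be met. The set of supported versions and the hello's field name are assumptions; the normative shape lives in signalk_hello.json:

```python
SUPPORTED_VERSIONS = {"1.0", "1.1"}  # assumption: what this provider implements

def on_hello(hello):
    """Return True to start streaming, False to close the connection (MUST 3.4)."""
    return hello.get("version") in SUPPORTED_VERSIONS

assert on_hello({"version": "1.1"}) is True
assert on_hello({"version": "2.0"}) is False
```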

4. Source

deleted

5. Subscription

APPLICATION: The following conditions apply to services where there is a need to manage the range of keys for which values are communicated.

IF there is a need to restrict the messages a consumer receives, MQTT SHOULD be used. (SHOULD 5.1)

6. Messages

APPLICATION: The following conditions apply when messages are sent between nodes.

Messages MUST be sent between nodes in a delta object that complies with the signalk_delta.json schema. (MUST 6.1)

The "type" property of the source object MUST have the string value "signalk-delta". (MUST 6.2)

The "device" property SHOULD have the device id as its string value. (SHOULD 6.3)

7. Intermediate transports

7.1 Serial connections (UART, RS232, RS422)

APPLICATION: The following conditions apply when messages are sent over a UART, TTL, RS232 or RS422 connection.

Serial connections SHOULD be at 115200 BAUD, 8 bits, 1 stop bit, no parity. (SHOULD 7.1.1)

The services available on a serial connection MUST be _serial services. (MUST 7.1.2)

7.2 MQTT

APPLICATION: The following conditions apply when messages are sent using MQTT.

The MQTT broker MUST announce itself in a "master" or "slave" role. (MUST 7.2.1)

The device id of the provider MUST be sent as the MQTT client id. (MUST 7.2.2)

Delta objects MUST be published with the following topic: A “signalk/vX/stream/”, where X is the version of Signal K; B followed by the context if it is known, or "vessels/self" if it is not. (MUST 7.2.4)

e.g.

signalk/v1/stream/vessels/self
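The topic rule in MUST 7.2.4 reduces to a one-liner; here the context is assumed to already be in slash-separated topic form, as in the example:

```python
def mqtt_topic(version, context=None):
    """Build the publish topic: signalk/vX/stream/ + context, defaulting
    to vessels/self when the context is unknown (MUST 7.2.4)."""
    return "signalk/v%s/stream/%s" % (version, context or "vessels/self")

assert mqtt_topic(1) == "signalk/v1/stream/vessels/self"
assert mqtt_topic(1, "vessels/123456789") == "signalk/v1/stream/vessels/123456789"
```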

7.3 Web Sockets

APPLICATION: The following conditions apply when services are made available over a Web Socket.

Web Socket services MUST be made available via URIs with the following format: "ws://hostname/signalk/vX/stream/" where X is the version of Signal K (MUST 7.3.2)

The Web Socket URI MAY include a query giving the initial subscription status for the connection as follows (MAY 7.3.6):

?subscription=all - all the values available from the server are sent when the connection is made
?subscription=none - no values are sent when the connection is made
?subscription=self - only values for vessels.self are sent when the connection is made

Additional values MAY be requested by the client by sending the server a subscription object. (MAY 7.3.7)

Subscription objects MUST comply with the signal_subscribe.json schema. (MUST 7.3.8)

If the value of a key is available, a server MUST send updates for that key if it has received the key in a subscription object, unless it has subsequently received the key in an unsubscribe object. (MUST 7.3.9)

Updates MUST be sent by the server at the frequency specified by the minPeriod and policy properties in the subscription object. (MUST 7.3.10)

A client MAY ask the server to stop sending values by sending an unsubscribe object. (MAY 7.3.11)

Unsubscribe objects MUST comply with the signal_unsubscribe.json schema. (MUST 7.3.12)

A server MUST NOT send updates for the keys it has received in an unsubscribe object unless it subsequently receives the same key in a subscription. (MUST 7.3.13)
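The subscribe/unsubscribe rules in MAY 7.3.6 through MUST 7.3.13 amount to a set of keys the server maintains per connection. A sketch (wildcards, minPeriod and policy are deliberately left out):

```python
class SubscriptionState:
    """Per-connection bookkeeping for MUST 7.3.9 and MUST 7.3.13."""

    def __init__(self, initial=()):
        # initial set comes from the ?subscription= query (MAY 7.3.6)
        self.keys = set(initial)

    def subscribe(self, paths):      # subscription object received (MAY 7.3.7)
        self.keys.update(paths)

    def unsubscribe(self, paths):    # unsubscribe object received (MAY 7.3.11)
        self.keys.difference_update(paths)

    def should_send(self, path):
        return path in self.keys

state = SubscriptionState()
state.subscribe(["navigation.speedOverGround"])
assert state.should_send("navigation.speedOverGround")
state.unsubscribe(["navigation.speedOverGround"])
assert not state.should_send("navigation.speedOverGround")
```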

7.4 UDP

APPLICATION: The following conditions apply when messages are sent over UDP.

Each message MUST be sent in a single UDP packet. (MUST 7.4.1)

pod909 commented 7 years ago

signalk_stream_hello.json

OPTION 1

{
    "context":"manifest.achme-thing-12345",
    "version": "v1.1.2",
    "timestamp":"00/00/00T00:01:23",
    "sends": [
      "navigation.location.latitude",
      "navigation.location.longitude"
    ],
    "publishes": [
      "navigation.location.courseOverGround",
      "navigation.location.speedOverGround"
    ]
}

OPTION 2

{
  "achme-thing-12345": {
    "version": "v1.1.2",
    "timestamp":"00/00/00T00:01:23",
    "sends": [
      "navigation.location.latitude",
      "navigation.location.longitude"
    ],
    "publishes": [
      "navigation.location.courseOverGround",
      "navigation.location.speedOverGround"
    ]
  }
}
timmathews commented 7 years ago

I don't like the way this is going at all. Most of the options that have been broken out into separate services should be negotiated at connection time.

Frankly, I don't care at all about Signal K over serial connections. I can't think of a single good reason to do it, so I'm not going to comment on the checksumming or line endings to separate messages. That said, there is no reason to specify a method of sending Signal K data over an Ethernet connection (TCP or otherwise) which requires explicitly stripping whitespace, applying a checksum or adding a carriage return and linefeed. If the argument is that there are devices which send NMEA 0183 data that way today; well, that's nice but there aren't any Signal K devices which do and let's keep it that way.

Sections 1 - 4

As for all of the different ports proposed, we cannot do that. At some point, we will request a service name and port assignment from IANA for Signal K. There is no way that IANA will give us more than one as they expect protocols which have different encoding techniques (e.g. plaintext and TLS encrypted) to handle both forms on the same port via negotiation. RFC 6335 § 7.4 There are currently a lot of examples to the contrary (HTTP comes to mind), and there is some more flexibility in the "user ports" section that we'd be requesting from, but nonetheless we should strive for a single port. Therefore any recommendation for handling different content or transport types must include some form of negotiation. Simply put, regardless of whether a device is a consumer, producer or publisher (more on that in a moment), if they are listening for a connection, they're a server and should all listen on the same port. Someone suggested 8375 as the standard port for Signal K since it's the decimal ASCII codes for SK. I think that's a good idea and we should probably submit the request for it to IANA. But I want to have this hashed out first.

Obviously, if Signal K is being transmitted over WebSockets, then the recommended ports should be 80/443, and if it's over MQTT then 1883/8883. For raw Signal K we would use 8375 for both TCP and UDP, and regardless of whether or not the device expects clients to push data to it, read data from it or provide it with a list of subscriptions that it wishes to receive.

Also, by doing this we eliminate the difference between producers and publishers which is arbitrary and confusing anyway. When connecting to a Signal K server via WebSockets, you have three options available to you:

?subscribe=all, ?subscribe=none or ?subscribe=self; the default is ?subscribe=self. This is all documented in Streaming WebSocket API. Raw TCP transports should follow the same semantics. Of course, they don't have the luxury of HTTP as the negotiation mechanism, so I propose that upon connection to one of these services, the client sends the following as a subscription message:

{
  // The highest version of Signal K that the client can support
  "version": "1.2",
  // An optional name for registration of the client with the server
  "name": "Nav Station iPad",
  // A unique ID for the device. This should be programmed at time of manufacture
  // Or for user-created devices or pure software products, as part of the setup process
  // A v4 UUID
  "id": "d6912027-ffc7-4a9f-abf5-091eed3ce0ee",
  // If the client pushes data to the server, a list of Signal K keys which it provides. This list
  // could be quite large, so wildcards are allowed. If the device is another Signal K server
  // or a gateway it might send ["*"].
  "provides": [],
  // A list of Signal K keys the client wishes to subscribe to. Wildcards are allowed. See the
  // documentation on subscriptions
  "subscribe": [],
  // A context which can be used for "provides" and "subscribe". Optional and assumed to be
  // vessels.self if missing
  "context": "vessels.self.environment",
  // Specifies how the client intends to communicate with the server. Options are "delta",
  // "sparse" or "full". Only very simple clients strictly sending data to the server should use
  // "full"
  "updateFormat": "delta"
}

This is essentially the client version of the already established server "hello" message.

The server should respond with the standard "hello" message:

{
  // The highest version of Signal K that it can provide which meets the client's request
  "version": "1.1",
  // Time the message was sent. Note this is not good enough to synchronize the client's time
  // to the server
  "timestamp": "2016-11-27T22:19:30.112Z",
  "self": "urn:mrn:signalk:uuid:c0d79335-4e25-4245-8892-54e8ccc8021d"
}

This handles all three possible cases: client strictly sends data to the server, client strictly receives data from the server and client both sends and receives data to and from the server. It also handles format negotiation, however it does expect that the Signal K server implements support for delta, sparse and full formats.
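The version negotiation described above could work roughly as follows; comparing dotted version strings numerically is an assumption, as the comments only say "highest version":

```python
def negotiate(client_max, server_versions):
    """Pick the highest server version not exceeding the client's maximum.
    Returns None when no common version exists (the server would then
    refuse or close the connection)."""
    as_tuple = lambda v: tuple(int(p) for p in v.split("."))
    candidates = [v for v in server_versions if as_tuple(v) <= as_tuple(client_max)]
    return max(candidates, key=as_tuple) if candidates else None

assert negotiate("1.2", ["1.0", "1.1"]) == "1.1"
assert negotiate("1.0", ["1.1", "1.2"]) is None
```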

Section 5

I must confess I don't understand §5 at all. What are the JSON files being referred to? Do you mean delta.json? What is signalk_stream_value.json?

MUST 5.3 IF the key CAN NOT be sent using an intermediate transport each value object in a message MUST contain a path property

What does this mean?

Section 6

I think it is sufficient to state that timestamps must be in ISO 8601 format pursuant to RFC 3339 as stated elsewhere in the Signal K specification. It would be nice if all the devices always sent out a UTC timestamp, but as long as they include their offset, then that should be sufficient (converting that to UTC is trivial).

I'm not sure where you're going with §6.3, but if it implies that we use the timestamp received from the server to synchronize the clock on the client, that won't do. I agree that all the devices on a Signal K network should have their clocks synchronized, but this isn't the place for it. NTP is not super difficult to implement client side and should be our recommended time synchronization strategy.

Section 7

The MQTT section seems pretty solid, except for sending data when the context is unknown. Shouldn't self be assumed if the context is unknown?

tkurki commented 7 years ago

I've been in favor of a \r\n message delimiter to ease parsing and handling. I believe there is more value in specifying that than allowing just a stream of arbitrarily formatted JSON messages.

I believe with that we end up with less obscure bugs and easier adoption.

pod909 commented 7 years ago

Thanks for the detailed response @timmathews

A lot of the thinking here is based on support for low power, low capability devices and devices where it's imperative to add the absolute minimum cost in terms of viability. For most use cases in this space you are looking for:

It makes a good node->browser or app stream but at the device level Web Socket is a complete non-starter. It's heavy and has zero support in the stacks supplied by IoT manufacturers.

The lowest common denominator for all hardware is a serial connection. The most you can rely on having is plain TCP. mDNS is becoming more common. More expensive modules will also provide HTTP, but an extra tenner is A LOT of cost to add. Then add the cost of a WS stack on top. You might as well just implement n2k.

1/ Negotiation requires 2-way communication, which adds complexity and therefore cost. There needs to be a way that a sensor provider can find a consumer without requiring mDNS. Equally there needs to be a way for consumer "SK Servers" to find a sensor's stream without the sensor having to respond to mDNS. The current doco uses recommended port numbers so I copied that approach. I actually think recommended IP address ranges might be needed as well.

I'd prefer the message type to be sent in the hello but there is some resistance to sending a hello in all cases - hence the explicit list of when a hello is necessary.

2/ Sumps has made us aware that, if you support serial (and we must), then there is a need to allow for the serial data to be passed over TCP without any error checking in the gateway. My reaction was a bit like yours but, acknowledging the requirement, support for it needs 2 different kinds of service defined: _stream, which relies on the data integrity of TCP and does not require white space to be stripped, \r\n delimiters or checksums; and _serial, which takes the stream exactly as presented and sends it over TCP. Because the original serial feed may be corrupt the delimiter and checksum are required. Because "TCP" does not automatically mean "unlimited bandwidth", managing white space is also a good thing.

... at least then a Server knows it has to do additional error checking or can just say "sod off"

3/ Happy to go with those ports

4/ Agree the difference between producer and publisher is arbitrary. It was just a way of saying there are some additional things a publisher has to do (be able to receive subscription requests, for instance) that a plain provider does not. There needs to be a mechanism for how this would work on a plain TCP socket. Preventing the flooding of a server via a plain TCP connection is still an important requirement but those features should not apply when flooding is not a risk (values for 4 or fewer keys is a suggestion). Requiring hellos, handshaking and subscriptions for serial, for instance, is a hiding to nothing.

5/ Happy with your suggestion for the hello content but you're looking for bi-directional communication. The only thing that needs to send a hello is the node that will provide messages. There's no need for the consumer to tell the provider anything (subscription requests excepted).

I'm not sure how what you propose works for sensors that act as servers, waiting for an SK Server to connect to them as a client and pull messages from them.

A client having to connect, a hello having to be produced in all cases (not backwards compatible with the current flow), the server reading the hello, deciding it doesn't like the message format and closing: that seems like quite a lot to impose vs including it in the mDNS service description so that none of that has to be gone through.

6/ Value is the same as sparse with a full path as context. My only thought is that allowing fragments of the SK tree to be passed, rather than just leaves, adds substantially to the processing power required. I understand from TK that the delta was invented for that reason. A stream of values is quite close to the delta but gets rid of the extra (and unnecessary) processing involved with the wrapper. Very useful when it comes to MQTT, where it would be good to just update the topic and republish the value without any change at all. There's no real need for wrapping up values in sparse JSON. You just subscribe to the topic with a wild card.

Weight-wise, individual values vs the extra wrapping in the delta pretty much even out.

I'm keen on sending values for these reasons but I understand that delta is around and supported. That's just the breaks.

7/ If you are sending individual values, or a delta with only one value, over something like MQTT then the key is carried in the topic. There is no need to include a "path" property in each value object.

Plain TCP and WS are incapable of identifying the key outside of the message, therefore each value object would have to contain a "path". This is how the delta works; the only extra detail is that this may not be needed in all cases.

8/ NTP isn't supported in most stacks. If it's not super easy the answer will be "forget about Signal K". Many onboard networks will not have a NTP server. Having RTC on board is extra hardware and requires a battery, which adds A LOT of hassle in terms of certification and customs clearance + creates an artificial life for the product... typically 4-5 years max unless you over spec the battery.

For sure UTC is the standard but there needs to be a mechanism that allows time synchronisation across nodes that do not know about UTC and puts the correction against UTC on the node that knows about UTC, rather than relying on it being passed back and forward.

pod909 commented 7 years ago

Feedback from Tim and Teppo incorporated

timmathews commented 7 years ago

This is looking better.

Unfortunately, there is at least one company interested in producing USB-based Signal K sensors, so clearly we need to provide them with a solution for handling it (grumble, grumble).

In addition to providing a checksum method and an end of message marker, we also need to specify a maximum message length, which all devices implementing Signal K must be able to consume. If we don't do that, we will see strange interoperability bugs where device X sends a message to device Y which silently rejects it because the message was too long for device Y's chosen input buffer length. Or worse, device Y crashes.

I'd like to propose a more robust message format (similar to how Actisense wraps n2k bytes coming out of the NGT-1):

<STX><h><h><h><US><payload><US><c><c><ETX>

where

<STX>     = Start of Text (0x02) to indicate a new message
<h><h><h> = 3 hexadecimal characters, most significant nybble first indicating payload length (up
            to 4,095 bytes)
<US>      = Unit Separator (0x1F) to indicate end of payload
<payload> = Signal K message in JSON format
<US>      = Unit Separator (0x1F) to indicate end of payload
<c><c>    = 2 hexadecimal characters, most significant nybble first indicating checksum of
            payload
<ETX>     = End of Text (0x03) to indicate end of message

This only adds 5 bytes to the current proposal, allows quicker synchronization for listening devices, is explicit about number of bytes being transmitted and uses ASCII control characters for their intended purposes.

That's my $0.02 about Signal K over serial links, and here's hoping I never have to actually use it.

timmathews commented 7 years ago

I still firmly believe that there is no reason to provide two different ways to send Signal K over TCP in addition to WebSockets (which should only be used to communicate to browsers) and MQTT (or other messaging protocol). And since the start and end flags, message length, checksum and separators are completely redundant when using TCP, we shouldn't include them. If you want to send Signal K which was received from a serial device over TCP, strip that stuff off. It's really quite simple.

There are a few reasons that I suggest a hello message from the client to the server be required when the connection is established:

  1. It allows the client and server to agree on a version of Signal K at the start of communication. Then the version doesn't need to be included in every message.
  2. It allows the client to provide a unique ID to the server.
  3. It allows the client to subscribe to data.
  4. It allows the client to inform the server of what it is capable of sending.
  5. If you're debugging a Signal K endpoint via telnet, it's really inconvenient for it to start spewing data at you as soon as you connect. And if we don't believe that to be a valid use case, then why are we using a human-readable message format in the first place?

I'm not sure how what you propose works for sensors that act as servers, waiting for an SK Server to connect to them as a client and pull messages from them.

I don't think I understand this. If a sensor is a Signal K server, then it has to provide the semantics of a Signal K server and be able to push data. Pulling - to me - implies some kind of HTTP-esque GET request. And if that's the case, use HTTP.

I'd like to see an example of - or the schema for - signalk_stream_value.json. I don't understand your explanation of it and how it differs from the existing sparse format.

8/ NTP isn't supported in most stacks. If it's not super easy the answer will be "forget about Signal K". Many onboard networks will not have a NTP server. Having RTC on board is extra hardware and requires a battery, which adds A LOT of hassle in terms of certification and customs clearance + creates an artificial life for the product... typically 4-5 years max unless you over spec the battery.

As far as I'm concerned, if the device cannot (or the manufacturer will not) add an NTP client, then it must not generate a timestamp and the timestamp for the data must come from the client receiving the data.

At no point does supporting NTP require a battery backed RTC. In fact, most devices won't even need the RTC itself; periodic NTP queries and a local stable oscillator are good enough. Somewhere on the boat, we can be pretty confident there is a GPS. From that, the Signal K server (which is also an NTP server) can get the correct time and become a stratum 2 NTP server. Having the GPS hardwired with access to the PPS signal would be nice, but not necessary. Worst case is a cold boot of the GPS system and no accurate time for a while.

For sure UTC is the standard but there needs to be a mechanism that allows time synchronisation across nodes that do not know about UTC and puts the correction against UTC on the node that knows about UTC, rather than relying on it being passed back and forward.

We require that timestamps are in ISO 8601 format. There is no version of that format where time can be included without a timezone. There is no requirement to send timestamps in UTC, offsets are perfectly fine. Simply put, a device that knows the local time, but doesn't know its timezone offset, CANNOT generate a valid timestamp.


Bottom line we can either build a 21st century, Ethernet-centric system with all that entails; or we can build a new version of NMEA 0183 with JSON and waste a lot of bytes in the process. I do not prefer the second option.

pod909 commented 7 years ago

My take is that the job isn't to dictate. At this stage an approach that facilitates, sees where things go and brings people to a consensus when needed may better serve adoption. That's how SK has been successful to date.

Totally agree that Ethernet and bi-directional comms should be the standard aspired to, but when you're talking sensor hardware every single 1c, mAh and 1h spent on development and testing counts. In a low volume industry like marine, even more so. Anyone asking for NTP, WS, Ethernet, etc. can not be surprised to end up paying $$$$$$ for a basic GPS sensor. That's entirely the problem with n2k that SK offers a potential way out of.

Nothing that's suggested for plain TCP contradicts that. The proposal looks at the artifacts in SK and then uses them, adding the absolute minimum required.

Could we do serial better with more framing? Sure. Should we? Keeping it as close to the core SK objects as we can would be my thinking.

... There are a couple of problems with the WS approach to date, where there has been a reach for custom solutions.

Subscription is one. For instance there's no real need for subscription: individual values could be "subscribed" to using the URI, e.g. ws://host/signalk/v1/stream/vessels/self/navigation/position.

Would that result in multiple sockets? It would. I'd argue that's an inherent limitation of WS and MQTT (including MQTT over WS!) would be a better choice for those situations. Instead of adding to SK show how MQTT should be used.
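Mapping the per-key URI suggested above onto a Signal K context and path could look like this; splitting the context off as the first two segments is an assumption:

```python
from urllib.parse import urlparse

def uri_to_subscription(uri):
    """Turn ws://host/signalk/v1/stream/<context...>/<path...> into
    (context, path) in dotted Signal K form."""
    parts = urlparse(uri).path.strip("/").split("/")
    if parts[:3] != ["signalk", "v1", "stream"]:
        raise ValueError("not a v1 stream URI")
    segments = parts[3:]
    return ".".join(segments[:2]), ".".join(segments[2:])

assert uri_to_subscription(
    "ws://host/signalk/v1/stream/vessels/self/navigation/position"
) == ("vessels.self", "navigation.position")
```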

A hello has been added to carry the SK version when the version is already present in the URI the client requested.

Here's a question: a sensor finds 2 servers, they send 2 hellos with 2 different vessel contexts. What context should the sensor include in the delta?

I'm going to remove the subscription from the RFC and say "if flooding is an issue don't use plain TCP or WS. Use MQTT instead" and see what that looks like.

...

NTP is extra work. Extra processing. Extra costs. It can not be mandated.

Don't mind removing the non-synchronized time stamp so long as there is some way for the sensor to indicate the age of the value as it is transmitted. For instance the nature of the hardware in a paddle wheel transducer or MHU means that the value is already 0.5 - 1.25 seconds old by the time it's transmitted. Knowing that kind of detail would make a fundamental difference to, for instance, calculating true wind.

Something like:

"T2.5S": {
  "mean": 13.245
}

rather than just:


{
  "value": 13.245
}

(see TK's issue on time bases)

pod909 commented 7 years ago

value would just be the value object from sparse with no additional wrapping.

{
  "value": 1.345
}

and not:

{
  "vessels": {
     "123445": {
        "navigation": {
                "courseOverGround": {
                "value": 1.345
}}}}}

I want to put forward the idea that atomic values should be sent with none of the aggregation "sparse" may imply.

pod909 commented 7 years ago

JSON-wise: the value and delta are already defined; the hello should be whatever emerges from the discussion on devices, i.e. it should be a device object that can be inserted directly into the "manifest"; and subscription has been removed.

sumps commented 7 years ago

As a general comment to these discussions and others I am seeing, I don't think we should mandate a method of doing things in SK, based on one particular application - in this case instrument readings for high performance racing yachts.

Make it possible for this application to be implemented, i.e. have time stamps and a method for devices to sync time (not sure hello is best for this), but don't mandate it or make it the preferred method. This just complicates things for the 80% of normal applications with less critical requirements.

In reality, for pod909's application it just needs the transducer and server to be in sync to remove any latency, which I would not actually expect to be significant, assuming the right transport mechanism was used. Therefore talk of UTC, local time offsets and NTP is not really applicable for this application.

So in summary, my advice would be, make things as simple and easy as possible for 80% of applications and support the more complex applications but place the additional implementation effort with these "niche" applications.

pod909 commented 7 years ago

In fairness, nothing was mandated. It was given as an option. Either way, I've taken it out on the grounds that there isn't agreement.

pod909 commented 7 years ago

Bringing together the current thinking on the hello we get:

{
  // a unique id for the device given to it by the manufacturer
  "achme-thing-12345": {
    // link to a schema that defines a default set-up for the device in manifest
    "$definition": "http://www.achme.com/devices/thing/6.3/thing6_3def.json",
    // link to a schema that defines the data the device provides: may be _delta or _values to signify that a standard SK format is used
    "$schema": "http://www.achme.com/devices/thing/6.3/thing6_3data.json",
    // optional array listing the keys that the device provides in a standard format
    "provides": [
      ... key list with wild cards ...
    ],
    // the context for the messages sent by the device, if known
    "context": "vessels.self"
  }
}
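A minimal sanity check of this hello shape could look like the following sketch. The field names follow the example above; which fields are actually mandatory is still under discussion, so treat the rules here as assumptions:

```python
# Sketch: minimal validation of the proposed hello shape.
# Which fields are required is an open question in this RFC discussion.
def check_hello(hello):
    """Return a list of problems found in a hello message (empty if OK)."""
    problems = []
    for device_id, desc in hello.items():
        if "$schema" not in desc:
            problems.append(f"{device_id}: missing $schema")
        if "provides" in desc and not isinstance(desc["provides"], list):
            problems.append(f"{device_id}: provides must be a list")
    return problems

print(check_hello({"achme-thing-12345": {"$schema": "_delta", "provides": []}}))
# []
```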
timmathews commented 7 years ago

Yes, yes, yes (with regards to latest by @pod909)!

Provided of course that the URLs in $definition and $schema are accessible without internet access (i.e. hosted by the device). Actually, they could even be on the internet for very simple devices, but the setup procedure would then potentially require internet access, and that complication may outweigh the benefit of having the cheapest possible MCU in the device.

If we do this, then there really isn't any need to have a super-prescriptive, all-encompassing schema that has to be maintained across a billion devices, and we can solve the issue of compatibility once and for all. In fact, we no longer need to be concerned with what version of Signal K is supported by each device (or rather, core schema changes are no longer breaking changes).

I actually started writing up something along these lines last night because I was feeling frustrated that we were losing sight of a core goal of Signal K - universal plug-n-play compatibility, but in reality it looks like @pod909 and I just were not quite on the same wavelength.

BTW, ISO 8601 specifies an interval format, so a 0.5 second offset would be "timestamp": "PT0.5S". I really should have said that a device which doesn't know its own timezone cannot generate a valid date-time timestamp. It absolutely can generate an interval timestamp.
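As an illustration of that interval idea, formatting a measurement age as an ISO 8601 duration might look like the sketch below (note that ISO 8601 durations require the "T" separator before time components, so 0.5 s is "PT0.5S"):

```python
# Sketch: format a sensor's measurement age (seconds) as an ISO 8601
# duration. Only handles sub-minute offsets, which is enough for this use.
def age_to_iso8601(seconds):
    if seconds < 0:
        raise ValueError("age cannot be negative")
    # :g drops trailing zeros, so 0.500 -> "0.5"
    return f"PT{seconds:g}S"

print(age_to_iso8601(0.5))  # PT0.5S
print(age_to_iso8601(2))    # PT2S
```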

Do any existing instruments convey their measurement delay to their respective processors or are the delays just generally known and automatically accounted for? There's no such field in the n2k PGN, so you make me concerned that the reading I'm getting from my WSO100 is out of date when I get it. I totally see the value in it (and it could be a nice argument in favor of Signal K if we provide such a thing), I'm just curious if it exists in the market today.

timmathews commented 7 years ago

One other thing. TCP/IP, and protocols built on top of it, are probably not the right solution for something as sensitive to timing as a high-performance sailing instrument system. The delivery guarantees made by TCP are at odds with the goal of getting the data from a sensor as quickly and with the lowest latency possible. Even n2k is not a good solution because of its shared-bus nature.

If low latency is your thing, aren't you better off connecting all your paddlewheels, masthead units, etc. directly to the processor (via 20mA current loops or whatever) and having it output Signal K for consumption by displays?

sumps commented 7 years ago

Not sure I can allow a 20mA current loop in my 21st century electronics system 😉

Seriously though I need some more info (example) on this latest Hello definition from @pod909 before I can comment as I am struggling to get my head around how this would work in practice.

pod909 commented 7 years ago

Ah lads, having removed the f'ing thing you're now backtracking on me!!! True wind and auto helm (this is mostly about performance cruising) calculations are only as good as the slowest signal. Wind and speed are slow by the nature of the hardware used to measure them. @sumps is right in that you sync the sensor and processor ... and the sensor readings with each other ... by having an intimate knowledge of how the signal is being generated and the filtering that is being applied. It would be nice if that knowledge was imparted by the manufacturer along with the signal.

That said, let's park this for another day and get this RFC over the line.

pod909 commented 7 years ago

The description and schema basically form the device driver for the sensor. No doubt it should be possible to cache them locally and include them in a distro. If they are not in the distro then they would need to be downloaded or manually copied into the cache.

To avoid disappointment, as well as standard schema, we should probably have a default description.

pod909 commented 7 years ago

Step by step

1/ The SK Server (a consumer) and a sensor (a provider) connect

2/ The sensor sends the hello message. This is inserted into the manifest, and the follow-on readings are recorded in it straight away.

3/ The SK Server uses the "required" parts of the description schema to flesh out the object for the sensor in the manifest.

That might include a default context relationship such as "vessels/self" ... with RFC0001+6 that could be "vessels/self/equipment/unknownTypeOfThing-12345"

4/ The server continues taking readings from the sensor, recording them (or the latest) in the manifest and propagating them to the default context(s). I've called that "$schema" so as to make things easier for RFC0006 to follow on, but for now there need to be delta and value defaults.

5/ The installer goes to the SK Server and sees that there is now unknown equipment. They change its context (name) and context mapping in the manifest _meta so that it makes more sense to them in terms of their own boat.

6/ That's it
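Those steps could be sketched roughly as below. The manifest layout and field names are assumptions for illustration only; the RFC draft doesn't define them:

```python
# Illustrative sketch of steps 2-4: hello inserted into the manifest,
# a default context filled in, and readings recorded against the device.
# The manifest structure here is an assumption, not defined by the RFC.
manifest = {}

def on_hello(hello):
    """Step 2: insert the device's hello into the manifest."""
    for device_id, desc in hello.items():
        entry = dict(desc)
        # Step 3: fall back to a default context if the device gave none.
        entry.setdefault("context", "vessels.self")
        entry["values"] = {}
        manifest[device_id] = entry

def on_reading(device_id, path, value):
    """Step 4: record the latest reading for the device."""
    manifest[device_id]["values"][path] = value

on_hello({"achme-thing-12345": {"$schema": "_delta",
                                "provides": ["environment.depth.*"]}})
on_reading("achme-thing-12345", "environment.depth.belowTransducer", 3.2)
```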

jboynes commented 7 years ago

@pod909 With you on the first 4 steps. I'm not quite clear on what you mean by "delta and value defaults" - isn't it enough just to record the value received (skipping over the history/time series issue for now)?

For 5), I don't think we should change the name but instead attach a human readable label to it. So from a technical perspective, the systems continue to identify it as "unknownTypeOfThing-12345" but the label lets the user know it's "Master Cabin Temperature Sensor" or whatever.

Some of the mappings will be identity operations, basically just changing context, but others will be more complex calculations e.g. applying a calibration curve, converting magnetic to true based on location, etc. The mapping would need to include which calculation to apply (or in more sophisticated servers allow user-supplied calculations).

I don't think we are quite done at 6). We still need to define a way for UIs to discover the data that's available and display it. That's going to require a change to the REST interface as clients will no longer be able to rely on well-known keys or URI paths. We would want to allow something like:

a) User adds voltmeter widget to instrument panel
b) Client queries REST API for voltages that are available
c) Client presents list to user using human-friendly labels
d) User selects which one to display
e) Client saves widget configuration using value's id

When displaying the panel, client polls for or subscribes to value updates based on the id in widget config.
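Step (b) of that flow might look like the following sketch, assuming a hypothetical discovery response keyed by value id (the response shape and field names are illustrative, not defined anywhere yet):

```python
# Sketch: filter a (hypothetical) discovery response for voltage paths
# and pair each value id with its human-friendly label.
def find_voltages(available):
    """available: mapping of value id -> metadata with 'path' and 'label'."""
    return {vid: meta.get("label", vid)
            for vid, meta in available.items()
            if meta["path"].endswith(".voltage")}

response = {
    "sensor-9876": {"path": "electrical.batteries.house.voltage",
                    "label": "House Bank Voltage"},
    "sensor-1234": {"path": "environment.water.temperature",
                    "label": "Sea Temp"},
}
print(find_voltages(response))  # {'sensor-9876': 'House Bank Voltage'}
```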

Then we're done :)

tkurki commented 7 years ago

I don't see this converging any time soon.

So as part of "Streaming values over a range of transports" you are redefining subscriptions as well - and pretty much everything else?

@pod909 the subscription protocol is not just by-path subscriptions; it allows throttling - a real use case for remote access over a data plan.

I see several extensions to the current subscriptions: source handling, aggregation, composing subscriptions (send alerts right away, aggregate everything else). None of this will fit into the path nicely. As I've said before: a single path-based model will force you to cram everything into the path and you'll be in trouble.

Why on earth would I open a multitude of connections to achieve less?

tkurki commented 7 years ago

there really isn't any need to have a super-prescriptive all-encompassing schema

The point of the schema is that it allows a shared semantic model. If the sensors can provide their own definitions of what they provide, something further along the processing chain will not know what that value means.

Let's use the lateral leeway measurement as an example. If there is no shared schema how the heck does my fancy visualisation widget, crafted for this very data, know that your sensor is outputting that, if the vocabulary you are using to describe what the data actually means is different from mine? No amount of self describing zero configuration schema magic will solve that.

Same goes for non trivial temperatures. My engine diagnostics needs exhaust temperature. How am I to know that your temperature is exhaust temperature? How is the suggested mechanism better than simply configuring that the reading is exhaustTemperature for engine 2?

Somewhere along the line from the sensing to where the data is used something needs to be configured to tell what the data is about. There is no zero configuration. So wouldn't it be much simpler to either

That is all you have to do with the current schema. All my little program that reads an ADC via I2C (essentially a dumb sensor, could be on an ESP8266, discovering the SK server and starting to push minimal deltas over TCP) has to know is that it is outputting electrical.batteries.house.voltage and electrical.batteries.house.current. No extra schemas, no metadata. Simple to adopt, simple to understand, simple to debug.
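Such a dumb sensor's output could be sketched as below. The delta shape follows the current streaming format; the source label is an assumption for illustration:

```python
import json

# Sketch: the minimal delta a dumb ADC sensor would emit over TCP.
# The "adc-i2c" source label is illustrative, not a defined value.
def make_delta(voltage, current):
    return json.dumps({
        "context": "vessels.self",
        "updates": [{
            "source": {"label": "adc-i2c"},
            "values": [
                {"path": "electrical.batteries.house.voltage", "value": voltage},
                {"path": "electrical.batteries.house.current", "value": current},
            ],
        }],
    })

# One newline-delimited message per reading over a plain TCP socket:
line = make_delta(12.8, 4.2) + "\r\n"
```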

Sure you can label the figures for humans, but that is not the point. And btw the current schema has stuff like displayName, longName, shortName and gaugeType. Maybe we will have better applications by defining a new way to do this, instead of writing some software that uses what we have already defined?

And I am not saying there is no room for improvement in the current SK. But why reinvent everything? If you want to redo everything, why not start from scratch?

And here I go, talking about stuff that has very little relation to this issue.

tkurki commented 7 years ago

Would it better serve the greater good to continue with extensible schemas at https://github.com/SignalK/specification/issues/206 ?

pod909 commented 7 years ago

The current draft includes the current streaming implementation in full, bar subscriptions.

All we're doing at the moment is taking a step back from a custom subscription model and seeing if it can be left to MQTT.

Throttling via MQTT is proposed. 1 single connection.

Extending the RFC to cover aggregation, notifications and source (what's the point of the current source implementation anyway?) is trivial. How they are generated by a server is for discussion under another RFC.

tkurki commented 7 years ago

By throttling I mean rate limiting: give me updates every X seconds. I am not aware of a mechanism for that in MQTT. QoS is about delivery guarantees.
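Rate limiting of that kind ("give me updates every X seconds") could be implemented on the consumer side independently of the transport. A minimal sketch, keeping only the latest value per path:

```python
import time

# Sketch: consumer-side rate limiting, independent of the transport.
# At most one update per path per `period` seconds is passed through.
class Throttle:
    def __init__(self, period, clock=time.monotonic):
        self.period = period
        self.clock = clock          # injectable clock, useful for testing
        self.last_emit = {}         # path -> time of last emitted update

    def accept(self, path, value):
        """Return the value if this path is due for an update, else None."""
        now = self.clock()
        if now - self.last_emit.get(path, float("-inf")) >= self.period:
            self.last_emit[path] = now
            return value
        return None
```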

pod909 commented 7 years ago

If the provider sends messages with different QoS for different-frequency messages, then the client can choose to subscribe only to the lower-frequency ones.

An individual client or server can slow down the PUBACK, or you can throttle the comms speed so that you don't draw down more data than the client can handle, and the broker then works out what to do with the backlog.

For a really slow trickle the client would just disconnect between receipt attempts.

Controlling the rate of PUBACK seems to be the main method used. And you have granular control over what you subscribe to.

tkurki commented 7 years ago

It doesn't work like that:

Note that a Sender is permitted to send further PUBLISH Packets with different Packet Identifiers while it is waiting to receive acknowledgements.

pod909 commented 7 years ago

http://activemq.apache.org/producer-flow-control.html

Or we find another protocol and add it to the list of recommendations with guidance on how to use it in a standard way for SK.

pod909 commented 7 years ago

Minimal hello...

{
  "achme-thing-1234": {
    "$schema": "_delta",
    "provides": [ ... list of keys and partial keys ... ]
  }
}
pod909 commented 7 years ago

Update to make it fully backwards compatible:

Changed the provider hello to source to align with the current specification and avoid any confusion.

pod909 commented 7 years ago

Inclusion of notifications. Further alignment with existing standard.

pod909 commented 7 years ago

Alignment of the hello across transports

pod909 commented 7 years ago

Handling of default context added. context$ and source$ refs added to UDP for atomic messages, as they may arrive before the source object. Further tidy-up of context and source for mux'ed signals. Path in a leaf is now optional if it can be identified from the source.

pod909 commented 7 years ago

Examples added

pod909 commented 7 years ago

Scope of CR reduced to summarizing the existing delta stream in a formal manner and adding additional transports for delta only. With a review and edit, this RFC may be taken as the streaming section of a v1 standard.

This doesn't get anywhere close to taking advantage of the potential for using MQTT and an off-the-shelf broker in place of the Signal K server. I'll raise those elements of this RFC in a separate exploratory RFC.

rob42 commented 7 years ago

Something we need in hello messages, or at least somewhere, is the ability to send userid and pass token.

I am playing with a security implementation, and can now control fine-grained access nicely, but except for http/ws we can't send a userid or token in a standard way. Adding them to the hello message might work?

jboynes commented 7 years ago

Given the number of options, security warrants a separate RFC, probably more than one (e.g. authn vs. authz, non-repudiation for control operations, etc.).

rob42 commented 7 years ago

Yes, on second thoughts lets keep this RFC focused on its own goals

toqduj commented 7 years ago

Hi all, sorry for missing this interesting discussion for the last month or so. I think it looks very good so far, just a small set of comments to add to those already here. MUST 2.3.3: I agree with others here that a newline and carriage return may make sense for serial, but not for the other delivery methods. Valid JSON already means the message is complete. MUST 7.2.4: Why is the version number required in the path? To avoid issues with non-backwards-compatible SK definition changes?

As for the example by @pod909 here, can someone explain what the difference is between "sends" and "publishes"?

As for delays in sensors, I agree with @timmathews in his comment here, that a value age (mean delay) can be represented by the ISO 8601 "P"-style notation.

As for @rob42's comments on authentication, MQTT uses a username/password authentication scheme. Would that not be sufficient for now?

Cheers,

B.

toqduj commented 7 years ago

Update on the current implementation of a sensor sending SignalK deltas can be found here ( https://github.com/sailoog/openplotter/issues/154 ). Values are being received but are not yet ... "internalised". I'm sending deltas without a hello. At the moment, there is no "device" field in my JSON, and the "timestamp" may not be in the right place (I think I've seen it in the "source" section and separate as well in the SK documentation).

Anyway... progress.

toqduj commented 7 years ago

The two modules, consisting of the WiFi GPS sensor and WiFi environmental sensor bank, are working well. They have run stably for weeks (haven't crashed yet). Sailoog's OpenPlotter software ingests and internalises the incoming data. I'd say this approach works!