homieiot / convention

🏑 The Homie Convention: a lightweight MQTT convention for the IoT
https://homieiot.github.io/

Some thoughts beyond v3.0 #62

Closed zephyrr closed 6 years ago

zephyrr commented 6 years ago

1. Charset

The spec should probably note that all messages are text (not binary data, which MQTT can also carry), and specify the character set. The ID and examples suggest using ASCII, with one exception.

The exception: the units table uses the degree sign, but the example does not (it uses plain "C" for degrees Celsius). This should be consistent and unambiguous.

I kinda like not having to parse UTF-8 (or other extended character sets) on a tiny machine, so "degC", "degF" would work better for me.

2. Angle unit?

There is also a simple degree unit defined, but for what purpose? If the idea is to convey angles, how about "degA" to make it less ambiguous?

3. Missing major units?

The units list doesn't have time (sec) or weight/mass (lb/kg). Flow (e.g. gal/hour) is also very useful for home automation.

4. Protocol version

homie/$homie seems odd; why not be clearer and use homie/$version instead?

5. Small tweaks of wording

Slight rephrase on the required aspect of $checksum: "No, depending of your implementation" => "No, depends on your implementation".

6. Meaning of $fw/checksum

Also, do you have in mind a checksum computed by the toolchain before flashing and somehow embedded, or a checksum calculated by the firmware itself over its own flash memory?

7. Better name in example

In the device attributes example:

homie/686f6d6965/$fw/name → 1.0.0
homie/686f6d6965/$fw/version → 1.0.0

Having a real name, like "temphum.ino", might make better sense.

8. $property ranges

Nodes the device exposes, with format id separated by a , if there are multiple nodes. For ranges, define the range after the id, within [] and separated by a -.

The ranges need to be better defined. The first thing is that I initially thought it meant something like: 0,3,[5-7],10 as shorthand for 0,3,5,6,7,10 but a much later example clarified:

homie/ledstrip-device/ledstrip/$properties → led[1-3]
homie/ledstrip-device/ledstrip/led_1/$name → Red LEDs
homie/ledstrip-device/ledstrip/led_2/$name → Green LEDs
homie/ledstrip-device/ledstrip/led_3/$name → Blue LEDs

So it's meant to generate/define multiple strings from a pattern. In that case, it needs to be defined a bit better, like "[start#-end#] where start# and end# are non-negative integers with end# >= start#, and identifiers will be generated by iterating from start# to end#". So it would not work for, say, "base[01-15]" to describe base01, base02, ... base15, but only for base1, base2, ... base15, right?
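To make the proposed wording concrete, here is a minimal sketch of how a Controller might expand such a value. The function name `expand_properties` is hypothetical, and the `_` separator is an assumption taken from the led_1 ... led_3 example rather than from any formal rule in the convention.

```python
import re

# Hypothetical pattern for "<id>[start-end]"; not taken from the Homie spec.
_RANGE = re.compile(r"^(?P<base>[^\[\],]+)\[(?P<start>\d+)-(?P<end>\d+)\]$")


def expand_properties(value: str) -> list[str]:
    """Expand a comma-separated $properties value into individual ids."""
    ids = []
    for item in value.split(","):
        item = item.strip()
        match = _RANGE.match(item)
        if match is None:
            # Plain id without a range suffix.
            ids.append(item)
            continue
        start, end = int(match["start"]), int(match["end"])
        if end < start:
            raise ValueError(f"range end before start in {item!r}")
        # Assumption: ids are built as "<base>_<index>", as in led_1..led_3.
        ids.extend(f"{match['base']}_{i}" for i in range(start, end + 1))
    return ids


print(expand_properties("led[1-3]"))
```

Because the bounds are parsed as integers, "base[01-15]" yields base_1 rather than base_01 in this sketch, which is exactly the ambiguity raised above.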

This seems like it's kinda conflating the concept of naming and the concept of [numerically indexed] arrays.

In your example, I actually prefer enumerating "led_red,led_green,led_blue" to "led[1-3]" meaning led_1, led_2, led_3. The latter is a few characters shorter, but the attributes of each need to be described anyway, so that's hardly a large overhead.

What if you get "led[0-149]" (eg: a strip of 50 RGB leds)? Is the controller supposed to expand that into 150 id's and deal with it meaningfully?

I'm thinking that there should be a small and enumerable number of property id's. If you want an array of Device-definable size, add that functionality explicitly rather than sort of handling it in the $property id's.

9. Naming.
In your examples:

homie/686f6d6965/$name → Bedroom temperature sensor
homie/686f6d6965/$nodes → temperature,humidity # (I added this to the examples)
homie/686f6d6965/temperature/$name → Bedroom Temperature Node
homie/686f6d6965/temperature/degrees/$name → Bedroom Temperature
homie/686f6d6965/humidity/$name → Bedroom Humidity Node
homie/686f6d6965/humidity/percentage/$name → Bedroom Humidity

This redundantly describes things at every level, which may or may not be appropriate. We get

homie/686f6d6965/humidity/percentage/ mapping to homie/"Bedroom temperature sensor"/"Bedroom Humidity Node"/"Bedroom Humidity"

I wonder if a more hierarchical decomposition with less redundancy might work better?

homie/686f6d6965/$name → Bedroom sensor 1
homie/686f6d6965/temperature/$name → Environmental
homie/686f6d6965/temperature/degrees/$name → Temperature
homie/686f6d6965/humidity/percentage/$name → Humidity

This uses hierarchical context:

homie/"Bedroom sensor 1"/"Environmental"/"Humidity"

One effect of this is that one could replicate the node definitions for a standard sensor device:

"Kitchen sensor 2"/"Environmental"/"Humidity" "Living room sensor 1"/"Environmental"/"Humidity"

where the latter two levels were the same as for the bedroom; only the device name needs to change between the otherwise identical devices. And if you move a physical device to another room, all you have to change is the device name; you don't have to chase down all the substructure names.

This may be just a suggested change in the examples, suggesting to the user that there IS hierarchical structure or context available when a Controller is understanding a device, so there's no need to treat each property as standalone.
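As a rough illustration of using that hierarchical context, a Controller could build a display label by joining the $name of each level. This is only a sketch: `display_path` is a hypothetical helper, and the node id `env` and its topics are made up for the example.

```python
def display_path(names: dict[str, str], topic: str) -> str:
    """Join the $name of the device, node and property levels of a topic."""
    levels = topic.strip("/").split("/")
    parts = []
    # Skip the base level ("homie"); walk device, node, property in order.
    for depth in range(2, len(levels) + 1):
        name = names.get("/".join(levels[:depth]) + "/$name")
        # Fall back to the raw id when no $name was published for a level.
        parts.append(name if name is not None else levels[depth - 1])
    return " / ".join(parts)


# Hypothetical retained $name values; only the device name is room-specific.
names = {
    "homie/686f6d6965/$name": "Bedroom sensor 1",
    "homie/686f6d6965/env/$name": "Environmental",
    "homie/686f6d6965/env/humidity/$name": "Humidity",
}

print(display_path(names, "homie/686f6d6965/env/humidity"))
```

Moving the device to another room then means changing a single $name entry; the node and property names stay as they are.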

10. Formats

This is somewhat complicated to parse. You could skip ahead and check: if it has a colon then it's a range, if it has a comma then it's an enum, and if it starts with "regex:" then it's a regex. Or you could check the datatype and parse the format accordingly. I think it would be good to think more on the purpose of the $format and how it would be used. Is it to tell the Controller which formats should be considered errors for a read? Or is it to tell the Controller which formats the Device will accept so the Controller can detect errors on intended writes? In the case of a range or an enum, it could be used to populate choices for a user interface, but for regex's that's kinda ungainly.

If you are allowing regex's then you need to define which kind of regex's are allowed (perl? python? php has two kinds. etc) because they come in flavors. And parsing a regex is going to be hard on the Device end (maybe the Controller end too), as the libraries are large. And they can be notoriously hard for humans to read, create and avoid mistakes.

I kinda suggest just dropping the regex's as too much complexity for too little gain. The simpler ranges and enums are much easier and more valuable. However, I might suggest defining ranges with a comma rather than a colon: "min#,max#" just to keep things simple, assuming that you will be looking at the $datatype to interpret the components of the $format, rather than parsing the $format itself; in some cases this allows more code reuse.
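A datatype-driven parser along those lines could look roughly like this. It is a sketch, not the convention's definition: `parse_format` is a hypothetical name, the datatype strings "integer", "float" and "enum" are assumed, and the range separator is left as a parameter so either the colon or the suggested comma can be tried.

```python
def parse_format(datatype: str, fmt: str, range_sep: str = ":"):
    """Interpret $format according to $datatype instead of sniffing its text."""
    if datatype in ("integer", "float"):
        low_text, sep, high_text = fmt.partition(range_sep)
        if not sep:
            raise ValueError(f"expected 'min{range_sep}max', got {fmt!r}")
        cast = int if datatype == "integer" else float
        low, high = cast(low_text), cast(high_text)
        if low > high:
            raise ValueError(f"min greater than max in {fmt!r}")
        return (low, high)
    if datatype == "enum":
        # No escaping handled here; every comma is a separator.
        return fmt.split(",")
    raise ValueError(f"no $format handling for datatype {datatype!r}")


print(parse_format("integer", "0:100"))
print(parse_format("enum", "on,off"))
print(parse_format("float", "-1.5,2.5", range_sep=","))
```

With the datatype known up front, the same split-and-cast code serves both numeric types, which is the code reuse mentioned above.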

11. Comma escaping in enumeration $format

The spec should give an example of comma escaping for enum $format (the phrasing of the e.g. in the description sounds as if it is an example of that, but it is not). Is the idea that an enum whose second value is a comma looks like this: "first,,,,third" (first and fourth comma = separator, middle two commas = embedded comma)? Or three commas for a first or last value? An escape like a \ prefix is easier to understand and to parse, like "first,\,,third".

(If one even really wants a comma as an enum value; maybe drop the concept of comma as an enum value and thus the need for escaping?)
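For comparison, the backslash variant is short to implement. The sketch below assumes the escaping rule suggested here (a backslash makes the next character literal), which is not something the convention defines; `split_enum` is a hypothetical name.

```python
def split_enum(fmt: str) -> list[str]:
    """Split an enum $format on commas, treating a backslash as an escape."""
    values = []
    current = []
    escaped = False
    for ch in fmt:
        if escaped:
            # Character after a backslash is taken literally.
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == ",":
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
    if escaped:
        raise ValueError("dangling backslash at end of $format")
    values.append("".join(current))
    return values


print(split_enum(r"first,\,,third"))
```

A single left-to-right pass is enough, and a literal backslash can be written as two backslashes under the same rule.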

12. Consistent set/read state names

Towards the end I read:

homie/kitchen-light/light/power/set ← on
homie/kitchen-light/light/power → true

Why on vs true? Seems like an unnecessary confusion. Pick one or the other and stick with it. Either use a boolean for set AND read as "true" or "false", or use an enum (on, off) and use that for setting and reading.

We could consider the boolean case as just a predefined enum of (true, false).

13. Special ID's and hierarchy

Note for things like $stats and $fw we see:

$fw/name $fw/version $fw/checksum

In these cases, would it make more sense to use an underscore instead?

$fw_name $fw_version $fw_checksum

That is, do you really want or need to use an extra level of MQTT hierarchy for these?

If so, give an example to help people understand how it's useful to be organized that way. And consider being more consistent by using $ prefixed id's for the substructure too, since these are predefined magic names rather than arbitrary user chosen ones:

$fw/$name $fw/$version $fw/$checksum

(By contrast, "alert" in the following is NOT a predefined magic name, but is user chosen, so it's not implied that all levels after the first $ prefixed id are also magic predefined names)

homie/$broadcast/alert ← Intruder detected

ajxn commented 6 years ago

Units: why not use all metric, and then let the client programs convert them to any unit the user prefers? It is just more complicated to have to handle different units, especially on small devices. If NASA can, so can we. :-)

zephyrr commented 6 years ago

Re: ajxn's suggestion of using all metric

Keeping the units all metric would simplify the Controller, as he suggests. My only hesitation is that converting units might need the Device to pull in a library for floating point math and conversion to strings, which it might otherwise not need. However let's assume that's OK and that devices dealing with this protocol (conventions) can handle the extra space. (In other cases, sensor values need some conversion anyway).

If there is a switch to all metric, let's have some description about significant digits and preserving enough accuracy without excess. So for example if converting from a sensor reporting in degF a value of 95.2 degrees, that should be converted to degC with enough precision that it will round back to the same value (95.2) if converted back to degF for display - but does not need 15 digits of accuracy in degC if there were only 3 digits of precision in the measured degF. In this example, 35.11 degC is fine, no need to report 35.111111111 degC.
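One way to make "enough precision to round back" checkable is to search for the fewest decimals that survive the round trip. This is a sketch using Python's `decimal` module; `shortest_celsius` and its `max_decimals` limit are hypothetical, and the result is per reading, not a general rule for how many digits a Device should publish.

```python
from decimal import Decimal


def f_to_c(deg_f: Decimal) -> Decimal:
    return (deg_f - 32) * 5 / 9


def c_to_f(deg_c: Decimal) -> Decimal:
    return deg_c * 9 / 5 + 32


def shortest_celsius(reading_f: str, max_decimals: int = 6) -> Decimal:
    """Fewest-decimal degC value that converts back to the same degF reading."""
    deg_f = Decimal(reading_f)
    # Number of decimals in the original reading, e.g. 1 for "95.2".
    f_places = max(0, -deg_f.as_tuple().exponent)
    f_quantum = Decimal(10) ** -f_places
    for places in range(max_decimals + 1):
        deg_c = f_to_c(deg_f).quantize(Decimal(10) ** -places)
        if c_to_f(deg_c).quantize(f_quantum) == deg_f:
            return deg_c
    raise ValueError("no round-trip-safe value within max_decimals")


print(shortest_celsius("95.2"))
```

For the 95.2 degF reading, one decimal in degC happens to survive the round trip, and the 35.11 degC mentioned above does as well. Since 0.1 degC spans 0.18 degF, other readings can need the extra digit.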

I see the purpose of reporting units as two-fold: (1) if the unit is recognized by the Controller the value can potentially be interpreted or converted as needed; (2) whether recognized or not, the unit text could be displayed verbatim along with the reading for human interpretation. Predefining a set of units supports the first case, limiting the number of units the Controller should be able to handle, and avoiding a problem with the Device reporting in unrecognized units (or spelling of units) despite there being workable recognized units.

Keeping the number of predefined & recognized units relatively small simplifies the Controller. Using only metric does that.

timpur commented 6 years ago

Sorry will look at soon, when back from hols.... (next couple of days)

ingoogni commented 6 years ago

Regarding 3: It's quite a task to list all possible units. One could drop the whole list and say something like "only official SI units". IMHO sensor and actuator devices should only work with SI units; if a user wants one of the oddities, have a (dedicated) client somewhere in the chain convert it.

Regarding precision and digits, the SI prefixes make it possible to express values at different resolutions, so you can 'push' a float into an int. Microseconds are an obvious example, but microdegrees_C is also valid, just as k°C is, although they are seldom used. Another way is to keep the normal unit and also present a resolution. The Dutch weather institute, for example, presents temperatures in its datasets as 178 in units of 0.1 °C.
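The "normal unit plus resolution" idea can be sketched in a few lines. The helper names are hypothetical, and how a resolution would actually be published (for example as an extra attribute) is not defined anywhere in this thread.

```python
from decimal import Decimal


def decode_scaled(raw: int, resolution: str) -> Decimal:
    """Turn an integer reading plus its resolution into the real value."""
    return Decimal(raw) * Decimal(resolution)


def encode_scaled(value: str, resolution: str) -> int:
    """Turn a decimal value into an integer count of resolution steps."""
    steps = Decimal(value) / Decimal(resolution)
    if steps != steps.to_integral_value():
        raise ValueError(f"{value} is not a multiple of {resolution}")
    return int(steps)


# 178 at a resolution of 0.1 degC is 17.8 degC.
print(decode_scaled(178, "0.1"))
print(encode_scaled("17.8", "0.1"))
```

The Device only ever handles the integer; the decimal arithmetic shown here would live on the Controller side.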

Regarding 9: This is one of the reasons for proposing that Homie allow/implement semantic topics instead of device-oriented ones. See issue #57.

regarding 10. agree.

timpur commented 6 years ago

Can I confirm that this is about the redesign branch (3.0)? 2.1 is not a release candidate and will soon be replaced by 3.0. (Yes, a cleanup of branches is needed and noted!)

timpur commented 6 years ago

@marvinroger @ingoogni @euphi @zephyrr @ajxn Why can we set a starting range? This kinda seems pointless to me and should be fixed?

stefan-muc commented 6 years ago

Regarding 1 and 2: Units representation: It might be interesting to have a look at nholthaus' units project; he had the same problem of having to code an ASCII representation of many units. See here and all following UNIT_ADD macros: https://github.com/nholthaus/units/blob/master/include/units.h#L3245 I think it would be best to be consistent with his convention, because it's quite complete and unambiguous.

I am also not keen on making UTF-8 mandatory, as MQTT is designed for low-level communication with devices without much power. Making it mandatory could also introduce many bugs from not fully compliant UTF-8 libraries... not even Apple can do UTF-8 without bugs, so how should a tiny microcontroller be able to do it?

ingoogni commented 6 years ago

The MQTT spec has utf-8 all over it, in many places it is required.

timpur commented 6 years ago

Regarding ranges: if you can set both a start and an end, should we allow users to have a property that goes from, e.g., 1 to 5 and 6 to 10 as separate ranges? This would allow them to have some lights controlled in different ways based on the index. Otherwise I'm thinking that specifying the start adds unneeded complexity to the spec for no real gain. Should we make more use of this or remove it? Or am I missing something?

Also, we should talk about stats being implemented as a node. I think this makes more sense and makes it easier to discover all the stats, which might change from implementation to implementation.

euphi commented 6 years ago
davidgraeff commented 6 years ago

Are there any changes to homie 3.0 expected in the next days or is it safe to implement in the current state?

timpur commented 6 years ago

There are some changes. I'll try to get my act together and start finalising things. I think implementing stats as a node is the biggest change.

Everyone here, let's talk.

lorenwest commented 6 years ago

Hey @davidgraeff what language/device are you planning on implementing?

davidgraeff commented 6 years ago

Java, Eclipse Smarthome

ThomDietrich commented 6 years ago

Great to hear you are there now, David. Homie v3.0.0 is safe to be implemented, as the most important aspects for autodiscovery are covered by the convention in its current version. I had a feeling you'd soon get to this point and helped getting it ready.

timpur commented 6 years ago

@ThomDietrich, what do you mean by "...and helped getting it ready." ?

ThomDietrich commented 6 years ago

Not sure what you are asking. Bringing Homie to a state where it is fit to e.g. be supported by something like ESH, autodiscoverable most importantly. The emphasis is of course on "helped", as many are to thank for the recent advancements.

timpur commented 6 years ago

ESH isn't the only platform out there, and Homie shouldn't cater to only one of these autodiscovery platforms; it should stand on its own and not have bias. We should be implementing things that improve Homie overall and not just for the sake of supporting autodiscovery for one product. Shaping Homie to fit a platform like ESH is wrong IMHO.

I hope this is shared throughout the community.

zephyrr commented 6 years ago

I would like to raise attention to point #13 again.

Why is there a $ in front of some levels of the MQTT hierarchy but not others?

One possible convention: Use $xxx iff $xxx is a magic name, a fixed and constant string defined in the Homie Spec; omit $ if the user can configure the string or name at that level. Thus we would expect $fw/$version if both strings are fixed. If a level is a user chosen identifier, do not use a $.

Another possible convention: Use $ to indicate that this level and all sublevels are fixed and constant strings defined in the standard. So "/$fw/version" implies that "$fw" and the later "version" are both fixed and cannot be chosen by the user. Implication: No user chosen identifiers would be allowed to the right of any $ prefixed level.

Suggestion: pick one of these (or another); describe that convention in the standard; and then use it consistently.

ThomDietrich commented 6 years ago

Please do not mix issues. @zephyrr @davidgraeff I think both your requests would benefit from their own discussion threads, i.e. issues. Edit: @zephyrr, sorry misinterpreted your #13 mentioning. I'll try answering in my next message.

@timpur not sure where you got that idea from. Nothing here is specific to ESH. Check the "Motivation" chapter of our convention. Autodiscoverability (being able to understand all data and action points of one client based on the published information and a conceptual knowledge of the convention) is one of the most important properties/goals of the convention.

@davidgraeff I'm sure you are aware of this: The convention will evolve over time but clients will stick to certain convention versions. How do you imagine supporting a wide range of versions?

ThomDietrich commented 6 years ago

@zephyrr I personally did not find the time to address all your comments yet. Imho it would be better to break them down to multiple issues so we can focus the discussion.

Number 13 of your list is imho looking at the topic differently than the convention:

Attributes are represented by topic identifier starting with $. The precise definition of attributes is important for the automatic discovery of devices following the Homie convention.

Attributes represent metadata on top of the application-specific data and data-tree. Where e.g. homie/super-car/engine/temperature represents the temperature of the car engine, the topic homie/super-car/engine/temperature/$unit informs about the unit of this datapoint. Attributes are not necessarily about being constant, even though that is implied by their nature.

davidgraeff commented 6 years ago

"How do you imagine supporting a wide range of versions?"

I don't. MQTT wouldn't be a success story, if the protocol changed every now and then. It only got extended over time. (Although a real break is to be expected with MQTTv5).

The same goes with homie. I will implement whatever is current at that moment. I will add a check to require at least version 3.x up to but not including 4.0. So please stay compatible in this version, according to semver. If you are going to redefine the stat node, do that now ;)
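The version gate described here ("at least 3.x, up to but not including 4.0") could be checked along these lines. This is a sketch: `is_supported` is a hypothetical name, and it assumes the device publishes a dotted numeric version string such as "3.0.0" in $homie.

```python
def is_supported(homie_version: str, major: int = 3) -> bool:
    """Accept any version within the given major line, e.g. 3.0.0 or 3.1."""
    try:
        numbers = [int(part) for part in homie_version.strip().split(".")]
    except ValueError:
        # Non-numeric or empty components: treat as unsupported.
        return False
    return numbers[0] == major


for version in ("3.0.0", "3.1", "4.0.0", "2.1.0"):
    print(version, is_supported(version))
```

Only the major component is compared, which relies on the convention staying backwards compatible within 3.x as requested above.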

ThomDietrich commented 6 years ago

@davidgraeff not sure how I feel about my understanding of your answer. If Homie v3.1 adds a new set of datapoints the Binding should be able to interpret and present them. Ideally this should happen based on the $homie version info rather than on best effort.

davidgraeff commented 6 years ago

@ThomDietrich I expect some homie implementations to skip certain topics ($fw,$implementation,etc) and I'm also expecting the spec to grow some more topics. That's not an issue at all for the implementation that I'm realizing right now.

It will break, if you change topic hierarchies or key topics. And that's what I mean.

Btw, I'm struggling with node instances (arrays) and may leave them out for now. It is not obvious to me whether node attributes are repeated for each instance. And if it is an array, what does the non-indexed node consist of? Is it only the attributes, or can it have a value as well? Example:

homie/mydevice/node
homie/mydevice/node/$name="name"
homie/mydevice/node_0/$name="name0"

Is this a valid thing: homie/mydevice/node and homie/mydevice/node/set?

ThomDietrich commented 6 years ago

Let's continue the ESH implementation discussion in https://github.com/homieiot/convention/issues/85

davidgraeff commented 6 years ago

@marvinroger, @ThomDietrich: Should those non-specific issues be closed?

marvinroger commented 6 years ago

I think so. Umbrella issues are hard to track and discuss in a structured way.

davidgraeff commented 6 years ago

Closing this then. I invite every participant to create a concrete new issue for a non discussed topic.