sunspec / models

SunSpec Model Definitions
Apache License 2.0

Can "size" be mandatory? #198

Closed wz2b closed 3 years ago

wz2b commented 3 years ago

I'm trying to do the safest possible automated processing of these json files. The encoded length (in registers) was not a required attribute in the XML schema, but it was usually there. It's not a required attribute of a "point" object in the XML schema, either.

However, type is fairly restricted:

                "enum": ["int16", "int32", "int64", "raw16", "uint16", "uint32", "uint64" ,"acc16", "acc32",
                         "acc64", "bitfield16", "bitfield32", "bitfield64", "enum16", "enum32", "float32",
                         "float64", "string", "sf", "pad", "ipaddr", "ipv6addr", "eui48", "sunssf", "count"]

From those types one can build a map of "sunspec type string" to "encoded word length". The mapping is obvious for the types ending in 16, 32, or 64. For "string" there is a "size" attribute; for the others I had to work it out by looking at the XML schema. Some of them can be inferred, but for types like "sunssf" and "count" you have to look elsewhere; from examples in smxd, I believe those are a single register (16 bits). "string" is variable-length, so "size" has to be present for it (but I don't think you can make "size" mandatory for just one enum value without conditional features that JSON schema may not have). Currently I believe "pad" has to be a size of 1 register, but making "size" mandatory would allow "pad" to be any size (if, for some strange reason, somebody needed 64-bit alignment, for example).
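To illustrate, here is a sketch (in Python, purely illustrative, not part of this repo) of the kind of map every software author currently has to maintain. The sizes for the *16/*32/*64 families follow from the suffix; the entries for `sf`, `sunssf`, `count`, `pad`, `ipaddr`, `ipv6addr`, and `eui48` are my reading of the XML schema as described above, not something the JSON schema states:

```python
# Register sizes (1 register = 16 bits) for SunSpec point types.
# "string" is variable-length, so it has no fixed entry and must
# carry an explicit "size" in the model definition.
SUNSPEC_TYPE_REGISTERS = {
    "int16": 1, "uint16": 1, "acc16": 1, "bitfield16": 1, "enum16": 1,
    "raw16": 1, "sf": 1, "sunssf": 1, "pad": 1, "count": 1,
    "int32": 2, "uint32": 2, "acc32": 2, "bitfield32": 2, "enum32": 2,
    "float32": 2, "ipaddr": 2,
    "int64": 4, "uint64": 4, "acc64": 4, "bitfield64": 4, "float64": 4,
    "eui48": 4,
    "ipv6addr": 8,
}

def point_size(point: dict) -> int:
    """Encoded length of a point in 16-bit registers, preferring an
    explicit "size" attribute when the definition provides one."""
    if "size" in point:
        return point["size"]
    return SUNSPEC_TYPE_REGISTERS[point["type"]]
```

If "size" were mandatory, the lookup table (and the guesswork behind it) would disappear entirely.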

For the sake of automated processing, I would like to suggest that the encoded length be made a mandatory element of the schema. Software authors would then not have to maintain their own map of type name to encoded length, and it would also be a hedge against the addition of future types: if a type "xyzzy" were added and my software didn't recognize it, it could still step over it without harm and without disrupting processing of the remainder of the message.
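A minimal sketch of that forward-compatible walk, assuming "size" (in 16-bit registers) is present on every point. The decoder table and the point/register layout here are illustrative, not the actual SunSpec model format:

```python
# Decoders for the types this hypothetical consumer understands.
KNOWN_DECODERS = {
    "uint16": lambda words: words[0],
    "uint32": lambda words: (words[0] << 16) | words[1],
}

def walk_points(points, registers):
    """Yield (name, value) for every point we can decode.

    An unknown type ("xyzzy") costs us one point, not the rest of
    the message, because its mandatory size tells us how far to skip.
    """
    offset = 0
    for point in points:
        size = point["size"]  # mandatory under this proposal
        words = registers[offset:offset + size]
        decoder = KNOWN_DECODERS.get(point["type"])
        if decoder is not None:
            yield point["name"], decoder(words)
        # else: step over the unknown point without harm
        offset += size
```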

After discussion with @bobfox I'm fine with this project getting away from .xsd in lieu of the json schema being authoritative. I do think, though, that it is worth some continued tightening up of the JSON schema.

Thoughts?

bobfox commented 3 years ago

The rationale for the current approach was to not duplicate information and to avoid the errors that duplication can lead to. Having said that, I certainly get your point. I have worked with routines (as recently as today) that need the size of each element, and figuring it out is potentially a multi-step process. I have resisted this change, but I concede it may be the best way to go when everything is considered. If there is no opposition, I am willing to support this change.

wz2b commented 3 years ago

Yeah, I understand your concern about normalizing out redundancy; I have the same preferential bent. In terms of having long-lived software, I just want to be able to handle forward compatibility similar to the way protobufs do it: even if I don't know the type of the item in front of me, it tells me how many bytes long it is, so I can just skip it and not destroy everything beyond.

If there were some other way (like a separate map of sizes keyed by the SunSpec data type) that would be fine too, but within JSON Schema I can't think of one.
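For reference, JSON Schema draft-07 can express a conditional requirement with `if`/`then`; a hedged sketch of requiring "size" only for string points (property names assumed to match the point schema in this repo):

```json
{
    "if": { "properties": { "type": { "const": "string" } } },
    "then": { "required": ["size"] }
}
```

Making "size" unconditionally required is still simpler and covers the unknown-type case, so this is only a note that the conditional is expressible.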

bobfox commented 3 years ago

I will poll the SunSpec Modbus workgroup this week to see if there are any objections to this change.

jsbosch commented 3 years ago

If not for strings, we could extrapolate size from the data type, but strings throw a spanner into that idea. So instead of special-casing string point sizes versus all other point sizes, I just treat the size as mandatory and use the type only to determine how to process the data. So you won't hear an objection from me. (I'd second the idea, actually.)

BTW, I don't think Pad is used any more, at least not in the newer models.

wz2b commented 3 years ago

Some questions about sizes:

If count is what I think it is, then its size could change depending on what you're counting, so why not create count16, count32, and count64? If count is always 1 register, then I'm not sure why it exists. If count is variable-width (which is not totally crazy, if that's the intent), then why didn't the acc* types get the same treatment?

I just want to be clear before I go much further ....

--C

bobfox commented 3 years ago

Count was intended to support additional semantics for points used as a repeating count for groups of points. I think it is only used in model 160. We did not use it in the updated 700 series models, but we need to support it as it is an existing model. I am not sure it will be used in any models going forward. The acc series has been deprecated in usage; in the updated models (701) we have adopted the uintxx series for accumulators instead of acc. Now that we have a static designation for points, the sunssf designation may not be needed, but I have not done any analysis. It continues to be used.

bobfox commented 3 years ago

There have been no objections to making size mandatory in the JSON model definitions. We will regenerate all of the models with the size provided for each point. Going forward we will enforce the mandatory size when generating the model definitions. We will also update the schema.

wz2b commented 3 years ago

Fantastic. I know it's extra work, but it will help make automated parsing more robust. Thank you.