w3c / wot-binding-templates

Web of Things (WoT) Binding Templates
http://w3c.github.io/wot-binding-templates/

[Modbus-binding] type conversion based on byte and word order #293

Closed sebastiankb closed 11 months ago

sebastiankb commented 1 year ago

Since PR #161 is looking for a generic solution to express the endianness via content type, and no practical solution seems to be in sight, I would like to put the focus here back on Modbus itself.

In my view, the most practical way would be to introduce Modbus-specific term(s) that describe the proper type conversion from or to a byte stream. That means we need information about the data type (e.g., signed or unsigned) and the byte and word order. Here are some proposals (please note that the term names and values are just proposals):

Proposal I (One term fits all)

After studying some Modbus libraries in different programming languages, it seems not unusual to combine all related type conversion information into one term or function (e.g., libmodbus or Buffer from Node.js).

That means for a TD we can introduce a term modbus:type that takes values such as uint32BB, uint32LB, uint32BL, uint32LL, floatBB, floatLB, floatBL, etc.

The first 'B' or 'L' character in the type value stands for bytes in big endian (=B) or little endian (=L) order. The second 'B' or 'L' character in the type value stands for word in big endian (=B) or little endian (=L) order.

This results in a huge list of combinations and would cause a big switch statement in the implementing code.
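
As a rough illustration of Proposal I, and of one way the large switch statement might be avoided, the sketch below splits a combined value such as uint32LB into its base type and the two order letters, normalizes the payload to big-endian, and only then decodes it. The function name `decode`, the `_FORMATS` table, and the subset of supported types are assumptions made for this example; they are not part of any proposal or library.

```python
import struct

# Assumed subset of base types, each mapped to a big-endian struct format.
_FORMATS = {
    "uint16": ">H", "int16": ">h",
    "uint32": ">I", "int32": ">i",
    "float": ">f",
}


def decode(modbus_type: str, payload: bytes):
    """Decode payload according to a combined value such as 'uint32LB'."""
    base = modbus_type[:-2]
    byte_order = modbus_type[-2:-1]  # first letter: byte order inside a word
    word_order = modbus_type[-1:]    # second letter: order of the words
    if base not in _FORMATS or byte_order not in ("B", "L") or word_order not in ("B", "L"):
        raise ValueError(f"unsupported modbus:type value: {modbus_type!r}")
    fmt = _FORMATS[base]
    if len(payload) != struct.calcsize(fmt):
        raise ValueError(f"{base} needs {struct.calcsize(fmt)} bytes, got {len(payload)}")

    # Split into 16-bit words, then undo the byte and word reordering.
    words = [payload[i:i + 2] for i in range(0, len(payload), 2)]
    if byte_order == "L":
        words = [w[::-1] for w in words]
    if word_order == "L":
        words.reverse()
    return struct.unpack(fmt, b"".join(words))[0]
```

With this sketch, `decode("uint32LL", bytes.fromhex("78563412"))` and `decode("uint32BB", bytes.fromhex("12345678"))` both yield `0x12345678`. Normalizing first keeps the number of code paths at two small conditionals instead of one branch per combination.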

Proposal II (Decoupling type and order)

The modbus:type only carries the data type with the sign (uint8, int8, uint16,...) and we have another term modbus:endian that provides the byte and word ordering by B, L, BB, BL, etc. (please note that for some types, such as uint8, a word ordering does not apply).

There will be fewer switch statements, but some of them will be nested. This proposal allows working with default assumptions such as B or BB when modbus:endian is not present.

Proposal III (fine-grained)

Besides modbus:type there are terms for modbus:byteOrder and modbus:wordOrder. Both will take B and L for big and little endianness.

More nested switch statements will be needed. But we can also work with default assumptions when modbus:byteOrder and/or modbus:wordOrder are not present.
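
The default handling described for Proposal III could look roughly like the following. This is only a sketch: the form is modelled as a plain dictionary, the `Conversion` class and `conversion_from_form` function are hypothetical names, and big endian (B) is assumed as the default for both orders, which the thread mentions only as a possible default assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversion:
    data_type: str
    byte_order: str
    word_order: str


def conversion_from_form(form: dict) -> Conversion:
    """Read the three proposed terms from a form, applying assumed defaults."""
    data_type = form["modbus:type"]  # required in this sketch
    byte_order = form.get("modbus:byteOrder", "B")  # assumed default: big endian
    word_order = form.get("modbus:wordOrder", "B")  # assumed default: big endian
    for term, value in (("modbus:byteOrder", byte_order), ("modbus:wordOrder", word_order)):
        if value not in ("B", "L"):
            raise ValueError(f"{term} must be 'B' or 'L', got {value!r}")
    return Conversion(data_type, byte_order, word_order)
```

For example, `conversion_from_form({"modbus:type": "uint32", "modbus:wordOrder": "L"})` resolves to a uint32 with big-endian bytes and little-endian word order, because the missing modbus:byteOrder falls back to the assumed default.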

What do you think of the proposals? Are there any more?

a-hennig commented 1 year ago

I am in favour of a variant of #1 (to avoid confusion, make the second letter an "S" for swap). Reason:

relu91 commented 1 year ago

The problem that I'm seeing in introducing the new Modbus term is how it plays along with the others. For example, in my understanding, combining modbus:type with content types that are not application/octet-stream does not make sense. Therefore, we impose a new validation constraint on consumers. Moreover, we have to keep in mind that with proposal 1 we are also describing the datatype (int, uint, float, etc.); this implies that modbus:type: "float" should not be used with type: "integer" (unless you want to support downcasting).

As always, I think we can have mid-term solutions, but the shortcomings above should be addressed in the specification. Plus I would also add a note saying that the term is experimental and we are looking for alternatives.

sebastiankb commented 1 year ago

For example, in my understanding, combining modbus:type with content types that are not application/octet-stream does not make sense.

In my view, the byte and word order metadata only makes sense when octet-stream is used as the content type. Maybe we can define this content type as the default assumption for the Modbus binding?

Moreover, we have to keep in mind that with proposal 1 we are also describing the datatype (int, uint, float, etc.); this implies that modbus:type: "float" should not be used with type: "integer" (unless you want to support downcasting).

Good point! The modbus type should not be in conflict with the interaction type. We have a similar situation with readOnly, writeOnly, and observable that may not be aligned with forms op values. In my view, this can be handled by a simple consensus check.
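
A check of this kind between modbus:type and the data schema type could be sketched as follows. The function name `is_consistent` and the rule set are assumptions for illustration: integer-like Modbus types are accepted for both "integer" and "number", floating-point ones only for "number", and other schema types are not judged at all.

```python
def is_consistent(schema_type: str, modbus_type: str) -> bool:
    """Return False when modbus:type clearly conflicts with the schema type."""
    is_int = modbus_type.startswith(("int", "uint"))
    is_float = modbus_type.startswith(("float", "double"))
    if schema_type == "integer":
        return is_int
    if schema_type == "number":
        return is_int or is_float
    # Other schema types (string, object, ...) are outside this sketch.
    return True
```

Here `is_consistent("integer", "float")` returns `False`, which matches the conflict raised above, while `is_consistent("number", "float")` and `is_consistent("integer", "uint32")` return `True`.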

sebastiankb commented 1 year ago

I just found this nice overview here which we may simply follow as value options for modbus:type:

[image: overview table of value options]

egekorkan commented 1 year ago

Here is what this looks like in the Siemens sayWoT implementation:

sebastiankb commented 1 year ago

Here is a regular expression of the table above:

^((u?int(8|16|32|64)|float|double)(be|le|sw|sb)|string(le)?)$
where:
u=unsigned (absence means signed)
be=Big Endian
le=Little Endian
sw=Big Endian Word Swap
sb=Big Endian Byte Swap
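
The regular expression above can be tried out directly, for instance with Python's `re` module. The pattern is copied unchanged; the names `MODBUS_TYPE` and `is_valid_type` are hypothetical and only serve this sketch.

```python
import re

# Pattern copied from the comment above.
MODBUS_TYPE = re.compile(
    r"^((u?int(8|16|32|64)|float|double)(be|le|sw|sb)|string(le)?)$"
)


def is_valid_type(value: str) -> bool:
    """Check whether value is one of the combinations the pattern allows."""
    return MODBUS_TYPE.fullmatch(value) is not None
```

Under this pattern, `uint16be`, `floatsw`, `string`, and `stringle` are accepted, while `uint16` (no order suffix) and `stringbe` are rejected.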

egekorkan commented 1 year ago

Call of 22.11:

egekorkan commented 1 year ago

Known mechanisms (to be extended):

lu-zero commented 1 year ago

Other two libraries

egekorkan commented 1 year ago

@lu-zero I have gone into the source code of the two libraries above to see how they do it. It seems that neither of them, nor a Python Modbus library, does the word swapping. I mostly see endianness, and that is handled with a specific function argument. This means that it is the user of the library who needs to handle the byte manipulation. After this small research, my opinion is to go with the 3-word approach for the following reasons:

As @danielpeintner said, we need to specify which combinations are not allowed. Thinking again, I am not sure why a double type with a word swap is not mentioned above, nor why a longer string has no word swap. Alternatively, we can just allow all combinations, since a given combination may really be what someone has in their device. We also do not prohibit the usage of the maximum keyword together with type:object at the data schema level.

Reasons to go with 1 word:

lu-zero commented 1 year ago

I like the 3 word approach and probably I'll open an issue on rmodbus about it later to see if they can support it in their API.

a-hennig commented 12 months ago

Update: while I still think that the application/logical datatype needs to be a list (int16, float32, date3, string32, etc.), I would separate this list/type from the encoding/swapping. Also, whereas the logical type MUST be noted on every property to make sense, the encoding/swapping tends to be the same for an entire device/Thing and would hopefully be defined once at Thing level at some point. (I know several protocol stack implementations where you even have to choose this once per driver.)

I.e., I'm fine with 3-word (or 1+1).

egekorkan commented 12 months ago

So a list of keywords as an initial proposal:

sebastiankb commented 12 months ago

Another alternative to the orders can be binary flags:

egekorkan commented 12 months ago

I quite like the proposal from @sebastiankb since in the meantime I have learned that endianness and byte swapping are the same concept. Since different communities can use one term or the other, it is better to use a more neutral/verbose word that is more explicit. I would maybe just say modv:byteOrderChange since we are ordering the bytes (thus the bits too) in the first place.

I also had a talk with @relu91 and he mentioned the mixed- or middle-endian concept. It seems that it is the same concept as word swap, but among Modbus users, word swap is the more commonly used term.

Some resources:

So in the end, I would separate the two concepts (as already proposed) and we do not need a value for middle endian.
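
To make the separation concrete, the sketch below treats byte swapping and word swapping as two independent operations on a 32-bit value. The helper names `swap_bytes` and `swap_words` are hypothetical and chosen only for this example; 16-bit words are assumed, as in Modbus registers.

```python
def swap_bytes(data: bytes) -> bytes:
    """Reverse the two bytes inside every 16-bit word."""
    return b"".join(data[i:i + 2][::-1] for i in range(0, len(data), 2))


def swap_words(data: bytes) -> bytes:
    """Reverse the order of the 16-bit words, keeping each word intact."""
    words = [data[i:i + 2] for i in range(0, len(data), 2)]
    return b"".join(reversed(words))


big_endian = bytes.fromhex("0a0b0c0d")
word_swapped = swap_words(big_endian)               # 0c0d0a0b (mixed/middle endian)
byte_swapped = swap_bytes(big_endian)               # 0b0a0d0c
little_endian = swap_bytes(swap_words(big_endian))  # 0d0c0b0a
```

Applying only the word swap gives the layout usually called mixed or middle endian, and applying both swaps gives plain little endian, so two independent switches already cover all four layouts without a dedicated middle-endian value.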

relu91 commented 12 months ago

Those are definitely good findings! Plus one for @sebastiankb's proposal.

sebastiankb commented 12 months ago

@egekorkan

I would maybe just say modv:byteOrderChange since we are ordering the bytes (thus the bits too) in the first place.

That would also be OK. By the way, this was also suggested in Proposal III above.

egekorkan commented 12 months ago

Call of 29.11:

We decided:

Additionally: