usdot-jpo-ode / wzdx

The Work Zone Data Exchange (WZDx) Specification aims to make harmonized work zone data provided by infrastructure owners and operators (IOOs) available for third party use, making travel on public roads safer and more efficient through ubiquitous access to data on work zone activity.
Creative Commons Zero v1.0 Universal

message_multi_string type for DynamicMessageSign of SwzDeviceFeed incompatible with NTCIP #218

Closed dxpack closed 2 years ago

dxpack commented 2 years ago

Issue name: “message_multi_string type for DynamicMessageSign of SwzDeviceFeed incompatible with NTCIP”

Summary

The SwzDeviceFeed contains a DynamicMessageSign with a message_multi_string property with a String type, and references NTCIP 1203v3. The MULTIString property in NTCIP is not a String type but an Octet String, which can contain binary data.

Motivation

The DynamicMessageSign object states the String value can be an empty string if the message is unknown due to an error. A binary-inclusive MULTIString, such as one used for graphic messages, is not an error according to NTCIP.

Proposed changes

Either:

@j-d-b: edited to remove PR references and add links to released spec as proposed changes are now released.

Dunge commented 2 years ago

It is only supposed to store text messages. I believe keeping it as string here is correct.

Graphics are stored under the dmsGraphicBlockBitmap object and referenced in the dmsMessageMultiString by using the [Gx] multi tag, it is not valid to store them directly in the dmsMessageMultiString object.

Reference From NTCIP 1203:

5.6.8.3 Message MULTI String Parameter
dmsMessageMultiString OBJECT-TYPE
SYNTAX OCTET STRING
ACCESS read-write
STATUS mandatory
DESCRIPTION
"<Definition> Contains the message written in MULTI-language as defined in
Section 6 and as subranged by the restrictions defined by
dmsMaxMultiStringLength and dmsSupportedMultiTags. When the primary index is
'schedule', 'blank', 'currentBuffer' or 'permanent', this object shall return
a genErr to any SET-request. When the primary index is 'schedule', the object
shall return the MULTI string of the currently scheduled message in response
to a GET-request (regardless whether this message is actually being
displayed). The value of the MULTI string is not allowed to have any null
character.
<Object Identifier> 1.3.6.1.4.1.1206.4.2.3.5.8.1.3"
::= { dmsMessageEntry 3 }

So why is it defined as the "OCTET STRING" type instead of DisplayString like other strings? I believe this was originally intended to support more encoding types (moving from ASCII-only support to UTF-16, before UTF-8 became a thing).

Further quote under the section 6 of 1203:

An object that makes use of MULTI has a syntax of octet string. The octet string shall conform to the MULTI language.

and

The Markup Language for Transportation Information (MULTI) is similar to HTML where text is transmitted, and tags define how the text appears (is displayed). Tags are enclosed within delimiters, contain an ID (one or more characters), and any optional parameters necessary for the tag. MULTI currently uses 8-bit characters, but there is consideration and planning to allow the selection of either 8-bit or 16-bit characters. The null character (0x00) is not allowed within MULTI strings. All of the MULTI tags are defined in ASCII, 8-bit characters with the most significant bit set to 0.
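The constraints quoted above (8-bit characters, no null character) can be checked mechanically before the octets are placed in a JSON string. The following is a minimal sketch, not anything defined by NTCIP or WZDx: the function name `multi_octets_to_text` is hypothetical, and decoding with Latin-1 is an assumption chosen only because it maps every 8-bit value to exactly one character.

```python
def multi_octets_to_text(octets: bytes) -> str:
    """Convert raw dmsMessageMultiString octets to a str (hypothetical helper)."""
    # The quoted definition forbids the null character anywhere in the value.
    if b"\x00" in octets:
        raise ValueError("MULTI strings must not contain the null character (0x00)")
    # Assumption: one octet maps to one code point (Latin-1), so 8-bit
    # characters survive the conversion unchanged.
    return octets.decode("latin-1")


print(multi_octets_to_text(b"ROAD WORK AHEAD"))
```

A value that passes this check has no null character and can be carried in a JSON string, which is the point under discussion in this thread.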

dxpack commented 2 years ago

It is valid to store manufacturer-specific MULTIString tags, which can contain binary data. Ultimately, dmsMessageMultiString type in NTCIP is Octet String (literally a sequence of bytes), which is not limited to data compatible with a String type.

Dunge commented 2 years ago

Technically, the manufacturer tag value description also mentions "string", any byte value of 0 would be invalid, and the document says in a few places that this node is ASCII only.

But if you used manufacturer tag as a workaround to store additional binary data, I absolutely understand and do not want to create needless blockage, we need this standard to be more open to the masses than restrictive.

The only disadvantage I see in changing the type from string to byte array in the JSON schema is that the messages would no longer be human-readable when looking at the JSON file. It would list an array of integers instead of the text.
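To make the readability trade-off concrete, here is a small sketch that serializes the same message both ways. The byte-array form is hypothetical (the released schema uses a string), and the sample message text is made up for illustration.

```python
import json

# Made-up sample message, for illustration only.
message = "RIGHT LANE CLOSED"

# Current representation: a JSON string, readable as-is.
as_string = json.dumps({"message_multi_string": message})

# Hypothetical alternative: the same octets as a JSON array of integers.
as_bytes = json.dumps({"message_multi_string": list(message.encode("ascii"))})

print(as_string)
print(as_bytes)
```

The second line of output is a list of numbers that a reader would have to decode by hand to recover the text.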

As a separate note, this makes me realize there might be important information missing from the schema. When messages contain graphics as NTCIP intends (in a separate object referenced by the [Gx] tag in the MULTI string, as opposed to being embedded in a custom manufacturer tag), maybe the DynamicMessageSign object could also include a list of the included graphics (a byte array for each graphic's content and an integer for its number).
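As a rough illustration of how a feed producer could find which graphics a message references, the sketch below collects graphic numbers from a MULTI string. It assumes graphic tags take the form [gN], optionally followed by comma-separated parameters; the function name is hypothetical, and the sketch does not handle escaped brackets.

```python
import re

# Assumption: a graphic tag is "[g" + decimal number + optional ",params" + "]".
GRAPHIC_TAG = re.compile(r"\[g(\d+)(?:,[^\]]*)?\]", re.IGNORECASE)


def referenced_graphics(multi: str) -> list[int]:
    """Return the sorted, de-duplicated graphic numbers referenced in `multi`."""
    return sorted({int(number) for number in GRAPHIC_TAG.findall(multi)})


print(referenced_graphics("[g3]MERGE LEFT[g1,10,5]"))
```

The resulting numbers could then be used to look up the matching graphic objects and attach them alongside the message.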

dxpack commented 2 years ago

I suspect that the general use of "string" parlance in the NTCIP spec is related to the time period and type of engineers who were involved in drafting the spec - where multi-byte binary data is referred to as a string of bytes (string of octets). That is common for C developers.

In my experience most MULTIString values are ASCII. In the cases where I have seen binary MULTIStrings, they have generally been within the manufacturer-specific tag, which by virtue would be incompatible with the goal of creating an industry-wide specification. So I would recommend the 1st proposed solution - update documentation to allow binary-inclusive MULTIStrings to be represented as an empty JSON String.

In regards to your separate note, there are also Field Code tags. These are effectively variable names where the underlying value can change and on the display would fill the allotted space of the Field Code in the MULTIString. Some are predefined by NTCIP but manufacturers can also create their own. In either case, the underlying value is not present in the MULTIString, so without some other provision to supply those values, the MULTIString in WZDx would not have the complete data as displayed on the DMS. Any effective solution within WZDx to handle the NTCIP built-in Field Codes would probably prove challenging to handle the manufacturer-specific Field Codes.

dxpack commented 2 years ago

Further details on the Field Codes - I have worked on an AWIZ project where the goal was to display a dynamic message that included a highway exit name that was currently closed. The specific exit was going to periodically change. As there is no NTCIP Field Code for a highway exit, one version of the AWIZ implementation engineering spec called for the creation of a manufacturer-specific Field Code (though there are alternative methods to Field Codes to accomplish the same goal, depending on other factors of the AWIZ functional spec, such as where and how the variable needed to be updated, at the sign or at the server).

Dunge commented 2 years ago

I also vote for the 1st proposed solution, as I believe keeping the message as a human-readable string in the document is important. If the object is special and contains binary data, then yes, the documentation should mention that this is possible and valid. I fear an empty string would suggest the sign is blanked when it's not, but I don't see how else to handle it other than adding additional fields.

As for Field Codes, you are right. Their usage is not as widespread as Graphics, but there are so many other things that determine the content of the displayed message and are not part of the dmsMessageMultiString object. Opening this box leads to too much information. There are tags for dynamic fields (temperature, time of day), but there are also the font definitions, the default values (alignment, line spacing, character spacing), foreground/background colors, etc. I don't believe we need that kind of specific information for this standard.

dxpack commented 2 years ago

I agree, an empty string is misleading (even for just an error condition on the MULTIString).

I think there could be a reasonable method of including Field Codes, even manufacturer-specific, in WZDx, perhaps using an Object where the Object key names are the Field Code identifiers that exist in the current MULTIString and the Object values are the current DMS values for those Field Codes. Field Codes are very useful for automated systems. DMS-internal sensor, or minimally intelligent external sensor driven messages are possible without the need for the sophistication of an external system to use NTCIP to create and activate a new message with a hard coded value.
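The Object idea above could look roughly like the following sketch. Everything here is hypothetical: the helper name, the assumed tag form ([fN] with an optional numeric width parameter), and the sample values; none of it is defined by WZDx.

```python
import json
import re

# Assumption: a field tag is "[f" + decimal number + optional ",width" + "]".
FIELD_TAG = re.compile(r"\[(f\d+)(?:,\d+)?\]", re.IGNORECASE)


def field_code_values(multi, current_values):
    """Map each Field Code found in `multi` to its current value (None if unknown)."""
    codes = [code.lower() for code in FIELD_TAG.findall(multi)]
    return {code: current_values.get(code) for code in codes}


# Made-up message and sign readings, for illustration only.
multi = "TEMP [f3,3] AT [f2]"
values = field_code_values(multi, {"f3": "21", "f2": "14:05"})
print(json.dumps({"message_multi_string": multi, "field_code_values": values}))
```

A consumer could substitute each value back into the MULTI string to approximate what the sign is showing, while codes with no known value remain visible as null.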

Fonts can also determine the fundamental meaning of a displayed message by means of ASCII graphics - simple arrows/chevrons/etc. An ASCII message similar to "MERGE <----" might use Hex Codes for the arrow, but without access to the Font Characters, those Hex Codes would be meaningless outside of viewing the displayed message. But yes, font dynamics would be a massively detailed addition to WZDx.

dxpack commented 2 years ago

Added empty string issue: IS #220

jlstanley-git commented 2 years ago

There's an incorrect assumption in the OP in this thread. While an octet string can contain binary data, a dmsMessageMultiString cannot. In the 1203v03 description of DMS support for NTCIP, section 5.6.8.3 (the definition of dmsMessageMultiString) specifically states that "The value of the MULTI string is not allowed to have any null character." So that rules out including raw binary data in the string.

(As a best-practices note, while the definition specifies that dmsMessageMultiString is a limited form of octet string, just to keep things readable, it's best to use the hex-char [hcX] tag any time you need character codes outside the range of 0x20...0x7F.)
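A minimal sketch of that best practice, assuming the hex-char tag is written as [hc followed by the character code in hexadecimal and a closing bracket; the function name is hypothetical, and the kept range follows the 0x20...0x7F range given above.

```python
def escape_with_hex_char_tags(text: str) -> str:
    """Replace characters outside 0x20..0x7F with [hcX] tags (hypothetical helper)."""
    pieces = []
    for char in text:
        code = ord(char)
        if code == 0:
            # The null character is never allowed in a MULTI string.
            raise ValueError("null character is not allowed in a MULTI string")
        if 0x20 <= code <= 0x7F:
            pieces.append(char)  # kept as-is, per the range stated above
        else:
            pieces.append(f"[hc{code:X}]")  # assumed tag syntax: hex digits
    return "".join(pieces)


print(escape_with_hex_char_tags("MERGE \x1a\x1a"))
```

The result contains only characters in the stated range, so it stays readable in a JSON string.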

"It is valid to store manufacturer-specific MULTIString tags, which can contain binary data."

No. Manufacturer-specific tags can specify almost any format they want, BUT that format MUST conform to the other rules for dmsMessageMultiString. So no null bytes, and no single ']' characters that don't end the tag. So they also can't include raw binary.

So, long story short, leaving message_multi_string as a string is perfectly fine.

dxpack commented 2 years ago

That's right. Thank you for the correction. I'll close this issue.