Closed: cajun-rat closed this issue 5 years ago
Hi,
And to be compatible with all IETF standards-track protocols (including TLS), all strings MUST be created by the Sender and normalized by the Receiver into UTF-8 encoding (RFC 3629), and MUST comply with Net Unicode (RFC 5198) using Unicode Normalization Form C (NFC), defined in:
http://www.unicode.org/reports/tr15/
That "C" stands for "composed" (i.e., if a single Unicode code point is defined for a combination of characters, such as a base letter plus an accent, then use that code point). Note that Apple macOS (for historical reasons) instead uses "NFD" (decomposed) for file names, which makes strings longer and (without careful reordering) can introduce ambiguities because of the ordering of the decomposed code points.
Note: UTF-16 (RFC 2781) MUST NOT be used in any Uptane implementation. A number of commercial applications still generate UTF-16, so a string conversion and normalization library is necessary on both the Sender and the Receiver.
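To illustrate why the choice of normalization form matters for comparison, here is a minimal Python sketch. The function names `to_net_unicode` and `identifiers_match` are hypothetical and do not come from the Uptane specification. The sketch covers only the NFC and UTF-8 parts of RFC 5198, not its other rules (e.g. line endings).

```python
import unicodedata


def to_net_unicode(text: str) -> bytes:
    """Return the NFC-normalized, UTF-8 encoded form of text (hypothetical helper)."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


def identifiers_match(a: str, b: str) -> bool:
    """Compare two strings byte-for-byte after NFC normalization (hypothetical helper)."""
    return to_net_unicode(a) == to_net_unicode(b)


composed = "caf\u00e9"     # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent (U+0301)

print(composed == decomposed)                   # False: the raw code points differ
print(identifiers_match(composed, decomposed))  # True: identical after NFC
```

Both strings render the same on screen, but only the normalized comparison treats them as equal.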
Cheers,
Ira McDonald (Musician / Software Architect) Co-Chair - TCG Trusted Mobility Solutions WG Co-Chair - TCG Metadata Access Protocol SG Chair - Linux Foundation Open Printing WG Secretary - IEEE-ISTO Printer Working Group Co-Chair - IEEE-ISTO PWG Internet Printing Protocol WG IETF Designated Expert - IPP & Printer MIB Blue Roof Music / High North Inc http://sites.google.com/site/blueroofmusic http://sites.google.com/site/highnorthinc mailto: blueroofmusic@gmail.com PO Box 221 Grand Marais, MI 49839 906-494-2434
On Thu, Feb 14, 2019 at 10:13 AM cajun-rat notifications@github.com wrote:
In a couple of places in the spec we talk about comparing strings. Since these are likely to be Unicode, there is not a single method to perform a comparison. We should be explicit about which Unicode canonicalization should be used, and which comparison algorithm is implied when we say that a pair of 'Hardware Identifiers match' or a delegation's wildcard path matches a target.
Hi,
In the TUF area, we have been discussing the creation of shared wireline formats to allow for interoperability between implementations (details here: https://github.com/theupdateframework/taps/blob/21a2ee49b395346789074cc8ad8b73b5f89e5b0f/tap11.md). I think a version of this could be useful in allowing Uptane users to specify the Unicode canonicalization (and comparison method) for their implementation. This might remove the need for string conversion on ECUs.
-Marina
Hi Marina,
Interesting - thanks.
The reason that I mentioned ECU-side string conversion is that the native RTOS APIs, as well as most application libraries, do NOT actually exchange NFC canonical UTF-8, so the ECU will have to do some string conversion before sending (and again on receiving, before passing strings to local APIs).
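A rough Python sketch of that two-way conversion follows. It assumes, purely for illustration, that the local API hands over little-endian UTF-16 bytes. The helper names `local_to_wire` and `wire_to_local` are hypothetical, and a real ECU would more likely do this in C with a suitable library.

```python
import unicodedata


def local_to_wire(raw_utf16le: bytes) -> bytes:
    """Convert a local UTF-16LE string (assumed input format) to NFC UTF-8."""
    text = raw_utf16le.decode("utf-16-le")
    return unicodedata.normalize("NFC", text).encode("utf-8")


def wire_to_local(payload: bytes) -> bytes:
    """Validate an incoming NFC UTF-8 string and convert it for the local API."""
    text = payload.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    if unicodedata.normalize("NFC", text) != text:
        raise ValueError("string is not in Normalization Form C")
    return text.encode("utf-16-le")


local = "cafe\u0301".encode("utf-16-le")  # decomposed, as a local API might return it
wire = local_to_wire(local)
print(wire)                               # b'caf\xc3\xa9'
print(wire_to_local(wire) == "caf\u00e9".encode("utf-16-le"))  # True
```

This sketch rejects non-NFC input on receipt instead of silently re-normalizing it. That is one possible policy, not something this thread settles.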
Cheers,
Today we resolved to put the requirements ('unique encoding', etc.) in the specification. @iramcdonald, would you kindly help?
When Mike + others from Airbiquity and @awwad / @mnm678 write their profiles, they will have this level of specificity.
I've opened up https://github.com/uptane/uptane-standard/pull/84 to address this, or at least start to.
On the 03/13 standards call, we noted that there's a PR open and awaiting review. Once it's reviewed/accepted/merged, we can close this.
In my view, mandating that all strings in the metadata conform to RFC 5198 takes care of the string comparison issue. #84 has been merged after an approving review, so I am going to close this. If anyone thinks it's not yet resolved, please leave a comment and I'll re-open.
In a couple of places in the spec we talk about comparing strings. Since these are likely to be Unicode, there is not a single method to perform a comparison. We should be explicit about which Unicode canonicalization should be used, and which comparison algorithm is implied when we say that a pair of 'Hardware Identifiers match' or a delegation's wildcard path matches a target.