endojs / endo

Endo is a distributed secure JavaScript sandbox, based on SES
Apache License 2.0
829 stars 72 forks source link

fix(marshal)!: compare strings by codepoint #2008

Open erights opened 9 months ago

erights commented 9 months ago

closes: #2113 refs: #2002

Description

Security Considerations

The fact that the string ordering is closer to the Unicode semantics of the strings probably minimizes some surprises in ways that help security. OTOH, this difference from JS native string ordering probably causes other surprises that hurt security. Altogether, we do not expect much effect.

Scaling Considerations

As a comparison written in JS, will be slower that the JS native string comparison. On XS at least, we expect to have a native code point comparison function available eventually. Altogether, we do not expect much effect.

Documentation Considerations

Most developers will not care. But it needs to be explained somewhere carefully so that developers that do care can easily find out.

Testing Considerations

@gibson042 , in a later PR, could you expand the property-based-testing to generate test cases sensitive to this change?

Compatibility Considerations

Upgrade Considerations

If we currently have any persistent data, especially on chain, sorted according to JS native order (UTF16 code unit), then we cannot accept this PR until we have a plan to resort that data, or somehow continue to live with mis-sorted. (Historical note: This is how Oracle came to permanently rely on UTF16 code unit order, because of the impracticality of resorting all that data.)

erights commented 9 months ago

Excellent.

For this change, I do not think we can avoid the breaking change marker. That might render my argument for leaving it out of pass-style, moot.

Let me be sure I understand:

You're saying that this PR should keep the "!". Given that, we may as well keep the "!" on #2002 as well. Right?

kriskowal commented 9 months ago

Yes

On Thu, Jan 25, 2024 at 5:47 PM Mark S. Miller @.***> wrote:

Excellent.

For this change, I do not think we can avoid the breaking change marker. That might render my argument for leaving it out of pass-style, moot.

Let me be sure I understand:

You're saying that this PR should keep the "!". Given that, we may as well keep the "!" on #2002 https://github.com/endojs/endo/pull/2002 as well. Right?

— Reply to this email directly, view it on GitHub https://github.com/endojs/endo/pull/2008#issuecomment-1911279409, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAOXBRXXSUOYVBOWVGDT4TYQMDLVAVCNFSM6AAAAABCLFMFD2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJRGI3TSNBQHE . You are receiving this because you commented.Message ID: @.***>

erights commented 9 months ago

Just noting here for curiosity. In the UTF16 portion of https://icu-project.org/docs/papers/utf16_code_point_order.html

This opens the door for a "fix-up" of code unit values that is faster than assembling 21-bit code point values.

OMG