Closed erights closed 8 months ago
Under compatibility considerations, this will cause any existing program that is transmitting unpaired surrogates to break. I do not think we need to treat this as a breaking change, but we should note in NEWS.md
that we are hoping that nobody is exercising the broken behavior, so that when someone upgrades and if their application breaks, they have a hint about why.
I think this is a good change but it is marked as draft. Is that intentional? This is in my review inbox. Is this ready for review?
I think this is a good change but it is marked as draft. Is that intentional? This is in my review inbox. Is this ready for review?
It lacks only tests, filling out the PR comment and checklist, and NEWS.md
text.
I do not think we need to treat this as a breaking change
Should I remove the "!"?
It also lacks adequate comments in the code
"!" removed
Under compatibility considerations, this will cause any existing program that is transmitting unpaired surrogates to break. I do not think we need to treat this as a breaking change,
I removed the "!". Done.
but we should note in
NEWS.md
that we are hoping that nobody is exercising the broken behavior, so that when someone upgrades and if their application breaks, they have a hint about why.
Done.
Given that I removed the "!", just confirming that I should not Includes *BREAKING*:
in the commit message, letting NEWS.md do the work. Yes?
Is this ready for review?
yes
It lacks only tests,
Done
filling out the PR comment and checklist,
Done, with the first box unchecked.
and NEWS.md text.
Done
It also lacks adequate comments in the code
Done
With this PR, the check will often be O(n) in the length of the string
How often? What's the wall-clock impact on the Agoric blockchain? I hope we don't land this until we have pretty clear data on that.
I do not think we need to treat this as a breaking change ...
I don't understand why not. It's clearly a change that's observable from clients.
I don't understand why not.
@kriskowal , I leave that to you. I'm willing to go back to "!" if you think I should.
How often? What's the wall-clock impact on the Agoric blockchain? I hope we don't land this until we have pretty clear data on that.
While I agree, can someone else take on actually measuring this?
@gibson042 , thanks for measuring that!
I do not think we need to treat this as a breaking change ...
I don't understand why not. It's clearly a change that's observable from clients.
This is a tree falls in the forest situation. Unpaired surrogates are only likely to show up if you’re doing something silly like abusing strings as Uint16Arrays (I admit having seen this before Uint16Array). So, I agree this is observable and I’ve asked around for whether it would actually be observed. If we can guarantee not, it would be nice because bumping the major version increases chances of contracts retaining multiple versions of this package and obliges us to back-port security patches.
I am also fine either way. Short of asking @mhofman to replay the chain with this change, I don’t know a way to definitely prove that this would not break the chain. And, I also don’t know whether that’s germane.
At https://github.com/endojs/endo/pull/2008#pullrequestreview-1844906628 we decided to add the "!" back in to this PR.
Done.
Now I need to add an appropriate *BREAKING*:
to the commit message.
I'm interested to learn whether the breaking change judgement is more of a math problem (does there exist a test by which I could observe it?) or a cost-benefit / policy sort of thing.
This is a tree falls in the forest situation. ... I’ve asked around for whether it would actually be observed. If we can guarantee not ...
A guarantee would involve a closed-world assumption, yes? i.e. an assumption that we can somehow find all the clients. Is our working model that we can, in fact do that?
Thought experiment: have clients send a sort of "warrantee registration card" - if you let us know that you depend on our stuff, we'll make every effort to consider your stuff when judging breaking changes etc.
likely to be replaced by native code in the relatively near future.
I'd like to see 2 issues:
s.isWellFormed()
in xsnap (i.e. upgrading XS version, adding a test, ...)Maybe those (or issues that subsume them) exist already?
@kriskowal @gibson042 , I hid this new feature behind a feature flag, defaulting to disabled, so we could include it in the upcoming endo release, to start experimenting with in from agoric-sdk. Even though you already approved, it is a big enough change that I'd appreciate it if you could PTAL before I merge. Thanks!
closes: #1739 refs: https://github.com/ocapn/ocapn/issues/47 https://github.com/tc39/proposal-is-usv-string
Description
At an OCapN meeting, we resolved https://github.com/ocapn/ocapn/issues/47 by deciding that only well-formed Unicode strings may be passed. A well-formed Unicode string does not contain any unpaired surrogates. It consists only of a sequence of Unicode code points. A conforming OCapN implementation MUST only emit well-formed Unicode strings, and MUST validate that incoming strings are well-formed, and reject those that are not.
Within the Agoric implementation, the way to implement these restrictions is by having
passStyleOf(str)
only judge well-formed string to be passable, rejecting all other strings as not passable.If the underlying engine does not yet implement
isWellFormed
, we fall back to @gibson042 's shim implementation at https://github.com/ocapn/ocapn/issues/47#issuecomment-1908685478 .Update: Because we do not yet know the performance impact, this PR hides this new feature behind a feature flag
that defaults to disabled. Unless enabled, there should be no observable difference for now, i.e., any JavaScript string would still be considered Passable. See this PR's NEWS.md update for more.
Security Considerations
If enabled, this change improves integrity because it reduces the attack surface. By rejecting incoming strings that are not well-formed, a counterparty cannot use such a string to push internal algorithms into cases they may not have tested.
Scaling Considerations
Before this PR or with this feature disabled (currently the default),
passStyleOf
would judge a string to be Passable based simply ontypeof
being'string'
, which is O(1). With this feature enabled, the check will often be O(n) in the length of the string, depending on whether and how the underlying engine implementsisWellFormed
. If the underlying engine does not yet implement it, our shim implementation takes O(n). In theory, a builtin implementation might remember that a string is well-formed, enabling an O(1) test, at least after the first time. However, we are not aware of any such engine optimizations.When the argument to
passStyleOf
is an object that it judges Passable,passStyleOf
memoizes its judgement so it need only make the expensive check once. However, because of the impossibility of having a user-level weak data structure weakly indexed by strings, it is impossible for user-level code to do such memoization for strings. We should measure to see if this is a problem in practice.Documentation Considerations
https://github.com/Agoric/agoric-sdk/blob/master/docs/env.md should be updated to explain the feature flag and the resulting restrictions on passable strings.
Testing Considerations
The first time this PR ran through CI with code enforcing the restriction, but not yet any code to test the restriction, it is unfortunate that CI came up green. This means that before this PR, when we were accepting non-well-formed strings by design, nothing tested that case, at least in a way that caused a test to break because of the added enforcement.
Compatibility Considerations
Enabling this feature is in theory a compatibility break, in that previously non-well-formed strings were supposed to work. However, aside from possible tests specifically about non-well-formed strings, we do not expect any actual code to break. See @kriskowal 's note below at https://github.com/endojs/endo/pull/2002#issuecomment-1908857790
With the feature disabled, which is currently the default, there should be no compat issue at all.
Upgrade Considerations
With the feature defaulting to disabled, there is not yet an breaking change or upgrade issue.
*BREAKING*:
in the commit message with migration instructions for any breaking change.~NEWS.md
for user-facing changes.