Open ezdiy opened 7 years ago
@ansuz The problem here is that SSB in fact has a canonical form - just a ridiculous one: Keep fields as they come in, and attach 'signature' to the end.
This is called insert order, and it is very awkward when you convert the json to a different representation. Basically you need to keep everything in its original form, or serialize/deserialize to your format while keeping metadata about ordering so you can rematerialize the original exactly as it came in.
The proposed fix is to set the ordering of the top-level fields (author, timestamp, etc) in stone (namely, to whatever node is currently using), since those fields are treated specially by the network anyway.
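To make the insert-order problem concrete, here is a minimal sketch. The field values and the "sig-placeholder" string are made up for illustration; only the behavior of `JSON.stringify` is the point.

```javascript
// Two objects holding the same data, built in a different key order.
const a = { author: "@alice", sequence: 1, content: { type: "post", text: "hi" } };
const b = { sequence: 1, content: { text: "hi", type: "post" }, author: "@alice" };

// Serialization follows insertion order, so the bytes to hash/sign differ.
const sa = JSON.stringify(a, null, 2);
const sb = JSON.stringify(b, null, 2);
console.log(sa === sb); // false

// Attaching the signature last mirrors the "signature goes at the end" rule.
// "sig-placeholder" is not a real signature.
const signed = { ...a, signature: "sig-placeholder" };
console.log(Object.keys(signed)); // [ 'author', 'sequence', 'content', 'signature' ]
```

Any implementation whose map type does not remember insertion order would have to carry that order around separately to reproduce `sa` byte for byte.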
Oh btw, this document was the influence on the ssb signing format: https://camlistore.org/doc/json-signing/
Another approach would be to always keep a handle on the original serialized json text: parse it and use the parsed value, but never serialize it again. Instead, send the originally serialized json, inserting/extracting the signature with a regexp.
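A rough sketch of that regexp idea, under stated assumptions: the raw text uses 2-space indentation, "signature" is the last top-level field, and `splitSignature` is a hypothetical helper name, not something from the existing code base.

```javascript
// Hypothetical helper: split raw message text into the signed payload and the
// signature without ever re-serializing the parsed object.
// Assumes 2-space indentation and "signature" as the last top-level field.
function splitSignature(raw) {
  const m = /^([\s\S]*),\n  "signature": "([^"]*)"\n\}$/.exec(raw);
  if (m === null) return null; // not in the expected shape
  return { unsigned: m[1] + "\n}", signature: m[2] };
}

// Example input; "sig-placeholder" is not a real signature.
const raw = JSON.stringify(
  { author: "@alice", sequence: 1, content: { type: "post" }, signature: "sig-placeholder" },
  null,
  2
);
const parts = splitSignature(raw);
console.log(parts.signature); // sig-placeholder
```

The `unsigned` string is what would be handed to the verifier, and `raw` is what would be forwarded to peers, so key order never passes through a parser.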
Also, firefox and chrome both support that ordering - so it's a de facto standard
oh, btw, we could integrate this by canonicalizing messages when they are created, which would still produce valid signatures for legacy implementations, but be in canonical format.
@dominictarr The idea of the above script is to canonicalize both on creation and verification, but only the top fields. I don't have all the feeds, but the ones I've checked thus far do have this pseudo-canon order.
As for contents of 'content', the spec of it will always remain awkward. It basically says: the string supplied to 'content' field should be padded by 2 spaces on each newline. Because reasons. It can't be canonicalized per se, only tracked as a string (weakref via JSON parser hook).
Another fix would be simply documenting the current behavior: all dicts which appear in json are ordered (similar to a python ordered dict), and all objects are represented as one receives them on the wire - no special exceptions. It basically means people will have to re-implement json serializers to accommodate this.
Again, minor oddity compared to even more odd things ssb does.
I'd really like it if we can find a solution for this. It doesn't have to be a golden catch-all thing but at least something that works for the foreseeable future until we know how to upgrade out of this mess while keeping our sanity.
@ezdiy's proposal to commit to what we have been doing in the past sounds very reasonable to me and is also something that I feel much better maintaining than whatever v8 feels like today.
The simplest thing we could do: don't touch the sbot node code at all, but make the current top-field order part of the verification rule. All ssb feeds will pass the check now, but that may stop being true in the future if we don't adopt the rule.
One unresolved thing I have no idea how to handle (the above script simply omits those): what to do about protocol-unused fields, that is, any other field not in the ['previous','author','sequence','timestamp','hash','content','signature'] set? What is the ordering of those, and should they even be allowed?
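For illustration, the verification rule could look roughly like this. The field list is the one quoted above; `hasCanonicalTopOrder` is a hypothetical name, and rejecting unknown fields is just one possible answer to the open question, chosen here only to keep the sketch small.

```javascript
// Field order taken from the list quoted in this thread.
const TOP_FIELDS = ["previous", "author", "sequence", "timestamp", "hash", "content", "signature"];

// Hypothetical check: true when every top-level key is a known field and the
// keys appear in the same relative order as in TOP_FIELDS.
function hasCanonicalTopOrder(msg) {
  let last = -1;
  for (const key of Object.keys(msg)) {
    const idx = TOP_FIELDS.indexOf(key);
    if (idx === -1) return false; // assumption: unknown fields are rejected
    if (idx <= last) return false; // out of order
    last = idx;
  }
  return true;
}

console.log(hasCanonicalTopOrder({ previous: null, author: "@alice", sequence: 1 })); // true
console.log(hasCanonicalTopOrder({ author: "@alice", previous: null, sequence: 1 })); // false
```

This only looks at the top level; the keys inside 'content' are deliberately left alone.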
And found another one:
    if (_last === time) {
      do {
        adjusted = time + ((_count++) / (_count + 999))
      } while (adjusted === _adjusted)
      _adjusted = adjusted
    }
It produces numbers with no short exact decimal representation, and then compares them using ===, which is unreliable for floats: depending on rounding you'll get a slightly smaller or bigger value pretty much at random, but almost never the same one.
Worse still, JSON has no defined rounding rules for fine fractions. Apparently it's anything between 8-12 decimal places, but quite likely it won't survive re-serialization anyway.
Most protocols with monotonic timestamping simply use integer timestamps, and special case same timestamps if they have a sequence number - that one decides lexicographical order.
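A minimal sketch of that integer-timestamp scheme, assuming each message carries an integer `timestamp` in milliseconds and an integer `sequence` (`compareMessages` is a hypothetical name):

```javascript
// Hypothetical comparator: order by integer timestamp, and let the sequence
// number break ties, so no fractional timestamps are ever needed.
function compareMessages(a, b) {
  if (a.timestamp !== b.timestamp) return a.timestamp - b.timestamp;
  return a.sequence - b.sequence;
}

const msgs = [
  { timestamp: 1500000000000, sequence: 3 },
  { timestamp: 1500000000000, sequence: 2 },
  { timestamp: 1499999999999, sequence: 1 },
];
msgs.sort(compareMessages);
console.log(msgs.map((m) => m.sequence)); // [ 1, 2, 3 ]
```

Both values are integers well inside the safe integer range, so they survive any JSON round trip unchanged.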
Interestingly, @Vendan, in spite of all his Go elitism, introduced a similar bug in his Go port. Though with less impact - 2 decimal places are quite likely to fit for at least 100 years.
@ezdiy hmm, not sure what you mean, can you show me an input with this problem?
we can easily relax the input to be >= not just > but we'd have to wait a while till everyone is running the updated code.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I think this is still an issue and @AljoschaMeyer might have a thing or two to add. At least as a forecast, maybe.
It's somewhat resolved. The spec defines a canonical format. The js implementation implements that format, but does so by relying on some unspecified behavior.
As far as I am aware, the only reliance on unspecified behavior is the assumption that non-numeric keys of the content object preserve their order across JSON.stringify(JSON.parse(foo)) transformations.
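For anyone wondering why the qualifier "non-numeric" matters, here is a small demonstration. The input string is made up; the behavior shown is what current JavaScript engines such as V8 do with integer-like keys.

```javascript
// Integer-like keys are moved to the front in ascending order; the remaining
// string keys keep the order in which they appeared in the input.
const wire = '{"b":1,"2":1,"a":1,"1":1}';
const roundTripped = JSON.stringify(JSON.parse(wire));
console.log(roundTripped); // {"1":1,"2":1,"b":1,"a":1}
console.log(roundTripped === wire); // false
```

So a content object with integer-like keys does not round-trip byte for byte, while one with only ordinary string keys does.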
@AljoschaMeyer btw, did you see https://github.com/ssbc/ssb-validate/pull/14
@dominictarr I replied to that one on ssb: %RFH6ayPX5I5+cMS3MXVbhZ16ykXBtayc9YGba0tc/xk=.sha256
There's no canonical form to hash/sign; it depends on v8 object insert-order, and Go people complained (rightfully so) about the difficulty of re-implementing this quirk. This can be fixed:
Should v8 change object key ordering in the future, scuttlebot will cease to function completely until it starts keeping original string forms for 'content' (can be done via weakref, for example).