ara4n opened 7 years ago
mostly that it means that you can include whitespace in your JSON responses for readability.
However, it then means that receivers would need to handle the PDU as both string-representation and JSON.
No? You could still do it the same way as we do today.
The problem here is that hashes and signatures of messages don't actually cover the messages themselves; they only cover a transform of the messages (canonicalization). This is a form of malleability. The adversary is sending messages which they did not hash and sign. The expectation is that the verifier will "fix" the adversary's message by hashing it according to the transform. The verifier must then only refer to their corrected message as a valid verified message. This has several practical side-effects:
It increases the attack surface before message verification by requiring a potentially expensive transform, rewriting the message on behalf of the attacker. The honest verifier should have to do as little work as possible on an unverified message, reducing the cost of DoS attempts and other potential exploits in the call tree before verification.
It turns the reception of messages off the socket from a simple zero-copy read into a linear buffer, involving read-only parsing of the JSON into a tree of `const char *` pointers (again, a write-free, allocate-free, copy-free view of the data), into a whole complication of either rewriting the message or selectively scanning it with the hashing and signing contexts to skip regions and characters which aren't canonical (if that's even possible without rewriting the message).
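To make the transform in question concrete, here is a minimal sketch (assuming Python's stdlib `json` module; this approximates, rather than reproduces, the spec's canonical form):

```python
import json

def canonicalize(obj) -> bytes:
    # Rough approximation of Matrix's canonical JSON: keys sorted,
    # no insignificant whitespace, UTF-8 output.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

# Two byte-wise different wire forms of the same object...
wire_a = b'{"b": 1, "a": 2}'
wire_b = b'{"a":2,"b":1}'
# ...reduce to identical bytes, and it is those bytes -- not the
# bytes actually received -- that get hashed and signed.
assert canonicalize(json.loads(wire_a)) == canonicalize(json.loads(wire_b))
```

The verifier's objection above is precisely that this rewrite must happen on every unverified message before the signature can be checked.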
For client-server communication, there are reasonable arguments for being liberal in what the server accepts for compatibility, for friendliness to developers of all levels, and there also aren't the same cryptographic necessities. For server-server (federation) communication, I see no reason to ever send or accept a message which isn't unambiguously canonical and instantly verifiable.
There are a smaller number of server implementations than client implementations. It is easier and more realistic to hold tighter expectations without seeing divergence.
Server implementations have a higher expectation of expertise than client implementations: it is reasonable to set a somewhat higher bar for correctness than with a client. Readability is not as important; the payloads are still human-readable JSON, and server authors are more used to this jumble than client authors...
The federation protocol should err on the side of optimization here rather than compatibility. A homeserver with many users might spend resources on compatibility with some of its clients some of the time during their occasional interactions; two homeservers, each with many users, conveying many messages back and forth to each other shouldn't have to do so over and over for all messages all the time.
@erikjohnston this may be relevant to your interests
This is kind of all true, but it's hard to get worked up over either the cpu overhead of parsing and re-encoding the json (generally I'd expect the cryptographic verification to dominate, though I haven't checked), or the attack surface thereof (if your json parser is insecure, then you're comprehensively stuffed anyway).
In any case, the verification algorithm requires more than reparsing: you also have to strip the existing signatures and redact the event. Which takes zero-copy implementations off the table (or at least requires some very custom parsing).
All that being said: I do see the point. There are good arguments for requiring that the federation APIs use canonical json.
This is kind of all true, but it's hard to get worked up over either the cpu overhead of parsing and re-encoding the json
But these things do get me worked up because ultimately a matrix server is a message passing device and we want to pass as many messages as possible, like any other network router.
I'd expect the cryptographic verification to dominate
Verification is dominated by hashing, and I/O is slowed when you have to stop hashing because you hit some non-canonical sequence like an escaped `/` in a string, which you have to skip before starting to hash again... Something like that throws a wrench into the tight loop of hardware-accelerated streaming instructions for SHA, which expect aligned 64-byte blocks.
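The byte-level discrepancy being described can be illustrated with a small stdlib-only sketch: two RFC-valid serializations of the same string hash to different digests.

```python
import hashlib
import json

# The same logical string serialized two ways: RFC JSON permits both
# "a/b" and "a\/b"; canonical form mandates the unescaped variant.
canonical = b'"a/b"'
escaped = b'"a\\/b"'   # the bytes on the wire are "a\/b"

assert json.loads(canonical) == json.loads(escaped)   # same value...
assert hashlib.sha256(canonical).digest() != \
       hashlib.sha256(escaped).digest()               # ...different hash
```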
It seems what I really need to know is: how far does a server have to go to correct a non-canonical message before attempting to verify the correction? Is this limited to the `/` discrepancy, or are there more discrepancies?
As @richvdh says,
In any case, the verification algorithm requires more than reparsing: you also have to strip the existing signatures and redact the event. Which takes zero-copy implementations off the table (or at least requires some very custom parsing).
The full story is at https://matrix.org/docs/spec/server_server/unstable.html#signing-events
The full story is at https://matrix.org/docs/spec/server_server/unstable.html#signing-events
That doesn't answer the question. How far does a server have to go to correct non-canonical JSON? If you send me a message and it's missing a trailing `}` character, do I correct that for you and then verify the correction?
In any case, the verification algorithm requires more than reparsing: you also have to strip the existing signatures and redact the event. Which takes zero-copy implementations off the table (or at least requires some very custom parsing).
This is not true. I can still do this operation without any writes. The signatures object "removal" isn't a "removal" because this isn't python. I just skip that pointer in my object vector. But this discrepancy now involves deep inspection of string values deep within event content of potentially multi-kilobyte payloads. The JSON parser is made efficient by the fact that it does as little as possible over a payload because a matrix server, again, is like a router and the primary event fields are like the outer headers of a packet.
You guys are obviously using some JSON library or set of JSON libraries and taking some of their internal behaviors for granted to the point where it's becoming your own specification. But that's not your specification and that's not cryptographically sound. Your spec says nothing about signing different messages than what's being transmitted and if it did then your spec would suck. I have never worked with a protocol that requires anything like this. This has to be fixed. If I hadn't known to ask about it in #matrix-dev I would've written off your own server as insecure and you'd have a fork in your federation already.
Furthermore, the fix for this doesn't look very difficult. Turn off the escaped `/` option in synapse's JSON library. All of your data is already signed canonically and doesn't have to be re-signed. The sooner this gets pushed the better.
You know what, I'm not done here yet. There are apparently many variations of JSON, and when I first read your spec, which detailed a grammar for a canonical JSON, I figured that was your way of coming down unambiguously on one flavor.
If your approach is to treat all JSON with Postel's law, then I don't even see a reason for canonical JSON in the first place, because it really doesn't matter what is signed by another party -- all that matters is whether the party signed it, and whether the verifying party accepted it. If the verifying party adheres to RFC JSON, then they will both verify and accept messages that would have been accepted but unverified under canonical JSON. If you think about this for a moment, there really isn't a reason for canonical JSON at all. The remote party still has to accept the whole phonebook of variations on JSON either way. Again, all that we care about is whether the signing party actually sent the message, and we can determine that by simply verifying its bytes.
In reality the choice here is to either dump canonical JSON or enforce it as your specification of JSON.
That doesn't answer the question. How far does a server have to go to correct non-canonical JSON? If you send me a message and it's missing a trailing } character, do I correct that for you and then verify the correction?
Our dialect of canonical-form JSON is very clearly spelt out, complete with BNF, at https://matrix.org/docs/spec/appendices.html#canonical-json. The point is to express JSON, which is a very ambiguous format, in a single canonical way that can be consistently signed. It has absolutely nothing to do with fixing syntax errors like missing trailing } characters: if the federation API receives data which is invalid JSON then it should of course reject it.
In any case, the verification algorithm requires more than reparsing: you also have to strip the existing signatures and redact the event. Which takes zero-copy implementations off the table (or at least requires some very custom parsing).
This is not true. I can still do this operation without any writes.
Only because Synapse's JSON library happens to output something that is coincidentally quite close to the canonical JSON representation. If it happened to reorder the keys, change the whitespace, alter numeric field representations etc then you'd be completely out of luck.
You guys are obviously using some JSON library or set of JSON libraries and taking some of their internal behaviors for granted to the point where it's becoming your own specification.
No, we're not; it's the opposite. You're jumping to incorrect conclusions.
Your spec says nothing about signing different messages than what's being transmitted and if it did then your spec would suck.
The spec says "Each PDU is itself a JSON object containing a number of keys, the exact details of which will vary depending on the type of PDU" - i.e. plain old JSON, rather than mandating canonical form on the wire. As per the original bug, there's certainly an argument for transmitting it in canonical form, but I wholeheartedly agree with @richvdh that this is not remotely a big deal, and whilst we could probably land it in a v2 of the API as a convenience to server implementers and to let the receiver ditch badly signed data as rapidly as possible, there are many many many more significant defects or missing features that we should be focusing on at this point in the project.
This has to be fixed. If I hadn't known to ask about it in #matrix-dev I would've written off your own server as insecure and you'd have a fork in your federation already.
Please feel free to propose stuff like this for a v2 API for federation; we're currently looking at doing a version bump anyway to address some issues in the state resolution consensus algorithm, and it'd be good to get them in. Another obvious improvement would be to store the signatures & redactable-bits of events in a separate chunk of JSON so one doesn't have to skip fields when signing the event.
Of course, if you want to write a homeserver that only accepts canonical-form JSON over federation, then you are welcome to do so. It might get a bit lonely out there though.
Furthermore, the fix for this doesn't look very difficult. Turn off the escaped `/` option in synapse's JSON library. All of your data is already signed canonically and doesn't have to be re-signed. The sooner this gets pushed the better.
Yup, we could try to cheat and retrospectively change the v1 API in a backwards incompatible manner, exploiting the fact that synapse is the only actively federating server implementation currently and that disabling `/` escaping looks superficially like canonical JSON. I'd be concerned that even then the library isn't actually necessarily outputting canonical JSON, however (e.g. for representing floating point numbers in the right format, or whatever).
This is then a problem in that: a homeserver implementation which skips the redacted fields by cherrypicking the string buffer will pass the signature (despite the JSON not being in canonical form). But a homeserver implementation (like today's Synapse) which skips the redacted fields by parsing the JSON, deleting the fields, encoding it as canonical JSON and calculating a signature, will fail the signature. This is obviously a problem.
I don't even see a reason for canonical JSON in the first place
Okay, I see where you're coming from: ideally the sender could use whatever non-canonical form they liked, sign it, and the receiver would just look at the string and check the signature and move on.
However, this doesn't currently work because we have to skip the fields which aren't covered by the signature - and have to define the operation of skipping them in a manner that both sender & receiver can consistently perform. This is what we're using canonical JSON for. The problem is complicated further in that the fields covered by the signature vary from event type to event type.
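A sketch of that flow, with `canonical_json` approximating the spec's form. The `signatures`/`unsigned` stripping is from the spec; the redaction step is reduced to a placeholder here (real redaction keeps an event-type-dependent subset of keys):

```python
import json

def canonical_json(obj) -> bytes:
    # Approximation of the spec's canonical form.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def signable_bytes(event: dict) -> bytes:
    # Strip the fields not covered by the signature, then canonicalize
    # what remains. A real implementation also redacts the event down
    # to the type-dependent protected keys; that step is omitted here.
    stripped = {k: v for k, v in event.items()
                if k not in ("signatures", "unsigned")}
    return canonical_json(stripped)

event = {"type": "m.room.message",
         "content": {"body": "hi"},
         "unsigned": {"age": 100},
         "signatures": {"example.org": {"ed25519:1": "..."}}}

# Both sender and receiver must arrive at these exact bytes,
# regardless of the key order or whitespace used on the wire.
signed_over = signable_bytes(event)
```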
So yes, this could be solved potentially by ditching canonical JSON entirely, and changing the formats of PDUs and events such that the fields not covered by the signature are always stored outside the signed block of JSON. Or alternatively, we could mandate canonical JSON throughout.
@richvdh: wdyt about just outputting federation PDUs as canonical JSON (i.e. switching from ujson to the plain python json library, which the spec implies can create canonical JSON if you pass it the right options)? I don't believe it should break anything, as this is just the format it uses on the wire? Or is synapse pulling pre-rendered JSON out of the DB?
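For what it's worth, the "right options" with the stdlib module would presumably look something like this (an assumption based on the spec's description, not a claim about Synapse's actual code):

```python
import json

event = {"type": "m.room.message", "content": {"body": "a/b"}}
wire = json.dumps(event, sort_keys=True, separators=(",", ":"),
                  ensure_ascii=False)
# Stdlib json leaves "/" unescaped; with these options it also sorts
# keys and drops insignificant whitespace.
assert wire == '{"content":{"body":"a/b"},"type":"m.room.message"}'
```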
if the federation API receives data which is invalid JSON then it should of course reject it.
What is invalid JSON if not invalid relative to canonical JSON? Because if it's invalid relative to RFC 7159 JSON then the spec is indeed clearly defining a deterministic transform from RFC JSON to canonical JSON and that's the answer to my question. It's not ideal, but if it is deterministic then I can write a satisfying workaround.
Only because Synapse's JSON library happens to output something that is coincidentally quite close to the canonical JSON representation. If it happened to reorder the keys, change the whitespace, alter numeric field representations etc then you'd be completely out of luck.
There's good reason for the federation specification to have routers outright reject frames that are formatted that way. Note my word choices here: "frames," "routers" -- I'm going to sound patronizing but I just want all the federation protocol contributors to keep in mind you're in the internet message passing business here. The elegant protocol should be conscious of what makes things tick in that realm. Those needs really aren't the same as the client-server realm.
I wholeheartedly agree with @richvdh that this is not remotely a big deal
This is a big deal. The efficient network device is efficient because it's able to reuse large portions of the exact same data it receives as the exact same data it sends. The server never has to take a (valid) `content` field in an event and copy it into a string buffer removing all of the escapes, because it's just going to have to add the same escapes when it sends that same content out to destinations. So we don't do that. That's what makes network servers fast, and the parser is customized specifically for this. What's a big deal here is that this issue involves discrepancies deep within payload content, and that's something which, beyond its validity, a router should be agnostic to.
However, this doesn't currently work because we have to skip the fields which aren't covered by the signature - and have to define the operation of skipping them in a manner that both sender & receiver can consistently perform. This is what we're using canonical JSON for. The problem is complicated further in that the fields covered by the signature vary from event type to event type.
I think you could accomplish what you need in the specification without specifying a JSON grammar. You can specify how the `signature` field gets removed -- that's fine. You can specify what event content fields can be included -- that's fine. None of that needs a strict custom canonical grammar. My personal opinion though is that even though you can accomplish those goals without canonical JSON, you should force strict adherence to what you have, and ultimately it's a better choice.
@ara4n I'd also like to thank you for your kind reception here, even though I'm probably being read as a bit animated on this subject. I know that you're dealing with a lot of issues higher up in the stack, but I'm trying to convey how this little tiff over this single little `/` character way down here determines the characteristics of that whole stack. How's the expensive parser going to cope when I add `m.char` message types to the protocol so you can watch another party actually type text or pair-edit a live file in a room? Are you going to wish your messages were passed just a little faster then?
I'm not a core team member, but let me give my 2 ct. I guess you can't quantify how much time per event is saved by skipping JSON canonicalization, or how close this is to being a bottleneck in the whole chain; so the argument about "faster" is a bit too theoretical so far. I agree with the core team members -- as of right now, it looks more like a matter of convenience than a critical NFR that the thing on the wire should or need not match bit-by-bit the payload that's been signed. If you like, applying such a restriction now is premature optimization, which I, if I were on the core team, wouldn't spend time on yet. From the performance perspective, it might be reasonable to first find out where the bottlenecks are.
Just had a quick RL chat with @erikjohnston about this.
The benefits of canonical JSON basically boil down to a trade-off between:
I'm not aware of any language (other than possibly perl6?) whose common JSON parsers decorate the parsed datastructure with the byte offsets of the source string - unless you're using a custom low-level JSON parser in C/C++ like the one @jevolk is describing. Which makes me feel that option 1 above (i.e. keeping canonical json) is the wiser choice for now, and we should consider mandating canonical JSON for the PDU to make life easy for @jevolk and similar (especially if it's just changing the forward-slash escaping).
However, the more I think about this, the more it feels like that rather than bodging around JSON's idiosyncrasies we'd be much much better off solving the problem properly and jumping ahead to a compact format which already has a canonical form, and over which signatures can be calculated without messing around doing a semi-parse for curly brackets. Like CBOR. Or protobufs. Or capnproto. Or whatever.
TL;DR: we should consider making synapse spit out canonical JSON as a hack onto the v1 spec. For v2 spec let's just jump to a sensible efficient format and encoding and never speak of this bug again...
gah, github was showing me a cached copy of the page. @jevolk: i totally see where you are coming from, by asking us to focus on the performance and efficiency of the lowest level of the stack to provide a solid basis for everything else. And you're right that from a network routing/framing perspective this is all hilariously inefficient. However, our ethos is not to prematurely optimise, but to get something working (which it isn't yet, at some levels of the stack), and then make it work fast. We can ratchet the federation protocol over to entirely different transports/formats/encodings as desired (my personal preference would be to use CBOR over WebRTC data channels), and I can't wait to get there. But prioritising that over P1 critical bugs like https://github.com/vector-im/riot-web/issues/2996 would be misguided at best.
Now, if you want to help us out and propose a super-efficient well-designed v2 federation transport, please go for it :)
@richvdh points out that if we ditch canonical JSON, we'd be forcing everyone to store events in their raw string representation (as well as their parsed datastructure form) everywhere, so that they can be passed around whilst preserving the signature validity. Which feels nonideal, and a clear reason to keep calculating signatures over a canonical form.
@ara4n
implementations are free to parse the JSON on the wire into RAM and before compiling to canonical form to check the signature if they want
I just want to highlight something from my original post for consideration here:
The verifier must then only refer to their corrected message as a valid verified message.
Once you compile the canonical JSON from your whatever other JSON library, and then verify that canonical JSON, you can no longer refer to the other JSON library's message in any way. Because the other library's message doesn't contain the signer's message, it contains the adversary's exploit of your reduction to canonical verification to inject their own information. Data first loaded into any other JSON library, especially in higher level languages with significant amounts of hidden state behind the values exposed in the language, is a message you didn't actually receive.
This means on platforms where there's a non-canonical JSON library used for the HTTP stack and then a canonical JSON library used for verification, you have to stringify the canonical representation and then parse it back through the non-canonical.
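That round-trip might be sketched as follows (`verify` here is a stand-in for the real signature check, not an actual API):

```python
import json

def canonical_json(obj) -> bytes:
    # Approximation of the spec's canonical form.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def receive(wire: bytes, verify) -> dict:
    # 1. Parse with whatever library the HTTP stack happens to use.
    parsed = json.loads(wire)
    # 2. Compile the canonical representation and verify *that*.
    canon = canonical_json(parsed)
    if not verify(canon):
        raise ValueError("signature check failed")
    # 3. Only the canonical bytes were verified, so re-parse them
    #    rather than continuing to use the first parse.
    return json.loads(canon)

# With a permissive stand-in verifier:
msg = receive(b'{"b": 1, "a": 2}', lambda canon: True)
```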
I think the conclusion of this epic was that we should use canonical form.
Thanks for summarizing, I wasn't about to take a day out to read all that.
Doing so means that the receiver doesn't need to re-canonicalise the JSON before checking its signature. However, it then means that receivers would need to handle the PDU as both string-representation and JSON. @richvdh: what were your other reasons for not mandating that some APIs use canonical JSON form?