nymtech / nym

Nym provides strong network-level privacy against sophisticated end-to-end attackers, and anonymous transactions using blinded, re-randomizable, decentralized credentials.
https://nymtech.net
1.3k stars 236 forks source link

Redesign chunking system #294

Open jstuczyn opened 4 years ago

jstuczyn commented 4 years ago

Recently the more I've been doing anything to do with chunking, the more I get the feeling it should be rewritten and simplified taking into consideration its current uses and how everything evolved from when it was initially created.

When I initially came up with the algorithm, all messages were sent fully in plain, there were no acks, reply-surbs, etc and it all was relatively simple. My goal then was to squeeze as many bytes into a sphinx packet as possible. So, for example, I've bothered to combine what should have been a separate flag and simple 32bit ID value into a 31 bit ID and 1 bit flag. I've created two different types of [Fragment]s (linked/unlinked), etc. This added a lot of extra complexity for what in hindsight was not a lot of gain. Sure, we saved one byte of message, but then added extra few hundred when we introduced SURB-acks. And any changes to [Fragment] are prone to causing issues elsewhere due to those decisions.

Another indication of the system being due for a rewrite and what actually caused me to write this issue is [FragmentIdentifier], i.e. something that uniquely identifies given [Fragment]. It is the previously mentioned 31 bit ID and a 1 byte position inside a [FragmentSet]. In principle this sounds good. However, then we introduced SURB-ACKs which were acking particular chunk using its [FragmentIdentifier]. So far this is fine. However, because no message should be distinguishable, the SURB-ACKs must also be present for cover traffic. So what is a "correct" value of [FragmentIdentifier] for a cover message? I've applied a "hacky" solution here and used what is normally not a valid value, i.e. ID of 0 and position of 0. This solution worked fine and I forgot about it. But then came reply-SURBs which added yet more complexity as ACKs for them behave in yet another manner. Similarly to "cover-acks" we do not re-transmit anything after timeout, however, unlike "cover-acks" we would want to know if we actually received the ack. And so I had to come up with yet another [FragmentIdentifier] that is normally invalid, but would "work" just for reply-acks. (I'm thinking about something for this while writing the issue, my current idea is pseudorandom ID, but 0 position). And it annoyed me. I had to use a "hacky" approach for something that should be natively supported. After all we are writing the entire Nym system from scratch...

So my goals of the redesigned chunking system would include, but not be limited to:

jstuczyn commented 4 years ago

Ideally it should also allow for "pausing" and "resuming" chunking so that we could make it into Stream and Sink. And/or allowing for being constructed from an AsyncRead source. Something similar-ish to what tokio codec is doing