Redesign chunking system

Recently the more I've been doing anything to do with chunking, the more I get the feeling it should be rewritten and simplified taking into consideration its current uses and how everything evolved from when it was initially created.

When I initially came up with the algorithm, all messages were sent fully in plain, there were no acks, reply-surbs, etc and it all was relatively simple. My goal then was to squeeze as many bytes into a sphinx packet as possible. So, for example, I've bothered to combine what should have been a separate flag and simple 32bit ID value into a 31 bit ID and 1 bit flag. I've created two different types of [Fragment]s (linked/unlinked), etc. This added a lot of extra complexity for what in hindsight was not a lot of gain. Sure, we saved one byte of message, but then added extra few hundred when we introduced SURB-acks. And any changes to [Fragment] are prone to causing issues elsewhere due to those decisions.

Another indication of the system being due for a rewrite and what actually caused me to write this issue is [FragmentIdentifier], i.e. something that uniquely identifies given [Fragment]. It is the previously mentioned 31 bit ID and a 1 byte position inside a [FragmentSet]. In principle this sounds good. However, then we introduced SURB-ACKs which were acking particular chunk using its [FragmentIdentifier]. So far this is fine. However, because no message should be distinguishable, the SURB-ACKs must also be present for cover traffic. So what is a "correct" value of [FragmentIdentifier] for a cover message? I've applied a "hacky" solution here and used what is normally not a valid value, i.e. ID of 0 and position of 0. This solution worked fine and I forgot about it. But then came reply-SURBs which added yet more complexity as ACKs for them behave in yet another manner. Similarly to "cover-acks" we do not re-transmit anything after timeout, however, unlike "cover-acks" we would want to know if we actually received the ack. And so I had to come up with yet another [FragmentIdentifier] that is normally invalid, but would "work" just for reply-acks. (I'm thinking about something for this while writing the issue, my current idea is pseudorandom ID, but 0 position). And it annoyed me. I had to use a "hacky" approach for something that should be natively supported. After all we are writing the entire Nym system from scratch...

So my goals of the redesigned chunking system would include, but not be limited to:

should still support arbitrary long plaintext messages (or if it was limited, the limitation should not be practically ever reached, like if the limit was 1 TB or something stupid like that)
single [Fragment] type
as a consequence of above: constant size [FragmentHeader] in all situations
if any IDs are to be used, try to avoid any bit masking. Just use this extra byte and have proper u32, i32, u64, i64, u128, i128, etc.
easy and native way to infer if packet contains a cover message or a reply based on its [FragmentIdentifier]
perhaps also completely abandon concept of a [FragmentSet] and use different way of linking packets together?

nymtech / nym

Redesign chunking system #294