Qabel / qabel.github.io

:octocat: The Qabel documentation repository. The technical stuff can be found at our github.io page.
https://qabel.github.io

Add unencrypted version header to encrypted drop messages #29

Closed L-Henke closed 9 years ago

L-Henke commented 10 years ago

I think we should add unencrypted version information (and maybe also the encryption/hash algorithms used) to the encrypted drop messages, or we will end up with an unchangeable drop message format.

Since the size of the encrypted AES key depends on the size of the RSA key used to encrypt the original AES key, the encrypted message format would change in an incompatible way when using RSA keys with a different size. Also the digest size varies with the utilized hash function. When we just concatenate all these values without a delimiter, a client could never identify a changed message format.
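To illustrate the size dependency described above, here is a minimal sketch. The function name `field_lengths` and the chosen key sizes and hash names are assumptions for illustration only, not part of the Qabel specification.

```python
import hashlib


def field_lengths(rsa_bits: int, hash_name: str) -> dict:
    """Byte lengths of the two fields whose size depends on the algorithms."""
    return {
        # An RSA ciphertext is as long as the modulus, rounded up to whole bytes.
        "encrypted_aes_key": (rsa_bits + 7) // 8,
        # hashlib reports the digest size of each hash function in bytes.
        "digest": hashlib.new(hash_name).digest_size,
    }


# Two hypothetical algorithm choices give two different byte layouts.
for rsa_bits, hash_name in [(2048, "sha256"), (4096, "sha512")]:
    print(rsa_bits, hash_name, field_lengths(rsa_bits, hash_name))
```

With concatenated fields and no delimiter, a parser that assumes the first layout would cut a message built with the second layout at the wrong offsets.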

zuckschwerdt commented 10 years ago

That's a really bad idea. The message must not offer any hint about its content. The client will expect and gracefully handle incomprehensible messages, i.e. wrong key (different recipient). It's not that hard to "brute force" a small number of possible formats. Keep in mind this project is not about universal exchangeability. It's about anonymity and secrecy.

L-Henke commented 10 years ago

In my opinion "brute forcing" the message format is the worse idea. Let's assume just three possible encryption algorithms, three RSA key sizes, and three different hash algorithms, and you would have to parse each message 27 times! And by definition, most messages aren't even addressed to you. How should that be possible on a mobile device?
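The count of 27 follows from multiplying the three option sets. A small sketch, where the concrete algorithm names are placeholders chosen for illustration and not a list from the Qabel documentation:

```python
from itertools import product

# Hypothetical option sets; only the sizes (3 x 3 x 3) come from the comment.
CIPHERS = ["AES", "Twofish", "Serpent"]
RSA_BITS = [2048, 3072, 4096]
HASHES = ["sha1", "sha256", "sha512"]

# Every combination is one message format a client would have to try.
candidates = list(product(CIPHERS, RSA_BITS, HASHES))
print(len(candidates))
```

Each additional option in any of the three sets multiplies the number of parse attempts per message.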

Furthermore, stating the used encryption algorithm does not reveal any hint about its content, only how it is encrypted. I'm pretty sure you are familiar with Kerckhoffs's principle. A cryptosystem must be secure even when you reveal everything except the key.

Stating the encryption algorithm doesn't affect anonymity or secrecy in any way.

Brute forcing the encryption algorithms would just increase the workload for any legitimate client. For an attacker, parsing a message a couple of times is a really small effort compared to brute forcing the key.

Edit: Furthermore, utilizing just one combination of encryption algorithm, RSA key size, and hash function (as it is currently documented) is exactly the same amount of information as stating these values with an encrypted message.

In the Qabel storage we wouldn't need that information if we sent it to recipients in a drop message.

zuckschwerdt commented 10 years ago

You didn't account for fingerprinting. Assume the choices to be ordered. There is a known preference in clients, e.g. "new version" format then "old version". In usual conditions the decryption will work within the first few choices. In addition hints to the expected message format can be cached if you are really concerned about CPU cycles. The protocol would be I/O-bound not compute-bound in any case.
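The ordered trial with a cached hint could look roughly like this. It is only a sketch: `try_formats`, the format names, and the toy parsers are hypothetical, and the prefix check merely stands in for "decryption succeeded".

```python
def try_formats(message, parsers, preference, hint=None):
    """Try each parser in preference order, starting with a cached hint if any.

    `parsers` maps a format name to a function that returns the plaintext,
    or None when the message does not parse in that format.
    """
    order = [hint] if hint in parsers else []
    order += [name for name in preference if name not in order]
    for name in order:
        plaintext = parsers[name](message)
        if plaintext is not None:
            return name, plaintext
    return None  # not addressed to us, or an unknown format


# Toy stand-ins for real decryption: "succeed" only on a matching prefix.
parsers = {
    "new": lambda m: m[4:] if m.startswith(b"NEW:") else None,
    "old": lambda m: m[4:] if m.startswith(b"OLD:") else None,
}

print(try_formats(b"OLD:hello", parsers, ["new", "old"]))
print(try_formats(b"OLD:hello", parsers, ["new", "old"], hint="old"))
```

Without a hint the "old" message costs two attempts; with the cached hint it succeeds on the first.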

Also note not to trust any incoming data. Every version field or Content-Length header will be used as an exploit or DOS eventually. Don't mandate parameters, better infer, check, and handle exceptions. There will be broken clients.

L-Henke commented 10 years ago

I absolutely accounted for "fingerprinting", although I wouldn't have called it that. That's why my example has only very few different methods. You could also use the length of the byte stream to filter out possible algorithms, but this just overcomplicates everything!

But how does "the first x bytes identify the version and algorithms" have anything to do with exploits or DOS? If there is any relationship, it would also exist in the currently planned byte stream, because both get parsed by our client. There is also no influence on "broken clients", and checks and error handling are done anyway:

Scenario A: The message states used algorithms - the client uses these algorithms - a bug in one of these algorithms is triggered.

Scenario B: The message does not state used algorithms - the client tries all algorithms - the same bug in one of these algorithms is triggered

Where is the difference?

Of course you should never trust incoming data, but why should we select the appropriate algorithms if this could be done by the clients (e.g. forks)? The first x bytes could be used to announce the selected algorithms, and if our client doesn't recognize them, the data just gets dropped.

Scenario C: The message states used algorithms - the client doesn't know the values - the message gets dropped

Designing a message format and planning to detect changed versions just by trying to parse them in every possible way is IMHO an extremely miserable design. And to think that our selected algorithms will last forever or will suit everyone's needs is also unrealistic.

And if you really try to tell me, that we are unable to parse x bytes in a secure way, we might work on the wrong project.

zuckschwerdt commented 10 years ago

Fingerprinting: given a reasonable set of evident choices, an attacker can classify clients based on that. This needs to be avoided.

Exploit does not mean bug but rather weakness. If you expose a weak spot someone will exploit it. Handing the attacker meta-data is a weakness.

Broken clients will (inadvertently) violate the protocol. You need to be lenient. That's where inferring is smarter.

This is data about data (meta data) and in cleartext. We subscribe and advertise a no-metadata policy. Don't violate that on a whim.

I'm not here to settle this (or any) issue. I'm merely offering advice. Please read the argument again. Maybe discuss this on a Monday.

jan-schreib commented 10 years ago

Right now I don't see a problem in @L-Henke's suggestion. @zuckschwerdt if there is no unencrypted info in the beginning, and I'm the bad guy, I would just look it up in the documentation (remember Kerckhoffs's principle). You are right about the meta data though. @L-Henke I think we need to talk about that with @thechauffeur on Monday.

thechauffeur commented 10 years ago

We will definitely talk about this on Monday.

cburkert commented 10 years ago

I agree with @L-Henke. BTW: Identifying participants based on the meta-data is (only) easier with explicit meta-data, but nevertheless also possible if the attacker can gain the same data by trial and error.

L-Henke commented 10 years ago

@cburkert Only the recipient could gain this data by trial and error, because no one else could decrypt it. But the idea was that normally almost all of (our) clients are using the same version.

Another way to allow protocol changes without adding unencrypted meta data would be to add a small header which is also encrypted with RSA OAEP. Now we could change our protocol, ciphers, etc. and only the small RSA encrypted header would have to remain the same. If our selected RSA modulus is too small and gets broken, an attacker is only able to decrypt the headers, since we can switch to a more secure encryption for the payload. A resulting drop message would look like this:

RSA2048_OAEP(Header)|RSA2048_OAEP(AES key)|AES IV|Ciphertext|Signature

But an updated drop message could also look like this and we could still process it without a trial and error approach:

RSA2048_OAEP(Header)|RSA_ECC_256_OAEP(Twofish key)|Twofish IV|Ciphertext|Signature

A client would now only have to try to decrypt the header and continue processing on success. This way, processing messages not addressed to this client would remain the same. Receiving and sending messages would involve a slightly increased workload due to the second RSA encryption.
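As a rough sketch of how a client might cut the first layout above into its fields: the function `split_drop_message` is hypothetical, no decryption is performed, and the default lengths (256-byte RSA-2048 ciphertexts, a 16-byte AES IV, and a 256-byte signature) are assumptions. In the proposal, the lengths after the header would be taken from the decrypted header rather than passed in.

```python
def split_drop_message(message: bytes, header_len: int = 256, key_len: int = 256,
                       iv_len: int = 16, sig_len: int = 256) -> dict:
    """Split Header|Key|IV|Ciphertext|Signature into its parts by length."""
    fixed = header_len + key_len + iv_len + sig_len
    if len(message) < fixed:
        raise ValueError("message too short for the assumed layout")
    sig_start = len(message) - sig_len
    pos = 0
    parts = {}
    # The leading fields have known lengths and are read front to back.
    for name, length in (("header", header_len), ("key", key_len), ("iv", iv_len)):
        parts[name] = message[pos:pos + length]
        pos += length
    # The ciphertext is whatever lies between the IV and the trailing signature.
    parts["ciphertext"] = message[pos:sig_start]
    parts["signature"] = message[sig_start:]
    return parts


# Dummy message with recognisable filler bytes per field.
msg = b"H" * 256 + b"K" * 256 + b"I" * 16 + b"C" * 40 + b"S" * 256
parts = split_drop_message(msg)
print({name: len(value) for name, value in parts.items()})
```

Only the header length has to stay fixed across protocol versions; the remaining lengths can change with the algorithms the header names.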

cburkert commented 10 years ago

@L-Henke Of course, my bad.

Wouldn't the asymmetric encryption of the header with a statically sized RSA key result in the need for multiple private keys? If 2048-bit RSA is considered weak some day, the user would still have to manage this 2048-bit key for header encryption and another, stronger key for signing and symmetric key decryption.

L-Henke commented 10 years ago

Good point.

aBothe commented 10 years ago

For pragmatic reasons I wouldn't add any further info to any message, given the primary goal of getting the Qabel beta finished asap. For now, we may also use only one asymmetric dec/enc algorithm (or, let's say, only one in the official version), so there isn't any reason to provide too much info when there's no absolute use for it (atm!). Or have I missed some important points (again)?

aBothe commented 10 years ago

@Gottox's alternative: Opt-in-solution by reserving a bunch (16 or so) of bytes that are kept zero for now. These bytes may be encrypted as well(?).

ghost commented 10 years ago

We define an empty header with n bytes. For the beta we will use 32 bytes. @L-Henke will define it in the protocol.

cburkert commented 10 years ago

I noted the current state of discussion in the documentation. Shall we close this issue or use it for further discussions on header size and content?

thechauffeur commented 10 years ago

We have the version number. We need exactly one Byte for this. Everything else (e.g. the 32 Byte "header" above) can be defined later since it will be possible to distinguish between the versions. That means we do not need to reserve space for an (unused) header now.

cburkert commented 10 years ago

I removed the header reservation (the 32 byte stuff). Only one byte version field left.
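A minimal sketch of how a client could handle the one-byte version field agreed on above. The name `read_version` and the value `0` for the first version are assumptions; the thread does not fix a concrete value.

```python
# Hypothetical set of versions this client understands.
SUPPORTED_VERSIONS = {0}


def read_version(message: bytes):
    """Return (version, rest) for a known version, or None to drop the message."""
    if not message:
        return None  # an empty message carries no version byte
    version = message[0]  # the single unencrypted version byte
    if version not in SUPPORTED_VERSIONS:
        return None  # unknown version: drop instead of guessing the layout
    return version, message[1:]


print(read_version(b"\x00encrypted payload"))
print(read_version(b"\x07encrypted payload"))
```

Because the version is known before anything else is parsed, a later version is free to define additional header bytes without reserving space for them now.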

thechauffeur commented 9 years ago

Done.