mumble-voip / mumble

Mumble is an open-source, low-latency, high quality voice chat software.
https://www.mumble.info
Other
6.41k stars 1.12k forks source link

Packet loss caused by #4227 #4719

Closed TerryGeng closed 3 years ago

TerryGeng commented 3 years ago

In #4227, @Johni0702 referred to https://eprint.iacr.org/2019/311.pdf and pointed out that OCB2 is vulnerable to a forgery attack, which can be further lead to a plain text recovery on certain ciphertexts.

To briefly outline the attack described in this article (might not be accurate since I am not an expert):

  1. Forgery attack: this attack requires certain blocks of the plaintext are zeros. With a sufficient amount of such blocks being captured, attackers can use the ciphertexts of such blocks to construct a mapping, therefore he can use this method to forge ciphertext that would be accepted by the decrypting side.

  2. Plaintext Recovery: A further step would lead to the recovery of such blocks.

But these attacks only apply to certain types of blocks, and the key is not revealed in the middle, therefore the attackers can not forge any ciphertext they want, or decrypt whatever block at their wish. There are other restrictions and the possibility of success is not necessarily large.

Mumble is susceptible to both attack if the hacker appears as a middleman, he will have a chance to intercept certain blocks and thereafter forge some packets to the vulnerable clients. He can also decrypt certain types of blocks sent by the senders.

Possible workarounds include avoiding sending texts ending with zeros in their last-second block or instruct the client to drop block with this construction, therefore, avoiding possibly processing forged text.

4227 adapts the first strategy, that is, avoiding sending texts ending with zeros in their last-second block. But it appears not sufficient given that it is still vulnerable to forged packets (old clients would send vulnerable packets, and they might be intercepted by attackers and being used to produce forged packets of this construction). Further review of #4227 shows it did implement a receiver-side check to prevent this situation. Nonetheless, it drops even more packets.

Furthermore, it seems that Opus will pack silence into a bunch of zeros, and such packets are dropped by this mechanism, causing confusing packet loss and audio artifacts, as described in #4385.

A preliminary patch was made in #4716, aiming at preventing near-silence packets to be transferred. But it seems that opus has the ability to encode a wide range of audio signals into silence, so a simple gating based on amplitude is not sufficient. And such gating would also cause possible timing issues.

But if we step back a little bit, given that the attacks are very limited and don't really reveal the key, it won't cause many short term negative effects. Especially, the plaintext recovery of those vulnerable packets will in most cases, gave simply the attacker zeros packet. The forgery is more troublesome, but given that the structure of opus packets are more complicated, the possibility of successfully forging meaningfully opus packet might not be that large, not to say forging a series of packets successfully, therefore, producing meaningful voice clips.

Personally, I would suggest redo #4227 based on the arguments above. We are working on new encryption schemes. We should mark OCB2 as obsolete in the future. Trying to fix OCB2 or make an OCB2-specific workaround in other parts of the codebase would possibly not the best choice, given that it would unnecessarily affect users using other encryption methods in the future.

An example is TLS. TLS 1.0 and 1.1 were discovered to be insecure. But people didn't try to fix it or make sophisticated workarounds, but simply warn users about the possible danger and publish alternatives. I think since mumble is not as popular and important as TLS in the security field, such an approach should be appropriate.

FAQ:

  1. Mumble doesn't have alternative encryption schemes, so we have to fix OCB2. A: We are going to publish alternative encryption methods in 1.4.0. And, even if we provide fixes for OCB2, we still have to instruct people to upgrade to it. Then why not ask them to upgrade to 1.4.0 so they can use other encryption methods?

  2. A lot of old servers and old clients are still using OCB2, we can't cut them out. A: However, as long as they don't upgrade, they will still be vulnerable to the attacks mentioned above. We are going to instruct them to upgrade. Then, why not upgrade to 1.4.0 and use more secure encryption?

  3. Some new clients would have to connect to old servers, thus such a hotfix is necessary for these new clients to be protected. A: In real-world browsers, if people want to connect to TLS 1.0 websites, they will receive a warning. The real-world browsers don't really fix TLS 1.0.

  4. Mumble shouldn't use browsers as its role model. A: TLS is a much more serious protocol and is used to protect some really confidential communications. I think we are not security expert and people developing TLS is much more professional than us. I can see a reason why we shouldn't learn from them.

TredwellGit commented 3 years ago

https://github.com/mumble-voip/mumble/issues/4277?

TerryGeng commented 3 years ago

4277?

Sorry. Typo. Fixed.

blablablerg commented 3 years ago

the typo is still in the second to last paragraph.

TerryGeng commented 3 years ago

@blablablerg Sorry, fixed. Also added the FAQ section.

Krzmbrzl commented 3 years ago

Then why not ask them to upgrade to 1.4.0 so they can use other encryption methods?

Because some people don't have the possibility to upgrade. If you are fixed to a distribution's package archive, then you are out of luck since we don't have a working AppImage and I am not sure about Flatpak and/or Snap packages.

The real-world browsers don't really fix TLS 1.0.

But real-world browsers are provinding implementations of TLS > 1.0 for ages now. We don't even have an implementation for different encryption schemes let alone an ETA for 1.4.0

Imo the TLS analogy is kinda lacking since (as mentioned above) there are alternatives and thus it is fine to cut off the old protocols. We on the other hand don't have alternatives yet and as long as we don't have them (released) we can't argue that people should simply upgrade

TerryGeng commented 3 years ago

@Krzmbrzl Thanks for the comment :) Here are some replies

Because some people don't have the possibility to upgrade.

Then they have to bear with such vulnerabilities... As stated in my analysis, the fix must be bidirectional, involves both the servers and the clients. Any parties sending vulnerable packets will be affected by such vulnerabilities. It's not to say a "fixed" server can help vulnerable clients away from attacks. Vulnerable packets sent by weak clients can still be forged, if we don't reject them on the server. And we cannot even provide a "sine wave" solution for those clients since they can not upgrade. Their packets are just dropped.

We don't even have an implementation for different encryption schemes let alone an ETA for 1.4.0

Do we have an ETA for this OCB2 patch? And what can we do to let servers upgrade so they will no longer send vulnerable packets that would be dropped by a "fixed" client?

We on the other hand don't have alternatives yet and as long as we don't have them (released)

Then I would suggest we keep the current "fix" since it doesn't affect normal users but only a small proportion of people using continuous mode. But we can add a command-line option to disable this fix if the user wants to trade continuous mode with security vulnerabilities.

TredwellGit commented 3 years ago

we don't have a working AppImage and I am not sure about Flatpak and/or Snap packages.

Nix can be used to run Mumble on pretty much every Linux distro. https://nixos.org/download.html $ nix-shell -p mumble --run mumble

TredwellGit commented 3 years ago

Why don't we just add a proper force TCP mode option to the server?

TerryGeng commented 3 years ago

Why don't we just add a proper force TCP mode option to the server?

Because fast is kinda a selling point of Mumble and some people rely on UDP for low latency communication.

And also security is also a selling point. So breaking either will upset some people.

TredwellGit commented 3 years ago

doesn't affect normal users but only a small proportion of people using continuous mode

It affects all modes.

Johni0702 commented 3 years ago

4227 adapts the first strategy, that is, avoiding sending texts ending with zeros in their last-second block. But it appears not sufficient given that it is still vulnerable to forged packets (old clients would send vulnerable packets, and they might be intercepted by attackers and being used to produce forged packets of this construction).

That is incorrect. If either server or client has #4227, the described attack is no longer possible.

But it seems that opus has the ability to encode a wide range of audio signals into silence, so a simple gating based on amplitude is not sufficient.

I don't know how you can come to that conclusion. Your updated patch is just dropping the packet at different place, so ofc it would result in the same effect. As I said before, most of audio code is not real time and therefore you can't just drop samples without causing underruns or similar effects, even silent samples are important. As far as I could tell, this simple workaround which I proposed was working exactly as expected.

Having said that, I don't mind this bug not being fixed, what I do mind is making OCB2 vulnerable again.

Personally, I would suggest redo #4227 based on the arguments above.

Quoting myself:

I am fundamentally opposed to releasing insecure-by-default (or even by uninformed choice) software. Therefore I do not consider unpatched OCB2 to be an option. Just because I couldn't immediately think of any way to exploit the issue given Mumble's feature set at the time, doesn't mean that there isn't one or that there won't be one when new feature X will be introduced (and no one will double check after every new feature that OCB2 is still secure with it).

[...] the old client is still vulnerable.

The current release (1.3.3) is not (and it's even sufficient for either client or server to be using 1.3.3). Releasing a new version (1.3.4) without the patch, would make that newer version (and we can all agree that it's generally good to keep your software up-to-date) vulnerable while the older one was not. So at that point we would have released insecure-by-default software. Even if the new version is 1.4.0 and it supports and uses a new encryption method by default if the server supports it, it'll still be insecure by default if it falls back to unpatched OCB2 when used on an older server (and that is not the case with the current 1.3.3).

Then they have to bear with such vulnerabilities... As stated in my analysis, the fix must be bidirectional, involves both the servers and the clients. Any parties sending vulnerable packets will be affected by such vulnerabilities. It's not to say a "fixed" server can help vulnerable clients away from attacks. Vulnerable packets sent by weak clients can still be forged

Incorrect. See above or #4227.

TerryGeng commented 3 years ago

That is incorrect. If either server or client has #4227, the described attack is no longer possible.

Then what happens if the vulnerable packets sent by old servers were forged in the middle and being decrypted by even a new client?

In the article, https://eprint.iacr.org/2019/311.pdf, I quote:

The sender is modified to never encrypt a message where the second-last block is len(0^n) while the receiver remains unchanged, or the sender remains unchanged and the receiver is modified to never decrypt to a message where the last block would be of the form 2^mL ⊕ len(0^n)

But as for the client/server model of Mumble, a client is both a sender and a receiver. So is a server. Therefore, this patch must be applied on both encryption and decryption sides.

I don't know how you can come to that conclusion. Your updated patch is just dropping the packet at different place, so ofc it would result in the same effect. As I said before, most of audio code is not real time and therefore you can't just drop samples without causing underruns or similar effects, even silent samples are important.

Yes. So I concluded that patch was not a solution.

Johni0702 commented 3 years ago

Then what happens if the vulnerable packets sent by old servers were forged and being decrypted by even a new client?

I don't recall the full details from when I read the paper but I do recall that the attack requires an encryption and a decryption oracle, so taking away both, en- and decryption oracle, on either side is sufficient for such an attach to now lack the prerequisites. Please read the full paper or my summary in https://github.com/mumble-voip/mumble/issues/4219#issuecomment-636218461 if you want to get into the details (I don't, I've been there).

The sender is modified to never encrypt a message where the second-last block is len(0^n) while the receiver remains unchanged, or the sender remains unchanged and the receiver is modified to never decrypt to a message where the last block would be of the form 2^mL ⊕ len(0^n)

But we're doing both of those, see above.

TredwellGit commented 3 years ago

https://github.com/mumble-voip/mumble/pull/4716#issuecomment-766352683 does not fix it.

Johni0702 commented 3 years ago

#4716 (comment) does not fix it.

Did you try those on their own or in addition to the PR? They were meant to be on their own. Asking because if you did try them on their own, then I'm unable to reproduce the issue (and am out of simple ideas of how to fix it).

TerryGeng commented 3 years ago

Please read the full paper or my summary in #4219 (comment) if you want to get into the details (I don't, I've been there).

I really read this morning. I don't recall the attackers are actually required to interact with the oracles, otherwise, middleman attacks won't be possible.

But I have to admit it is impossible for me to fully understand it. Johni, do you have a cryptography background? If you do, then I would trust your expertise. Otherwise, we may need to consult other experts given this topic is not really a piece of cake for normal people.

But we're doing both of those, see above.

It is well possible that a new client is connected to an old server, so we are not doing both.

Johni0702 commented 3 years ago

Please read the full paper or my summary in #4219 (comment) if you want to get into the details (I don't, I've been there).

I really read this morning. I don't recall the attackers are actually required to interact with the oracles, otherwise, middleman attacks won't be possible.

There's nothing in middleman (in the cryptographic sense) which would imply that they may not require access to oracles. The term you're looking for is a passive attacker. If you're looking to find places in the paper where an oracle is accessed, they key word is "(encryption/decryption) query".

But I have to admit it is impossible for me to fully understand it. Johni, do you have a cryptography background? If you do, then I would trust your expertise. Otherwise, we may need to consult other experts given this topic is not really a piece of cake for normal people.

While I do not have a formal degree in anything related to cryptography, I have a strong interest into it, know the required mathematics fundamentals and have participated and passed two cryptography related lectures (one fundamentals and one blockchain). So, while I will still put a disclaimer in front of my summary, iirc I did feel like I really understood the inner workings of the described attacks at the time.

But we're doing both of those, see above.

It is well possible that a new client is connected to an old server, so we are not doing both.

No, we are doing both of those on the same side ("a client is both a sender and a receiver"). With Mumble, there's basically two one-way OCB2 communication channels, one server-bound and one client-bound. In the case of a 1.3.3 client, to prevent the attack on the server-bound channel:

The sender is modified to never encrypt a message where the second-last block is len(0^n) while the receiver remains unchanged

and to prevent the attack on the client-bound channel:

the sender remains unchanged and the receiver is modified to never decrypt to a message where the last block would be of the form 2^mL ⊕ len(0^n)

Same goes in reverse for old client and new server.

I don't recall the attackers are actually required to interact with the oracles, otherwise, middleman attacks won't be possible.

TredwellGit commented 3 years ago

I tested it separate from the pull request.

TerryGeng commented 3 years ago

Thanks for the explanation. While I don't have a degree in cryptography as well, I have a physics degree from one prestigious research university as well, so I think my critical thinking should apply as well.

There's nothing in middleman (in the cryptographic sense) which would imply that they may not require access to oracles. The term you're looking for is a passive attacker. If you're looking to find places in the paper where an oracle is accessed, they key word is "(encryption/decryption) query".

and to prevent the attack on the client-bound channel:

the sender remains unchanged and the receiver is modified to never decrypt to a message where the last block would be of the form 2^mL ⊕ len(0^n)

Same goes in reverse for old client and new server.

I don't recall the attackers are actually required to interact with the oracles, otherwise, middleman attacks won't be possible.

Yes. I understand. But please consider this:

As stated in the first post of mine, the attack is two-fold:

  1. Vulnerable packets sent by the sender can be intercepted, and then be forged in the middle, and then be passed to the receiver, but the receiver doesn't know the packet was forged.
  2. Vulnerable packets being collected by the attacker can be used to decrypt other packets, therefore leads to a plaintext revelation.

Considering the following scenario: An old server is connecting with a new client.

a) The fixed new client won't send vulnerable packets to the server. So both 1 and 2 won't happen. b) The unfixed old server sends a vulnerable packet to the old client and is forged in the middle, but the new client doesn't check the incoming packet, therefore doesn't know it is forged.

Does it mean the hotfix on the encryption side doesn't completely fix this problem?

EDIT: Or should I understand that: you fix is not targeting the scenario I described. In order to make your fix work, both servers and clients should be updated?

Johni0702 commented 3 years ago

but the new client doesn't check the incoming packet

It does. That's what the second part is about:

the sender remains unchanged and the receiver is modified to never decrypt to a message where the last block would be of the form 2^mL ⊕ len(0^n)

So, while the attacker would be able to technically forge a valid OCB2 packet, the (patched) receiver is no longer technically using OCB2 and will reject the forged packet (even though by OCB2 spec, it should accept it).

Does it mean the hotfix on the encryption side doesn't completely fix this problem?

Correct, which is why we not only have it applied to the encryption side but also to the decryption side. The former makes the forgery impossible, the latter makes the forgery useless, either one prevents the attack.

Johni0702 commented 3 years ago

I think as long as you secretly flip bits in the ocb_encrypt function, we are good. It wouldn't affect other parts of the codebase right?

What worries me is if there is a checksum mechanism in the compressed frame, but a brief glance through the RFC page shows there is no such mechanism.

After trying to wrap my head around Opus' internal structure (RFC6716), I believe the idea quoted above might actually be feasible (and doesn't require changes in any unrelated part of Mumble, only the OCB2 patch itself). As far as I understand Opus (specifically section 4.1, 4.3.0 and 4.3.3), the 0s are simply unused bits and are only there because we running Opus in constant bitrate mode. So, flipping one of them should have absolutely no effect on the resulting audio whatsoever.

I've implemented that workaround in #4720. @TredwellGit could you give that a try? It should definitely fix the packet loss but I'm not entirely sure it won't cause any artifacts on the receiver end (it did not for me, but I also do not seem to be able to reproduce 100% of the original issue).

MartB commented 3 years ago

@Johni0702 not sure if thats gonna be the case for everyone, but your commit allows me to stream music correctly on 1.4 without any weird packet loss

(Would be great to get a new windows snapshot build as soon as this goes in)

Edit: see https://github.com/mumble-voip/mumble/pull/4720#issuecomment-769381637 for a new assessment

TerryGeng commented 3 years ago

Sorry for not following this discussion for a few days.

Correct, which is why we not only have it applied to the encryption side but also to the decryption side.

If I understand you correctly, I think you mean both the server and client need to update to a fixed version in order to prevent this attack from both directions.

Anyway, thanks for the explanation and the fix.

Johni0702 commented 3 years ago

If I understand you correctly, I think you mean both the server and client need to update to a fixed version in order to prevent this attack from both directions.

No, that's literally the opposite of what I've been saying:

If either server or client has #4227, the described attack is no longer possible.

Let me try to break this down yet again. A one-way encrypted communication channel has a sender-side which encrypts data and a receiver-side which decrypts data. For voice, Mumble uses two, mostly independent, one-way encrypted communication channels pointing in opposing directions. E.g. channel A (the server-bound channel) has a sender-side on the client and a receiver-side on the server, channel B (the client-bound channel) has a receiver-side on the client and a sender-side on the server.

If the server has the patch, then its encryption (sender-side) and decryption (receiver-side) ends do not allow potentially vulnerable packets through. At this point, regardless of whether the client has the patch, for channel A the server's receiver will not accept packets which could have been forged and for channel B the server's sender will not send packets which could be used to create a forgery. So neither channel A nor B can be exploited, regardless of whether the client has the patch. Exact same thing also applies for when the client has the patch and the server does not.

TerryGeng commented 3 years ago

I'm sorry but please ignore this message. I'm reviewing #4227.

If I understand you correctly, you mean:

A patched server will:

  1. not send vulnerable packets,
  2. will not relay possibly forged packets received from unfixed clients.

This means a patched server will protect vulnerable clients. I think I have understood this very clear and thanks again for your explanation.

However, actually, my original question also includes the setup like a vulnerable server and patched clients.

The vulnerable server is still able to send vulnerable packets (sent by other unpatched clients connected to the server) that can be forged in the middle but the patched client, since it doesn't check, can not tell it.

To amend my previous comment, would you think it would be more precise to state this as: The sender-side-only fix will be effective if the server is updated, but the clients are not required to update. Meanwhile, if an updated client connects to an old server, this fix is not effective.

TerryGeng commented 3 years ago

I'm sorry about the fact that I missed your receiver-side fix in ocb_decrypt. You did check the packet when decrypting.

My previous memory was you only implemented a sender-side fix, which is inaccurate.

I have corrected my original post to reflect the correction of this error.

TerryGeng commented 3 years ago

I have received reports that some of my friends are having significant problems listen to the music bot on my 1.3.0 server. The glitch sound happens too frequently and it is intolerable for music streaming. (Which further proves @Johni0702 is right)

I have instructed them to downgrade.

Krzmbrzl commented 3 years ago

having significant problems listen to the music bot on my 1.3.0 server.

The mitigation is not even part of 1.3.0 though. It was introduced with 1.3.1. Or are they using newer clients?

TerryGeng commented 3 years ago

@Krzmbrzl Sorry, they are using 1.3.3 clients.