seemoo-lab / frankenstein

Broadcom and Cypress firmware emulation for fuzzing and further full-stack debugging
Apache License 2.0

Triggering CVE-2019-18614 over-the-air #3

Open joy8023 opened 3 years ago

joy8023 commented 3 years ago

Hi, you are doing a good job!

In CVE-2019-18614, it is mentioned that "the heap overflow can also be triggered over-the-air by sending a few L2Ping packets exceeding 384 byte". I used l2ping to send large packets (l2ping -i hci0 -s 600 -c 1 xx:xx:xx:xx) after setting hciconfig hci0 aclmtu 1021:8, but this fails to crash the dev board. Moreover, Wireshark shows that the received packet has been split into two smaller packets. I am wondering whether I missed some configuration. If so, could you please give some hints? Thank you in advance!

Jing

jiska2342 commented 3 years ago

Hi Jing,

this looks a bit like the hciconfig command failed. Which Bluetooth chip are you using to attack the evaluation board?

Also, -c 1 is probably the wrong parameter for l2ping, since you need to have at least 3 packets in the heap and then get one of them freed to trigger a heap overflow. Try to remove that limitation and let it run for a minute or so to get more elements allocated on the eval board's heap :)
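A minimal sketch of the continuous ping suggested here, wrapped in Python so it can be scripted. It only reuses the l2ping options from the command quoted above (-i and -s) and drops -c. The function names, the 60 second duration, and the placeholder address are assumptions for illustration, not part of the original setup.

```python
import subprocess

def build_l2ping_argv(target, iface="hci0", size=600):
    """Build an l2ping command line without -c, so it pings until stopped."""
    return ["l2ping", "-i", iface, "-s", str(size), target]

def ping_for(target, seconds=60, **kwargs):
    """Run l2ping for roughly `seconds` seconds (needs root and BlueZ's l2ping)."""
    argv = build_l2ping_argv(target, **kwargs)
    try:
        subprocess.run(argv, timeout=seconds, check=False)
    except subprocess.TimeoutExpired:
        pass  # expected: the endless ping is stopped after the timeout

# Placeholder address; replace it with the evaluation board's address.
print(" ".join(build_l2ping_argv("xx:xx:xx:xx")))
```

Calling ping_for("xx:xx:xx:xx") with the board's real address would keep pinging for about a minute instead of sending a single packet.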

My setup was as follows.

You can also trigger the opposite buffer by connecting to a headset or similar from the victim's host.

Hope that clarifies the attack, otherwise feel free to ask more questions :) AFAIK it is still unpatched, but should be patched soon. Cypress claimed that none of their customers were using a setup on that particular chip that would use the protocols that exceed the buffer.

Best, Jiska

joy8023 commented 3 years ago

Hi Jiska,

Thank you for your response. We have tried the two setups below, but both fail. Could you please take a look? Thank you!

1. Raspberry Pi 3 as the device attacking the evaluation board (eval board), which is attached to an Ubuntu 18.04 desktop

It seems to me that we have the correct hciconfig, but the host (Ubuntu + eval board) resets the connection for some reason, which stops it from receiving more pings that would crash the heap.

2. We attach the eval board to an Ubuntu 18.04 desktop and ping from the sending (original) BT chip to the eval board

This looks good and does not show the "connection reset" issue seen with the Raspberry Pi. However, we are not able to trigger the heap overflow vulnerability after pinging the eval board for around two minutes (the eval board does not crash).

I am also confused by the inconsistency in ACL packet size between the sending side and the receiving side. In the pcap shown above, the sent ACL packet has size 613, but it is split into three ACL packets of size 118. Maybe this is the reason why the attack fails in our experiments. Do you have any idea?

I know this is a long issue/question and I really appreciate your patience. It would be very nice of you if you could give us some suggestions. Thank you very much!

Respectfully, Jing

jiska2342 commented 3 years ago

Hi Jing,

I tried to set different ACL->Host and Host->ACL sizes on my CYW20735 board to fix this CVE. Then it got somewhat bricked, because there is insufficient RAM to fit these increased buffer sizes, apparently :) Thus, I can only use it after resetting it (press recover, keep recover pressed, press and release reset, release recover). This way, all patches are wiped. In the process of patching all kinds of things its MAC address was also renamed to AA:AA:AA:AA:AA:AA :D

Nonetheless, I tried to reproduce the heap overflow again and succeeded, but it was a bit tricky.

On the victim system:

On the attacking system:

I attached an example Wireshark trace. During the first round of pings I had the MTUs as set by the system. I had to disconnect, change the MTU on the victim, and then reconnect to trigger the heap overflow. At the end of the recording you can see the typical memory stack dump.

Hope that helps with reproducing the issue; MTUs seem to be a bit tricky overall. As you can see in the log, the attacker is using an Intel chip.

joy8023 commented 3 years ago

Hi Jiska,

Unfortunately, I still cannot reproduce this CVE after several attempts, even though I followed exactly the same procedure you suggested. I really do not know what is going wrong :(

I do have some questions after looking at your Wireshark trace. It would be very nice of you if you could help me out with these.

Q1: From my understanding, hciconfig hci0 aclmtu 1021:8 only changes the configuration on the host side. To check this assumption, I looked through the Wireshark trace for an HCI command sent by the host to the controller to adjust the ACL buffer size, but there is none. I am not very familiar with this, so please correct me if I am wrong. This leads to my second question, Q2.

Q2: If hciconfig hci0 aclmtu 1021:8 does not change the configuration of the controller, why does 'attach controller -> resetting aclmtu -> launch remote attack' work? More specifically, why does resetting aclmtu change the ACL buffer size in the controller (when receiving ACL packets from the remote side)? In the example Wireshark trace, I observe that even after resetting the aclmtu, the received ACL packet is fragmented into two packets (344B and 274B), and the crash still happens. I attach the screenshot below. It seems to me that the controller is still receiving packets smaller than 389, yet somehow these packets crash the controller.

image
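The packet sizes quoted in this thread can be reproduced with a little arithmetic. The sketch below is a back-of-the-envelope check, assuming that the lengths shown by Wireshark include the 1-byte H4 packet type and the 4-byte HCI ACL header, and that the fragments arrived as DH5 baseband packets (maximum payload 339 bytes). Both assumptions are inferred from the numbers and are not confirmed by the trace itself.

```python
L2CAP_HDR = 4      # L2CAP basic header: length + channel ID
L2CAP_CMD_HDR = 4  # signaling command header: code, identifier, length
HCI_ACL_HDR = 4    # HCI ACL header: handle/flags + data length
H4_TYPE = 1        # UART (H4) packet type indicator

def hci_frame_len(acl_payload):
    """Length of one HCI ACL packet as logged on a UART (H4) transport."""
    return H4_TYPE + HCI_ACL_HDR + acl_payload

def split_acl(l2cap_len, max_payload):
    """Split one L2CAP frame into ACL fragments of at most max_payload bytes."""
    sizes = []
    remaining = l2cap_len
    while remaining > 0:
        chunk = min(remaining, max_payload)
        sizes.append(chunk)
        remaining -= chunk
    return sizes

ping_data = 600                                   # l2ping -s 600
l2cap_frame = L2CAP_HDR + L2CAP_CMD_HDR + ping_data
sent = hci_frame_len(l2cap_frame)                 # one unfragmented packet
received = [hci_frame_len(n) for n in split_acl(l2cap_frame, 339)]  # assumed DH5
print(l2cap_frame, sent, received)  # 608 613 [344, 274]
```

Under these assumptions, the 613-byte packet and the 344B/274B fragments describe the same 608-byte L2CAP frame, just split differently on the host side and over the air.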

Q3: In the screenshot above, I also observe that the local host echoes back an ACL packet with a 613B payload. Given that this buffer misconfiguration affects both directions, I am wondering whether echoing back the large packet is the real reason for the crash. I am also curious where the misconfigured buffer is located (LMP or baseband). Can we just limit the size of incoming/outgoing ACL packets to prevent the crash? It would be nice if you could provide more information.

I understand that this has cost you a lot of time, and I really appreciate your help! Thank you!

Respectfully, Jing

jiska2342 commented 3 years ago

Hi Joy,

regarding Q1, that's why I wrote that I set it on the victim device for this trace. This might also be the problem here. On top of that, there is some weird behavior where either the attacker's or the victim device's Bluetooth daemon might remember the last MTU for a device.

It originally happened to me in a completely different setting. I was using some Ubuntu or Debian machine with the Gnome Bluetooth connection manager with a software version from September/October 2019 and connected my Bose QC 35 II headset. All I wanted to do was take memory snapshots for Frankenstein containing active connections for fuzzing. And then, instead of getting a snapshot, I always got crashes. Back then I did not manipulate the MTU in any way, I just used the system default settings. I also did this on my travel laptop, which I wipe regularly, so I'm pretty sure there was no weird configuration. On that configuration, the L2Ping triggered the crash as well, and using another evaluation board worked without crashes.

The firmware crashes in various places, either during emulation or while using it, whenever there is an allocation larger than the maximum buffer size of 384 bytes, and there are several hardcoded, unchecked memcpys with 1021 bytes. On other devices this doesn't matter, as the maximum buffer size is larger than that.
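To make this failure mode concrete, here is a toy model of a fixed-size block pool in Python. It is not firmware code: the back-to-back layout and the 4-byte per-buffer header are hypothetical, and only the 384-byte buffer size and the 1021-byte copy length are taken from the description above.

```python
BUF_SIZE = 384   # largest buffer size on the affected board (from the thread)
COPY_LEN = 1021  # length of the hardcoded, unchecked copies (from the thread)
HDR = 4          # hypothetical per-buffer header, e.g. a free-list pointer
STRIDE = HDR + BUF_SIZE
MARKER = b"\xAA" * HDR

def make_pool(capacity):
    """Lay out `capacity` buffers back to back, each header set to a marker."""
    pool = bytearray(capacity * STRIDE)
    for i in range(capacity):
        pool[i * STRIDE:i * STRIDE + HDR] = MARKER
    return pool

def unchecked_copy(pool, index, data):
    """Copy `data` into buffer `index` without a length check, like a raw memcpy."""
    start = index * STRIDE + HDR
    pool[start:start + len(data)] = data

def corrupted_headers(pool):
    """Return the indices of buffers whose header marker was overwritten."""
    count = len(pool) // STRIDE
    return [i for i in range(count)
            if pool[i * STRIDE:i * STRIDE + HDR] != MARKER]

pool = make_pool(4)
unchecked_copy(pool, 0, b"\x42" * COPY_LEN)
print(COPY_LEN - BUF_SIZE, corrupted_headers(pool))  # 637 [1, 2]
```

In this model a single 1021-byte copy into a 384-byte buffer runs 637 bytes past its end and clobbers the headers of the next two buffers, which is the kind of damage a later allocation or free would trip over.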

As you mentioned in Q2/Q3, this can also happen on the way back and not over-the-air. In the case of an L2Ping this doesn't matter, as it is echoed back. Connecting to a headset is also an action triggered by the host. You might be right that this issue doesn't trigger directly over-the-air but only indirectly. It definitely also triggers during emulation with Frankenstein in some scenarios.

It should be possible to limit the incoming/outgoing ACL size to prevent this crash. The modem controls the over-the-air ACL size, and LMP should just be sent encoded similarly as ACL over-the-air. Since the modem part is in hardware, you cannot change its behavior. If it splits the packets correctly, which it appears to do according to the traces, this part should be secure. The host, however, can change its behavior. You can limit it by setting the MTU correctly. So, as long as you trust this setting, this should be secure as well.

Triggering Bluetooth vulnerabilities over-the-air is always a lot of work, and since this one only affected one evaluation board, I did not invest the time to analyze it in as much detail or write a PoC. Sorry for that.

jiska2342 commented 2 years ago

Short bump on this thread, since the pool sizes again triggered weird behavior.

Setup

Triggering the Bug

# btattach --speed 3000000 -B /dev/ttyUSB0
# hciconfig hci0 up
(attach Wireshark)
# hciconfig hci0 reset

During the reset, one can observe that the board reports an incorrect ACL buffer size of 1040 (packet 10 in the attached log).

Bluetooth HCI Event - Command Complete
Event Code: Command Complete (0x0e)
Parameter Total Length: 11
Number of Allowed Command Packets: 1
Command Opcode: Read Buffer Size (0x1005)
Status: Success (0x00)
Host ACL Data Packet Length (bytes): 1040
Host SCO Data Packet Length (bytes): 64
Host Total Num ACL Data Packets: 20
Host Total Num SCO Data Packets: 1
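For reference, the fields in this dissection can be decoded by hand. The sketch below parses a Command Complete event for Read Buffer Size following the standard HCI layout (little-endian, 2-byte ACL length, 1-byte SCO length, two 2-byte packet counts). The sample bytes are re-encoded from the dissected values above and are not copied from the capture; the function name is made up for this example.

```python
import struct

def parse_read_buffer_size(event):
    """Parse an HCI Command Complete event for Read Buffer Size (opcode 0x1005).

    `event` starts at the event code, without the H4 packet type byte.
    """
    code, plen, ncmd, opcode = struct.unpack_from("<BBBH", event, 0)
    if code != 0x0E or opcode != 0x1005:
        raise ValueError("not a Read Buffer Size Command Complete event")
    status, acl_len, sco_len, acl_num, sco_num = struct.unpack_from("<BHBHH", event, 5)
    return {
        "param_len": plen,
        "status": status,
        "acl_len": acl_len,   # Host ACL Data Packet Length (bytes)
        "sco_len": sco_len,   # Host SCO Data Packet Length (bytes)
        "acl_num": acl_num,   # Host Total Num ACL Data Packets
        "sco_num": sco_num,   # Host Total Num SCO Data Packets
    }

# Re-encoded from the dissection: 1040 = 0x0410, 64 = 0x40, 20 = 0x0014, 1 = 0x0001
SAMPLE = bytes.fromhex("0e 0b 01 05 10 00 10 04 40 14 00 01 00")
info = parse_read_buffer_size(SAMPLE)
print(info["acl_len"], info["acl_num"])  # 1040 20
```

A host stack that trusts this response will happily send ACL packets of up to 1040 bytes to the controller.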

1040 bytes do not fit into any of the block pool buffers.

> info heap
[*]   [ Idx ] @Pool-Addr  Buf-Size  Avail/Capacity  Mem-Size @ Addr
[*]   -----------------------------------------------------------------
[*]   BLOC[0] @ 0x200498:       48    33 / 36           1872 @ 0x211610
[*]   BLOC[1] @ 0x2004BC:       96    20 / 20           2000 @ 0x211D60
[*]   BLOC[2] @ 0x2004E0:      268     9 / 10           2720 @ 0x212530
[*]   BLOC[3] @ 0x20D344:      384     4 /  4           1552 @ 0x212FD0
[*]   BLOC[4] @ 0x20D368:      384    16 / 16           6208 @ 0x2135E0
[*]   BLOC[5] @ 0x20D38C:      264    15 / 15           4020 @ 0x214E20
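The mismatch can be checked mechanically. The following sketch parses the info heap rows above with a regular expression and looks for a pool whose buffers could hold a 1040-byte ACL packet. The regex and the helper names are assumptions written for this example; they are not part of Frankenstein or InternalBlue.

```python
import re

INFO_HEAP = """\
[*]   BLOC[0] @ 0x200498:       48    33 / 36           1872 @ 0x211610
[*]   BLOC[1] @ 0x2004BC:       96    20 / 20           2000 @ 0x211D60
[*]   BLOC[2] @ 0x2004E0:      268     9 / 10           2720 @ 0x212530
[*]   BLOC[3] @ 0x20D344:      384     4 /  4           1552 @ 0x212FD0
[*]   BLOC[4] @ 0x20D368:      384    16 / 16           6208 @ 0x2135E0
[*]   BLOC[5] @ 0x20D38C:      264    15 / 15           4020 @ 0x214E20
"""

ROW = re.compile(
    r"BLOC\[(\d+)\] @ (0x[0-9A-Fa-f]+):\s+(\d+)\s+(\d+) /\s*(\d+)\s+(\d+) @ (0x[0-9A-Fa-f]+)"
)

def parse_pools(text):
    """Turn the `info heap` rows into a list of dictionaries."""
    pools = []
    for m in ROW.finditer(text):
        idx, pool_addr, size, avail, cap, mem, addr = m.groups()
        pools.append({
            "idx": int(idx), "pool_addr": int(pool_addr, 16),
            "buf_size": int(size), "avail": int(avail), "capacity": int(cap),
            "mem_size": int(mem), "addr": int(addr, 16),
        })
    return pools

def fitting_pools(pools, length):
    """Indices of pools whose buffers are large enough for `length` bytes."""
    return [p["idx"] for p in pools if p["buf_size"] >= length]

pools = parse_pools(INFO_HEAP)
largest = max(p["buf_size"] for p in pools)
print(largest, fitting_pools(pools, 1040))  # 384 []
```

The largest buffer is 384 bytes, so no pool can take the 1040 bytes the board advertises.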

When I only connect one device, nothing happens. This is a bit weird, no idea why. When I connect an iPhone 7 and use the Linux host as a Bluetooth speaker for music playback, and then connect a Nexus 5, BlueZ generates an SDP response that is 681 bytes (packet 581 in the attached trace), and 3 packets later one can already see the Frankenstein sanitizer output:

[*] Firmware says: Heap Corruption Detected
[*] Firmware says: Prehook
[*] Firmware says: dynamic_memory_sanitizer_lr = 0x02dee5
[*] Firmware says: dynamic_memory_sanitizer_r0 = 0x21134c
[*] Firmware says: dynamic_memory_sanitizer_r1 = 0x211380
[*] Firmware says: dynamic_memory_sanitizer_r2 = 0x07
[*] Firmware says: dynamic_memory_sanitizer_r3 = 0x05c4
[*] Firmware says: pool = 0x20d368
[*] Firmware says: pool->block_start = 0x2135e0
[*] Firmware says: pool->capacity = 0x10
[*] Firmware says: pool->size = 0x0180
[*] Firmware says: *free_chunk = 0x0400090f
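The pool fields in this sanitizer output can be matched against the info heap table earlier in this comment. The sketch below converts the hex values and compares them with the BLOC[4] row; the parsing helper is hypothetical, and the meaning of the r0 to r3 registers is deliberately left out because the output does not state it.

```python
import re

SANITIZER_LOG = """\
[*] Firmware says: pool = 0x20d368
[*] Firmware says: pool->block_start = 0x2135e0
[*] Firmware says: pool->capacity = 0x10
[*] Firmware says: pool->size = 0x0180
"""

# BLOC[4] row from the `info heap` output earlier in this comment
BLOC4 = {"pool": 0x20D368, "block_start": 0x2135E0, "capacity": 16, "size": 384}

def parse_fields(log):
    """Extract `name = 0x...` pairs from the sanitizer lines as integers."""
    fields = {}
    for m in re.finditer(r"Firmware says: (\S+) = (0x[0-9a-fA-F]+)", log):
        name = m.group(1).replace("pool->", "")
        fields[name] = int(m.group(2), 16)
    return fields

fields = parse_fields(SANITIZER_LOG)
print(fields["size"], fields["capacity"], fields == BLOC4)  # 384 16 True
```

So the corrupted pool is BLOC[4], one of the two pools with 384-byte buffers.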

So, for whatever reason raw L2CAP pings are not working (any more?) to trigger this bug. BlueZ is working correctly given that the CYW20735 chip claims it has an ACL packet length of 1040 bytes upon reset. I'm aware that the SDP response is sent from the CYW20735 to the remote device, so I'm not sure if this example is exploitable over-the-air, but at least it's an example that triggers reliably with my setup. Also, it does not require changing the MTU manually etc. This is definitely a bug where the behavior of BlueZ and the behavior of the CYW20735 in combination determine whether the crash occurs.

Example capture where the ACL buffer is exceeded on a CYW20735 while connecting other devices over-the-air.

jiska2342 commented 2 years ago

After fixes by Cypress, the heap has two more entries, so that a 1040-byte ACL packet will no longer overflow:

[*]   [ Idx ] @Pool-Addr  Buf-Size  Avail/Capacity  Mem-Size @ Addr
[*]   -----------------------------------------------------------------
[*]   BLOC[0] @ 0x200498:       48    31 / 36           1872 @ 0x213800
[*]   BLOC[1] @ 0x2004BC:       96    20 / 20           2000 @ 0x213F50
[*]   BLOC[2] @ 0x2004E0:      268     9 / 10           2720 @ 0x214720
[*]   BLOC[3] @ 0x20D344:      384     4 /  4           1552 @ 0x2151C0
[*]   BLOC[4] @ 0x20D368:      384    16 / 16           6208 @ 0x2157D0
[*]   BLOC[5] @ 0x20D38C:      264    15 / 15           4020 @ 0x217010
[*]   BLOC[6] @ 0x20D3B0:      264    15 / 15           4020 @ 0x217FC4
[*]   BLOC[7] @ 0x213838:     1040     2 /  2           2088 @ 0x2237F0
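A quick sanity check of the patched table, with the rows copied by hand into tuples. It looks up the smallest pool that can hold 1040 bytes and derives the per-buffer overhead from Mem-Size and Capacity. The interpretation of that overhead as a header is an assumption; the table only gives the numbers.

```python
# (index, buffer size, capacity, memory size) copied from the table above
POOLS_AFTER_FIX = [
    (0, 48, 36, 1872), (1, 96, 20, 2000), (2, 268, 10, 2720), (3, 384, 4, 1552),
    (4, 384, 16, 6208), (5, 264, 15, 4020), (6, 264, 15, 4020), (7, 1040, 2, 2088),
]

def smallest_fitting_pool(pools, length):
    """Return the index of the smallest pool whose buffers hold `length` bytes."""
    fitting = [(size, idx) for idx, size, _cap, _mem in pools if size >= length]
    return min(fitting)[1] if fitting else None

def per_buffer_overhead(pools):
    """Bytes of memory per buffer beyond the buffer size itself, per pool."""
    return {idx: mem // cap - size for idx, size, cap, mem in pools}

print(smallest_fitting_pool(POOLS_AFTER_FIX, 1040))        # 7
print(set(per_buffer_overhead(POOLS_AFTER_FIX).values()))  # {4}
```

BLOC[7] now matches the advertised ACL length exactly, although with a capacity of 2 it only holds two such packets at a time.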