SuperHouse / esp-open-rtos

Open source FreeRTOS-based ESP8266 software framework
BSD 3-Clause "New" or "Revised" License

libnet80211.a & libwpa.a are of unknown origin #4

Open projectgus opened 9 years ago

projectgus commented 9 years ago

Creating this issue to track what we know about libnet80211.a & libwpa.a, which implement the (upper?) MAC layer and the WPA functionality.

The layer above is the IP stack in lwip, specifically the esp network interface code (in lwip/esp_interface.c). The layer below is radio-specific stuff in libpp.a.

The symbols in libnet80211.a match pretty closely to an older revision of FreeBSD's net80211 module, but it may be via a fork taken from FreeBSD somewhere: https://svnweb.freebsd.org/base/head/sys/net80211/?pathrev=234018

Some symbols in libwpa.a match closely to FreeBSD wpa, but it may come via somewhere else or have substantial Espressif additions (latter seems unlikely)? https://svnweb.freebsd.org/base/head/contrib/wpa/wpa_supplicant/?pathrev=252726

I'm hoping to analyse the source revisions from FreeBSD SVN to find the revisions which are "closest" to the contents of the binaries, in terms of symbols.

Another TODO is to analyse the exact connections out of libpp.a, to see where it interacts with libnet80211.a & libwpa.a.
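
One way to put a number on "closest in terms of symbols" is a set-overlap score between the symbol names in the binary and the symbol names a given source revision would produce. The sketch below is only an illustration of that idea, not something used in this thread: it computes the Jaccard index of two symbol lists, and it assumes each list has already been sorted and de-duplicated (for example from `nm` output). The function name is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Jaccard index |A & B| / |A | B| of two symbol-name lists.
 * Assumption: both arrays are sorted with strcmp() and contain no
 * duplicates, so a single merge pass is enough. */
double symbol_jaccard(const char *const *a, size_t na,
                      const char *const *b, size_t nb)
{
    size_t i = 0, j = 0, common = 0;

    assert(a != NULL || na == 0);
    assert(b != NULL || nb == 0);

    while (i < na && j < nb) {
        int c = strcmp(a[i], b[j]);
        if (c == 0) {           /* symbol present in both lists */
            common++;
            i++;
            j++;
        } else if (c < 0) {     /* only in a */
            i++;
        } else {                /* only in b */
            j++;
        }
    }

    size_t total = na + nb - common;   /* size of the union */
    return total ? (double)common / (double)total : 0.0;
}
```

Running this once per candidate revision and keeping the highest score gives a rough ranking. It says nothing about whether function signatures or struct layouts match, which is the harder part of the problem.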

pfalcon commented 9 years ago

wpa_supplicant upstream is here: http://w1.fi/wpa_supplicant/ . Note that some parts of the wpa_supplicant code appear even in the bootrom (well, those are more or less generic parts and may come from other origins).

pfalcon commented 9 years ago

I know that @israellot did some hacking/analysis on net80211 part, I invited him to stay in touch: https://github.com/israellot/esp-ginx/issues/2 , but so far didn't see him publish any details.

israellot commented 9 years ago

I've found this: https://github.com/linux-rockchip/linux-rockchip/tree/d08bfac2cc274fe8a0d9b9c6a640886fc5bb3857/drivers/net/wireless/esp8089 It's a driver for the esp8089 (pretty much the same chip as the esp8266). I haven't been able to dig much, but I hope this can help us better understand the 8266's internals. The libraries are heavily optimized, which makes disassembler analysis quite hard as well. I did start a quest trying to find the FreeBSD branch used, with no luck: replicating the objects and trying to call some functions would always result in a crash, meaning I've got the wrong signatures.

pfalcon commented 9 years ago

I've found this: https://github.com/linux-rockchip/linux-rockchip/tree/d08bfac2cc274fe8a0d9b9c6a640886fc5bb3857/drivers/net/wireless/esp8089

Well, the esp8089 Linux driver was brought up during the early days of esp8266.com. I don't think it gives us much: it consists of a) firmware as a binary C array (eagle_fw1.h/eagle_fw2.h) and b) the Linux side, which communicates with that firmware over SPI. So the esp8266 SDK already gives us much (we could develop such a firmware, one that talks to the host side over SPI, but we can develop many more types of firmware).

The libraries are heavily optimized, which makes disassembler analysis quite hard as well.

Which libraries do you mean? If you mean the SDK, Xtensa is actually a very nice, "classical" RISC architecture which doesn't have CISC-like crap such as flags (like ARM has), and at the same time doesn't have MIPS/SPARC-style delay slots, which are a PITA for a human. So ironically, good RISC assembly is actually more high-level than CISC stuff like x86.

I did start a quest trying to find the FreeBSD branch used, with no luck: replicating the objects and trying to call some functions would always result in a crash, meaning I've got the wrong signatures.

Well, we should expect that whatever base was used for the Espressif code, it was modified. So we can't just expect that we can check each FreeBSD revision in a row until one "clicks". It won't happen. The talk is about finding the version which matches the SDK code best, then trying to modify it to make it compatible with the rest of the code. Finding the "closest" revision isn't realistically doable by hand, but then automating it is not trivial either. (Which makes it an interesting, challenging task, which is unfortunately a diversion from working on an open esp8266 SDK ;-) ).

All in all, @israellot , please consider sharing notes on the steps you already tried. That will either allow someone else to continue them, or to come up with another approach, knowing that this one is not fruitful.

israellot commented 9 years ago

A quick review of what I've done. It's not much, but maybe it's a starting point. My goal was to discover the signature of this function reported in the symbols: ppTxPkt. It appears to me to be the central point for all packets sent over the air, regardless of AP association. It would be a key piece for making an open source 802.11 implementation.

I took this function as a starting point: ieee80211_send_probereq. On most FreeBSD branches it has the same signature (here's an example: https://github.com/freebsd/freebsd/blob/01e375543f2cca888435d33af45404f00296ca0c/sys/net80211/ieee80211_output.c ):

int
ieee80211_send_probereq(struct ieee80211_node *ni,
    const uint8_t sa[IEEE80211_ADDR_LEN],
    const uint8_t da[IEEE80211_ADDR_LEN],
    const uint8_t bssid[IEEE80211_ADDR_LEN],
    const uint8_t *ssid, size_t ssidlen)

The only unknown is the ieee80211_node argument. It turns out the ESP has a basic config struct at the address _irom0_text_start+0xc, it appears everywhere in the disassembled code. Looking at the function eagle_lwip_getif,

40213200 <eagle_lwip_getif>:
40213200:       f38341          l32r    a4, 4021000c <_irom0_text_start+0xc>
40213203:       62cc            bnez.n  a2, 4021320d <eagle_lwip_getif+0xd>
40213205:       4428            l32i.n  a2, a4, 16
40213207:       32dc            bnez.n  a2, 4021321e <eagle_lwip_getif+0x1e>
40213209:       020c            movi.n  a2, 0
4021320b:       f00d            ret.n
4021320d:       0b1266          bnei    a2, 1, 4021321c <eagle_lwip_getif+0x1c>
40213210:       5428            l32i.n  a2, a4, 20
40213212:       228c            beqz.n  a2, 40213218 <eagle_lwip_getif+0x18>
40213214:       0228            l32i.n  a2, a2, 0
40213216:       f00d            ret.n
40213218:       020c            movi.n  a2, 0
4021321a:       f00d            ret.n
4021321c:       f00d            ret.n
4021321e:       0228            l32i.n  a2, a2, 0
40213220:       f00d            ret.n

it became clear the ROM has two ieee80211_node structures, one for the soft AP, the other for the station client. One is at offset 16 from the main config struct, the other at offset 20. Guessing the signature was easy:

 struct ieee80211_node * eagle_lwip_getif(int id);

Passing 1 returns a pointer to the access point node, 0 the one for the station. Using this pointer as the first argument to the ieee80211_send_probereq function allows me to successfully inject a probe request packet.
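
For readers who prefer C to Xtensa assembly, the listing above can be transcribed almost line by line. This is only a host-side reading of those instructions, not recovered source: all type and variable names are hypothetical, and the two slots sit at byte offsets 16 and 20 only on the 32-bit target. Note that each branch ends in `l32i.n a2, a2, 0`, so what comes back is the first word of the object found in the slot, not the slot value itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this comes from Espressif headers. */
struct iface_ctx {
    void *first_word;             /* the value the function actually returns */
};

struct global_cfg {
    uint32_t unknown[4];          /* bytes 0..15 on the target, purpose unknown */
    struct iface_ctx *iface0;     /* byte offset 16 on the 32-bit target */
    struct iface_ctx *iface1;     /* byte offset 20 on the 32-bit target */
};

/* Stands in for the pointer that l32r fetches from the literal pool. */
struct global_cfg *g_cfg;

void *getif_sketch(int id)
{
    struct global_cfg *cfg = g_cfg;

    assert(cfg != NULL);          /* host-side guard, not present in the listing */

    if (id == 0)                  /* l32i.n a2, a4, 16 */
        return cfg->iface0 ? cfg->iface0->first_word : NULL;
    if (id == 1)                  /* l32i.n a2, a4, 20 */
        return cfg->iface1 ? cfg->iface1->first_word : NULL;

    /* Any other id falls through to a bare ret.n with a2 untouched,
     * so the caller gets the id back as if it were a pointer. */
    return (void *)(intptr_t)id;
}
```

The extra dereference suggests the slots hold per-interface objects whose first member is the pointer handed back to callers.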

A second step would be writing a function in C that replicates ieee80211_send_probereq: basically creating a management frame using ieee80211_getmgtframe, configuring it, and finally outputting the frame via ieee80211_mgmt_output, which internally calls ieee80211_raw_output. In my opinion that last function is replaced by ppTxPkt on the ESP, so maybe the signature is similar. But I didn't go very far down that road. Help would be appreciated.
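
Independently of how the buffer is eventually handed to the radio, the bytes of a probe request are fixed by 802.11. The following is a rough sketch of a frame builder that takes the same address and SSID arguments as ieee80211_send_probereq and fills a plain byte buffer. It does not call any Espressif or FreeBSD function; the function name and the choice of advertised rates (the four 802.11b rates) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 6

/* Builds an 802.11 probe request in buf and returns its length,
 * or 0 if the SSID is too long or the buffer too small. */
size_t build_probe_req(uint8_t *buf, size_t cap,
                       const uint8_t sa[ADDR_LEN],
                       const uint8_t da[ADDR_LEN],
                       const uint8_t bssid[ADDR_LEN],
                       const uint8_t *ssid, size_t ssidlen)
{
    /* Illustrative rate set: 1, 2, 5.5 and 11 Mbit/s. */
    static const uint8_t rates[] = { 0x82, 0x84, 0x8b, 0x96 };
    size_t need = 24 + 2 + ssidlen + 2 + sizeof rates;
    uint8_t *p = buf;

    assert(buf != NULL);
    if (ssidlen > 32 || need > cap)
        return 0;

    *p++ = 0x40; *p++ = 0x00;            /* frame control: management, probe request */
    *p++ = 0x00; *p++ = 0x00;            /* duration */
    memcpy(p, da, ADDR_LEN);    p += ADDR_LEN;   /* address 1: destination */
    memcpy(p, sa, ADDR_LEN);    p += ADDR_LEN;   /* address 2: source */
    memcpy(p, bssid, ADDR_LEN); p += ADDR_LEN;   /* address 3: BSSID */
    *p++ = 0x00; *p++ = 0x00;            /* sequence control, usually set by the driver */

    *p++ = 0x00; *p++ = (uint8_t)ssidlen;        /* element 0: SSID (length 0 = wildcard) */
    if (ssidlen) {
        memcpy(p, ssid, ssidlen);
        p += ssidlen;
    }

    *p++ = 0x01; *p++ = (uint8_t)sizeof rates;   /* element 1: supported rates */
    memcpy(p, rates, sizeof rates);
    p += sizeof rates;

    return (size_t)(p - buf);
}
```

A buffer built this way is what a replacement for ieee80211_send_probereq would ultimately need to pass down, whatever wrapper structure the lower layer expects around it.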

israellot commented 9 years ago

In my opinion, the esp8266's future is unknown and insecure, to the point that there's not much sense in putting effort into reverse engineering so much of its internals. They might just dump the 8266 and put an 8277 in its place next month. A serious SoC would start with a consistent and possibly open source SDK. We like the ESP because it's cheap, period. But it comes with a price, not in dollars, that is high. I think putting pressure on Espressif is a better road, because we know they can do better, and so do they. Better code, better support, better licenses, etc.

projectgus commented 9 years ago

Hi @israellot,

Thanks for posting your research, this is very interesting. Do you have any feeling on whether the fields in the ieee80211_node structures are the same as in the upstream net80211? In lwIP the netif structure is a little bit modified for ESP.

You make good points about the future of ESP. This is one of the reasons for choosing "open source above the MAC layer" as an initial goal - aiming for something achievable and useful in a reasonable timeframe, without too much spelunking into the lower layers.

One hopeful thing is that whenever an "8277" eventually appears, it will probably have similar peripherals to 8266 so reverse engineering effort will probably not become obsolete overnight. Probably, anyhow!

Cheers again,

Angus

pfalcon commented 9 years ago

+1 for sharing that info.

Cannot help but to stick a shameless plug for https://github.com/pfalcon/ScratchABit

It turns out the ESP has a basic config struct at the address _irom0_text_start+0xc, it appears everywhere in the disassembled code.

40213200 <eagle_lwip_getif>:
40213200:       f38341          l32r    a4, 4021000c <_irom0_text_start+0xc>

So, the config struct is not at _irom0_text_start+0xc, its address is there. The best way is to treat the "l32r" instruction in r/o code as an alias for "movi". ScratchABit makes it all obvious:

4021c024        eagle_lwip_getif:
4021c024 4134de     movi*    a4, 0x3ffeac70 ; via 0x402138f4
4021c027 cc62       bnez.n   a2, loc_4021c031
4021c029 2844       l32i.n   a2, a4, 0x10
4021c02b dc32       bnez.n   a2, loc_4021c042
4021c02d 0c02       movi.n   a2, 0x0
4021c02f 0df0       ret.n

(Disclaimer: with a patched ida-xtensa plugin, available on request, or later when time permits me to push my changes.)
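
The "via" address in that listing follows from how L32R is encoded: the instruction carries a 16-bit word offset that is extended with ones (so it always points backwards), scaled by four, and added to the instruction address rounded up to a word boundary. A small sketch of that computation is below, assuming the core is not using the optional literal-base register. The function name is made up, and the three bytes are expected in memory order, which is how ScratchABit prints them (the objdump listing earlier in the thread appears to print the same bytes as one 24-bit little-endian value).

```c
#include <assert.h>
#include <stdint.h>

/* Address of the literal-pool word read by an Xtensa L32R instruction.
 * pc:   address of the instruction
 * insn: its three bytes in memory order */
uint32_t l32r_literal_addr(uint32_t pc, const uint8_t insn[3])
{
    /* The low nibble of the first byte is the opcode; 1 means L32R.
     * The high nibble is the destination register number. */
    assert((insn[0] & 0x0f) == 0x01);

    uint32_t imm16  = (uint32_t)insn[1] | ((uint32_t)insn[2] << 8);
    uint32_t offset = (0xFFFF0000u | imm16) << 2;   /* ones-extended, in words */

    return ((pc + 3u) & ~(uint32_t)3) + offset;     /* wraps modulo 2^32 */
}
```

With pc = 0x4021c024 and bytes 41 34 de this gives 0x402138f4, and with pc = 0x40213200 and bytes 41 83 f3 it gives 0x4021000c, matching both listings. The config struct pointer is the 32-bit word stored at that address.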

pfalcon commented 9 years ago

But it comes with a price, not in dollars, that is high.

That's certainly true.

I think putting pressure on Espressif is a better road, because we know they can do better, and so do they. Better code, better support, better licenses, etc.

I don't know what kind of pressure you have in mind. I tried to "apply pressure" to resolve the GPL-which-should-be-there situation. It was a pretty obvious problem, and it took quite an effort (for an individual working on it in his free time as a hobby). But then it was successful, and in the end didn't take that much effort and time. That means they do listen and even care a bit. I imagine it would be much worse with a typical western corporation (but then a western corpo would be unlikely to do it "wrong" in the first place). But that doesn't mean that either a western or a Chinese corpo would just go and open up at somebody's wish. Put yourself in their place: what would you do?

pfalcon commented 9 years ago

One hopeful thing is that whenever an "8277" eventually appears, it will probably have similar peripherals to 8266 so reverse engineering effort will probably not become obsolete overnight. Although it might!

Well, there are rumors that Espressif might open up a bit more, eventually. One good reason for that would be a new, sufficiently different chip coming out, so that opening up old code would no longer be much of an "IP loss" ;-).

israellot commented 9 years ago

You are right @pfalcon. It's a pointer, l32r is load relative, I just messed up my writing. But I actually tried loading the pointers by hand and the result is the same.

israellot commented 9 years ago

Taking a closer look, eagle_lwip_getif should in fact return a struct netif pointer. The fact that I can use this pointer in a MAC function might indicate they use the same network description structure in lwip and in the MAC layer. In other words, ieee80211_node and lwip's netif could be the same, or one could be a subset of the other. Which adds to my guess that the MAC code is far more custom made than the lwip code.

pfalcon commented 9 years ago

@projectgus, @israellot : I've now cleaned up my ida-xtensa changes described above a bit and pushed them to https://github.com/pfalcon/ida-xtensa ("pfalcon" branch). Once again, it's supposed to work with ScratchABit.

israellot commented 9 years ago

Great @pfalcon ! Let me take a better look and try it here. Thanks for your help providing this tool!

cnlohr commented 9 years ago

I don't know how useful this would be to you, considering it still relies on libpp.a and ppTxPkt, but I have jimmied the existing stack into sending arbitrary 802.11 packets of whatever kind I want.

https://github.com/cnlohr/esp8266rawpackets/blob/master/user/esp_rawsend.c

israellot commented 9 years ago

That's great @cnlohr ! Can you share your method as well? How did you proceed?

cnlohr commented 9 years ago

I was puzzled by your question, but I think you are asking "how did [you] figure it out?"

I spent about 40 minutes trying to reverse engineer the ppTxPkt function using ScratchABit. I tried calling it with many different parameters, but it kept failing and rebooting the chip. Eventually I started to look for related functions, and figured "maybe ppRegisterTxCallback would pass a similar buffer", and in fact it passed EXACTLY the parameter that goes to ppTxPkt! So I tried calling ppTxPkt with the same parameter I got from the TxCallback, and using wireshark in monitor mode, I was able to receive many, many SSID broadcasts. Once I saw that, I knew I had a winner, so I started printing all the bytes that were being passed in. When I saw pointers (4th byte = 3f) I followed them. When I saw what looked like an IEEE 802.11 packet, I tried modifying it, and that worked. Then I had to figure out how to change the size. So I kept watching packets intently, looking for what changed when the size changed. I found one value, but changing that alone didn't seem to work, so I found another one too. By changing both values, I could change the Tx packet length.

All in all, it took maybe 4 hours to hack?
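
The "4th byte = 3f" test can be mechanised when dumping an unknown buffer. On the little-endian ESP8266 the fourth byte of a 32-bit word is its most significant byte, and data RAM addresses start with 0x3f, so any aligned word whose top byte is 0x3f is a candidate pointer worth following. The helper below is a hypothetical illustration of that heuristic and is not taken from esp_rawsend.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Heuristic from the description above: top byte 0x3f suggests a RAM pointer. */
static int looks_like_ram_ptr(uint32_t word)
{
    return (word >> 24) == 0x3f;
}

/* Scans buf in aligned 4-byte steps, reading each word as little-endian,
 * and records the byte offsets of candidate pointers.
 * Returns the number of offsets written (at most max). */
size_t find_pointer_words(const uint8_t *buf, size_t len,
                          size_t *offsets, size_t max)
{
    size_t found = 0;

    assert(buf != NULL && offsets != NULL);

    for (size_t off = 0; off + 4 <= len && found < max; off += 4) {
        uint32_t word = (uint32_t)buf[off]
                      | ((uint32_t)buf[off + 1] << 8)
                      | ((uint32_t)buf[off + 2] << 16)
                      | ((uint32_t)buf[off + 3] << 24);
        if (looks_like_ram_ptr(word))
            offsets[found++] = off;
    }
    return found;
}
```

False positives are possible, since ordinary data can also have 0x3f in that position, so each hit still has to be checked by looking at what it points to.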

I am going to attempt to hack it to be able to use promiscuous mode while running as an AP, and I sort of have that now, but I can't seem to hook the original RX processor, and thus it doesn't seem to work right atm. :(

cnlohr commented 9 years ago

If possible, please move this discussion back to the ESP8266 boards

http://www.esp8266.com/viewtopic.php?f=6&t=3481