airbus-seclab / cpu_rec

Recognize cpu instructions in an arbitrary binary file
Apache License 2.0

Add Renesas H8S to corpus #4

Closed trou closed 5 years ago

trou commented 5 years ago

Dell distributes microcode files for the SH7757 BMC, which include an H8S-2117A.

See attached files: h8s.zip

LRGH commented 5 years ago

I don't see any attached file.

trou commented 5 years ago

Indeed... weird. I'll send it over email next week if GitHub doesn't work.

trou commented 5 years ago

Updated with the file.

LRGH commented 5 years ago

It is not obvious where the instructions are in these files. For example, in bridge7757.mot.bin the non-zero zone starting at offset 0x100 is not recognised as h8s300 or h8s300a by IDA, and neither is the one starting at 0x400. There is also no obvious data section (no character strings, for example). Do you have more information on the content of these files (e.g. the entry point)?
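For reference, the non-zero zones mentioned above can be listed with a few lines of Python. This is only a minimal sketch: the function name `nonzero_zones` and the `min_gap` threshold are hypothetical choices made for illustration, not part of cpu_rec or IDA.

```python
def nonzero_zones(data: bytes, min_gap: int = 16):
    """Return (start, end) offsets of regions that contain non-zero bytes.

    `end` is exclusive. Runs of fewer than `min_gap` zero bytes do not
    split a region (min_gap is an arbitrary, illustrative threshold).
    """
    zones = []
    start = None     # start offset of the current region, or None
    last_nz = None   # offset of the last non-zero byte seen so far
    for off, byte in enumerate(data):
        if byte == 0:
            continue
        if start is None:
            # First non-zero byte of the file: open a region.
            start = off
        elif off - last_nz > min_gap:
            # The zero gap is long enough: close the region, open a new one.
            zones.append((start, last_nz + 1))
            start = off
        last_nz = off
    if start is not None:
        zones.append((start, last_nz + 1))
    return zones
```

Calling it on the contents of a file (for example `nonzero_zones(open(path, "rb").read())`, with `path` pointing at the binary of interest) gives a list of candidate regions to try in a disassembler. It says nothing about whether a region holds code or data.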

LRGH commented 5 years ago

The output of the h8sxa decoder from IDA is coherent with an entry point at 0x100. The H8S-2117A looks like a 32-bit H8SX CPU and not a 16/32-bit H8S. But https://www2.renesas.cn/cn/en/products/software-tools/tools/compiler-assembler/compiler-package-for-h8sx-h8s-h8-family.html confirms that the 2117 parts are H8S and not H8SX.

trou commented 5 years ago

In IDA 7.2, I can disassemble bridge7757.mot.bin with the processor set to h8368 (H8/3687), which gives the following disassembly at 0x100:

ROM:0100 ; =============== S U B R O U T I N E =======================================
ROM:0100
ROM:0100
ROM:0100 sub_100:                                ; CODE XREF: ROM:015C↓p
ROM:0100                                         ; ROM:02FA↓p
ROM:0100                 mov.l   #0xFFC000, sp   ; Move data
ROM:0106
ROM:0106 loc_106:                                ; CODE XREF: sub_100+2E↓j
ROM:0106                 mov.l   #0xFFC000, sp   ; Move data
ROM:010C                 jsr     sub_16632:24    ; Jump to subroutine
ROM:0110                 jsr     sub_16652:24    ; Jump to subroutine
ROM:0114                 jsr     sub_59E2:24     ; Jump to subroutine
ROM:0118                 mov.w   r0, r0          ; Move data
ROM:011A                 bne     loc_120:8       ; Branch if not equal
ROM:011C                 jsr     sub_1664E:24    ; Jump to subroutine
ROM:0120
ROM:0120 loc_120:                                ; CODE XREF: sub_100+1A↑j
ROM:0120                 bclr    #0, @byte_C401:16 ; Bit clear
ROM:0126                 mov.w   #0xA501, r0     ; Move data
ROM:012A                 mov.w   r0, @word_D000:16 ; Move data
ROM:012E                 bra     loc_106:8       ; Branch always
ROM:012E ; End of function sub_100

A quick look gives:

LRGH commented 5 years ago

OK. I get almost the same result when disassembling at 0x100 with h8sxa in IDA 7.0.

LRGH commented 5 years ago

The range 0x210c -> 0x1671e from this file has been added to the corpus. The other file is detected as H8S, which is not surprising because both files come from the same source.
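Carving such a range out of a binary can be sketched as follows. This is an illustration only: the helper names `extract_range` and `carve_file` are hypothetical, and treating the end offset as exclusive is an assumption, since the thread does not say whether 0x1671e is included.

```python
from pathlib import Path


def extract_range(data: bytes, start: int, end: int) -> bytes:
    """Return data[start:end], refusing ranges that fall outside the data.

    `end` is treated as exclusive (an assumption, not stated in the thread).
    """
    if not 0 <= start < end <= len(data):
        raise ValueError(
            f"range {start:#x}..{end:#x} does not fit in {len(data):#x} bytes"
        )
    return data[start:end]


def carve_file(src: str, dst: str, start: int, end: int) -> int:
    """Copy bytes [start, end) of file `src` into file `dst`.

    Returns the number of bytes written.
    """
    chunk = extract_range(Path(src).read_bytes(), start, end)
    return Path(dst).write_bytes(chunk)
```

For example, `carve_file("input.bin", "h8s_sample.bin", 0x210c, 0x1671e)` would write the carved bytes to a new file; both file names here are placeholders.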

It would have been better to confirm this addition to the corpus with an independent source of H8S binaries, but I don't have one. It would also be nice to know if other variants (e.g. H8SX) are recognised as H8S or are different.

I made some quick tests and the addition of H8S does not seem to perturb the detection of other architectures. Please tell me if some other binaries are wrongly detected as H8S.

trou commented 5 years ago

https://gcc-renesas.com/ seems to support many Renesas controllers.