marin-m / vmlinux-to-elf

A tool to recover a fully analyzable .ELF from a raw kernel, through extracting the kernel symbol table (kallsyms)
GNU General Public License v3.0
1.3k stars, 127 forks

Support uImage format and/or manual arch specification #3

Open skochinsky opened 4 years ago

skochinsky commented 4 years ago

First, congrats on the awesome tool.

I decided to try it out and went to the OpenWRT release archive. The first one alphabetically was ARC and it failed:

  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\architecture_detecter.py", line 157, in guess_architecture
    raise ValueError('The architecture could not be guessed successfully')
ValueError: The architecture could not be guessed successfully

In fact, the uImage header already includes the architecture, load address and even entrypoint:

openwrt-18.06.4-arc770-generic-uImage: u-boot legacy uImage, ARC OpenWrt Linux-4.9.184, Linux/DesignWare ARC, OS Kernel Image (Not compressed), 4522192 bytes, Thu Jun 27 12:18:52 2019, Load Address: 0x80000000, Entry Point: 0x8000A000, Header CRC: 0xA11EF4A4, Data CRC: 0xAC4BE39B
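To illustrate the point, here is a minimal sketch of reading those fields straight from the 64-byte legacy uImage header. It is not code from vmlinux-to-elf: the names `UImageHeader` and `parse_uimage_header` are hypothetical, the field layout is assumed to match U-Boot's legacy image header (big-endian, magic 0x27051956), and the small name tables cover only a few of the codes U-Boot defines.

```python
import struct
import zlib
from typing import NamedTuple

UIMAGE_MAGIC = 0x27051956
# magic, header CRC, timestamp, data size, load address, entry point,
# data CRC, then four single-byte codes (OS, arch, type, compression)
# and a 32-byte NUL-padded image name. All fields are big-endian.
HEADER_FORMAT = ">7I4B32s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 64 bytes

# Partial tables, assumed from U-Boot's image.h; extend as needed.
ARCH_NAMES = {2: "ARM", 3: "x86", 5: "MIPS", 7: "PowerPC", 22: "AArch64", 23: "ARC"}
COMP_NAMES = {0: "none", 1: "gzip", 2: "bzip2", 3: "lzma"}


class UImageHeader(NamedTuple):
    header_crc: int
    timestamp: int
    data_size: int
    load_address: int
    entry_point: int
    data_crc: int
    os: int
    arch: int
    image_type: int
    compression: int
    name: str


def parse_uimage_header(blob: bytes) -> UImageHeader:
    """Parse and validate a legacy uImage header at the start of blob."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("input is shorter than a uImage header")
    fields = struct.unpack(HEADER_FORMAT, blob[:HEADER_SIZE])
    magic, header_crc = fields[0], fields[1]
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage (bad magic)")
    # The header CRC is computed with the CRC field itself set to zero.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != header_crc:
        raise ValueError("uImage header CRC mismatch")
    name = fields[11].split(b"\x00", 1)[0].decode("ascii", "replace")
    return UImageHeader(header_crc, *fields[2:11], name)
```

With such a header parsed, `ARCH_NAMES.get(header.arch)` and `header.load_address` could serve as hints before falling back to heuristics, and the kernel payload starts at offset `HEADER_SIZE`.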

Additionally, there is no need to know the architecture when no ELF file is written out (e.g. when just dumping symbols), so this step could be deferred until it is actually required. You could also let the user specify it manually, or just write 0 to e_machine.
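A rough sketch of that fallback order, assuming a hypothetical helper `pick_e_machine` (not part of vmlinux-to-elf): prefer a value supplied by the user, then a uImage architecture hint, and otherwise write EM_NONE (0). The uImage codes are assumed from U-Boot's image.h and the ELF values from the standard elf.h constants. The ARC entry assumes an ARCompact core such as the ARC770 above; ARCv2 cores use a different e_machine value.

```python
from typing import Optional

EM_NONE = 0  # ELF "no machine" value, usable when the architecture is unknown

# uImage arch code -> ELF e_machine (partial, illustrative table)
UIMAGE_ARCH_TO_E_MACHINE = {
    2: 40,    # IH_ARCH_ARM   -> EM_ARM
    3: 3,     # IH_ARCH_I386  -> EM_386
    5: 8,     # IH_ARCH_MIPS  -> EM_MIPS
    7: 20,    # IH_ARCH_PPC   -> EM_PPC
    22: 183,  # IH_ARCH_ARM64 -> EM_AARCH64
    23: 93,   # IH_ARCH_ARC   -> EM_ARC_COMPACT (assumes ARCompact, not ARCv2)
}


def pick_e_machine(user_value: Optional[int] = None,
                   uimage_arch: Optional[int] = None) -> int:
    """Choose an e_machine value without running any heuristic detection."""
    if user_value is not None:
        # An explicit user choice always wins.
        return user_value
    # Unknown or missing hints fall back to EM_NONE instead of raising.
    return UIMAGE_ARCH_TO_E_MACHINE.get(uimage_arch, EM_NONE)
```

Because the function never raises, a symbol dump could go ahead even when nothing is known about the architecture.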

Note: the uImage format may employ its own compression (I have seen at least gzip used).
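A possible way to handle this, sketched with the Python standard library only: dispatch on the compression code stored in the uImage header. The function name `decompress_payload` is hypothetical, the codes are assumed from U-Boot's image.h (0 none, 1 gzip, 2 bzip2, 3 lzma), and real-world payloads, in particular LZMA streams from vendor firmware, may need more lenient handling than shown here.

```python
import bz2
import gzip
import lzma

# uImage compression code -> decompression function (partial table)
DECOMPRESSORS = {
    0: lambda data: data,   # IH_COMP_NONE: payload is stored as-is
    1: gzip.decompress,     # IH_COMP_GZIP
    2: bz2.decompress,      # IH_COMP_BZIP2
    3: lzma.decompress,     # IH_COMP_LZMA (auto-detects .xz and legacy .lzma)
}


def decompress_payload(compression: int, payload: bytes) -> bytes:
    """Return the raw kernel bytes for a uImage payload."""
    try:
        decompress = DECOMPRESSORS[compression]
    except KeyError:
        raise ValueError(f"unsupported uImage compression code {compression}")
    return decompress(payload)
```

The payload here is everything after the 64-byte header, truncated to the data size recorded in that header.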

marin-m commented 4 years ago

Hello,

Kudos for your work on IDA too.

I can see multiple things that I could improve from your post:

In the end, the best option may be to add generic flags for information that the tool cannot be 100% sure of inferring exactly (--kernel-offset, --base-address, --e-machine, --bit-size), even though the detection works well with my corpus of kernels.

I should get back to this soon. Other ideas are welcome.

Regards,

marin-m commented 4 years ago

Hello,

For your information, your kernel now reconstructs well without extra arguments. I have also added support for the extra arguments mentioned in my previous message. These are documented in the README.md.

Regards,

skochinsky commented 4 years ago

Thanks!

FYI, I found an example of a compressed uImage which seems not to be handled out of the box: openwrt-18.06.4-lantiq-falcon-lantiq_easy98000-nand-squashfs-sysupgrade.bin. Also: openwrt-18.06.4-ramips-rt305x-3g-6200n-initramfs-kernel.bin

However, no symbols were found even after manual decompression :( Making an ELF with just a code section may be useful, although without .bss the analysis will not be too great...