marin-m / vmlinux-to-elf

A tool to recover a fully analyzable .ELF from a raw kernel, through extracting the kernel symbol table (kallsyms)
GNU General Public License v3.0

Corrupt symbolization of Google Pixel 3 XL kernel factory image #44

Open U-Scripter opened 1 year ago

U-Scripter commented 1 year ago

First, I would like to thank you all for this amazing work.

Second, I would like to point out what I believe is a bug. I ran vmlinux-to-elf on a boot.img extracted from the factory firmware downloaded from Google's image archive, and the symbols in the generated ELF file point at incorrect locations (although the file does open as a valid ELF in RE tools).

The exact image I used was build number SP1A.210812.016.C1 for the Google Pixel 3 XL. You can download it from here (search for 67ea87fc on that page).

Reproduction

$ ./vmlinux-to-elf ./path/to/crosshatch-sp1a.210812.016.c1/boot.img ./path/to/symbolized/kernel.elf
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 4.9.270-g862f51bac900-ab7613625 (android-build@abfarm-east4-101) (Android (7284624, based on r416183b) clang version 12.0.5 (https://android.googlesource.com/toolchain/llvm-project c935d99d7cf2016289302412d708641d52d2f7ee)) #0 SMP PREEMPT Thu Aug 5 07:04:42 UTC 2021
[+] Guessed architecture: aarch64 successfully in 3.56 seconds
[+] Found relocations table at file offset 0x24ffc78 (count=245151)
[+] Found kernel text candidate: 0xffffff8008000000
WARNING! bad rela offset ffffff800af3e1b8
[+] Found kallsyms_token_table at file offset 0x01cb9c00
[+] Found kallsyms_token_index at file offset 0x01cba000
[+] Found kallsyms_markers at file offset 0x01cb8500
[+] Found kallsyms_names at file offset 0x01a37f00
[+] Found kallsyms_num_syms at file offset 0x01a37e00
[i] Negative offsets overall: 0 %
[i] Null addresses overall: 0 %
[+] Found kallsyms_offsets at file offset 0x01983ffc
[+] Successfully wrote the new ELF kernel to ./path/to/symbolized/kernel.elf
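
To put the warning in context, the address it reports can be compared with the other values in the log. The snippet below is only a rough sanity check, not part of vmlinux-to-elf: it assumes that file offset 0 of the decompressed kernel maps to the kernel text candidate, and that each relocation entry is a standard 24-byte Elf64_Rela record.

```python
# Values copied from the log above.
KERNEL_TEXT_BASE = 0xFFFFFF8008000000   # "Found kernel text candidate"
BAD_RELA_OFFSET = 0xFFFFFF800AF3E1B8    # "WARNING! bad rela offset"
RELA_TABLE_FILE_OFFSET = 0x24FFC78      # "Found relocations table at file offset"
RELA_COUNT = 245151                     # "(count=245151)"

# Assumption: standard Elf64_Rela layout (r_offset, r_info, r_addend; 8 bytes each).
ELF64_RELA_SIZE = 24

# Assumption: file offset 0 corresponds to KERNEL_TEXT_BASE.
image_relative = BAD_RELA_OFFSET - KERNEL_TEXT_BASE
rela_table_end = RELA_TABLE_FILE_OFFSET + RELA_COUNT * ELF64_RELA_SIZE

print(hex(image_relative))              # 0x2f3e1b8
print(hex(rela_table_end))              # 0x2a9c360
print(image_relative > rela_table_end)  # True
```

Under those assumptions, the rejected relocation targets a location past the end of the relocation table itself. Whether it also lies past the end of the decompressed image cannot be determined from the log alone.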

Actual Result

The binary has the symbols at invalid locations, which is clearly visible in Ghidra's decompilation of most functions. For example, this is the binder_ioctl function: [three screenshots]

The assembly code bears no resemblance to the actual binder_ioctl code, and there are no calls to any other binder functions.

Expected Result

The binary should have the symbols at valid locations. This is an example of a valid binder_ioctl decompilation, taken from a valid ELF generated using your tool (for another kernel): [screenshot]

Attempted Workarounds/Solutions

I tried:

I would also like to note that using the kallsyms-finder script did extract the symbols' addresses correctly on that same boot.img.

U-Scripter commented 1 year ago

This is the extracted kernel that results in the corrupt ELF file: kernel.zip

I zipped it because GitHub does not allow .lz4 extensions.

philmb3487 commented 1 year ago

Unfortunately, this also happens with the boot image of the Lenovo Tab M10 Gen 2 (codename X606F).

worstperson commented 1 year ago

When you get "WARNING! bad rela offset", apply_elf64_rela() exits early. It (or something else) normally inserts relative_base_address between kallsyms_offsets and kallsyms_num_syms__offset, but that has not happened in this case, which causes find_kallsyms_addresses_or_symbols() to find an incorrect address for kallsyms_offsets.
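
To illustrate the difference between exiting early and skipping a single bad entry, here is a minimal sketch of applying AArch64 relative relocations to an in-memory image. It is not the project's apply_elf64_rela(): the function name apply_relative_relocs and its parameters are hypothetical, and it assumes the image is treated as loaded at its link address, so each patched slot simply receives the addend.

```python
import struct

R_AARCH64_RELATIVE = 0x403  # relocation type number from the AArch64 ELF ABI


def apply_relative_relocs(image: bytearray, rela_table: bytes, base: int) -> int:
    """Hypothetical helper: patch R_AARCH64_RELATIVE entries into `image`.

    Out-of-range entries are skipped and counted instead of aborting the
    whole pass. Returns the number of skipped entries.
    """
    skipped = 0
    # Each Elf64_Rela entry is r_offset (u64), r_info (u64), r_addend (s64).
    for r_offset, r_info, r_addend in struct.iter_unpack("<QQq", rela_table):
        if r_info & 0xFFFFFFFF != R_AARCH64_RELATIVE:
            continue  # other relocation types are ignored in this sketch
        file_offset = r_offset - base
        if not 0 <= file_offset <= len(image) - 8:
            skipped += 1  # bad entry: note it and keep going
            continue
        # Assumption: no load bias, so the stored value is the addend itself.
        struct.pack_into("<Q", image, file_offset, r_addend & 0xFFFFFFFFFFFFFFFF)
    return skipped


# Toy example: one out-of-range entry followed by one valid entry.
img = bytearray(16)
table = struct.pack("<QQq", 0x2000, R_AARCH64_RELATIVE, 0x5678)
table += struct.pack("<QQq", 0x1008, R_AARCH64_RELATIVE, 0x1234)
skipped_count = apply_relative_relocs(img, table, base=0x1000)
print(skipped_count, hex(struct.unpack_from("<Q", img, 8)[0]))
```

In the toy example, the valid entry is still applied even though the entry before it is out of range. That is the behavior an early exit would prevent.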

A temporary workaround could be to comment out this line: https://github.com/marin-m/vmlinux-to-elf/blob/master/vmlinux_to_elf/kallsyms_finder.py#L922C17-L922C46