ida_kernelcache is an IDAPython module for IDA Pro to make working with iOS kernelcaches easier. The module provides functions to:
__PRELINK_INFO
segment into a Python dictionary.__got
sections and stub functions in __stubs
sections.The main processing function is designed to be run before any manual analysis or reverse engineering. With the default settings, IDA tends to miss a lot of useful information in the kernelcache. These scripts help IDA along by leveraging the known structure of the kernelcache to automatically propagate useful information.
In addition to the stock functionality in the module, ida_kernelcache contains several scripts to make analyzing the iOS kernelcache easier. For example, you can use the scripts to autogenerate C structs used by a function.
Many of the techniques used in ida_kernelcache were developed for and borrowed directly from memctl.
ida_kernelcache has been tested with IDA Pro 6.95 on kernelcaches for iOS versions 10.1.1, 11.0, 11.2, 11.3.1, and 12.0 beta. Currently only Arm64 kernelcaches from iOS 10 and later are supported.
You need to already have a decompressed kernelcache file loaded into IDA. You can find the URL to download a particular IPSW from Apple online, and there are a number of public tools (including memctl) capable of decompressing the kernelcache.
In IDA, select "File" -> "Script file..." from the menu bar, then choose the ida_kernelcache.py
script in the main directory. This will load the ida_kernelcache module into the IDAPython
interpreter under the names ida_kernelcache
and kc
. In the IDAPython prompt, type
kc.kernelcache_process()
and hit Enter to start analyzing the kernelcache. This function performs
all the major analyses supported by ida_kernelcache. The function will run for several minutes as
IDA identifies and analyzes new functions.
ida_kernelcache will try not to overwrite user names for addresses. This means that if the
kernelcache has been manually analyzed prior to initialization with kernelcache_process
, the
results may not be as thorough because user-specified names may block automatic name propagation.
However, there's also no guarantee that ida_kernelcache won't mess up prior analysis, so if you do
decide to run kernelcache_process
on a kernelcache file which you've already analyzed, make a
backup first.
ida_kernelcache is meant to be loaded via ida_kernelcache.py
; the submodules in the
ida_kernelcache
directory are not meant to be loaded directly. However, ida_kernelcache exposes
the functionality of many of these submodules. Here is what each of them does:
ida_utilities:
This module wraps some of IDA's functions to provide an easier-to-use API. Particularly useful are
is_mapped
, read_word
, read_struct
, force_function
, and ReadWords
. is_mapped
checks
whether an address is mapped, and optionally whether it contains a known value. read_word
reads a
variably-sized word from an address. read_struct
reads a structure type into a Python dictionary
or Python accessor object, which makes parsing data structures much easier. force_function
tries
several tricks to convert an address into the start of a function in IDA. ReadWords
is a
generator to iterate over data words and their addresses in a range.
build_struct: This internal module contains utilities to automatically populate an IDA struct based on a sequence of accesses to the struct.
class_struct:
This module provides functions to generate IDA structs representing C++ virtual method tables and
classes. initialize_vtable_structs
scans the (symbolicated) virtual method tables and creates IDA
structs to hold virtual method pointers. initialize_class_structs
performs a data flow analysis
on the virtual methods to identify accesses to the fields of each class, then builds IDA structs to
represent the classes. Instructions that appear to reference a field are also converted into
structure offset references. See the module docstring for more details.
classes:
This module defines the ClassInfo
type that holds information about C++ classes in the
kernelcache and provides the function collect_class_info
to scan the kernelcache for classes and
populate the global class_info
dictionary with a map from class names to ClassInfo
objects. The
ClassInfo
type records the class name, the OSMetaClass instance, the virtual method table, and
the superclass name for each C++ class. Additionally, each ClassInfo
object stores references to
the superclass's ClassInfo
and the ClassInfo
of all direct subclasses, making it easy to
examine and traverse the class hierarchy. collect_class_info
also stores the set of all virtual
method tables in the global vtables
set.
data_flow: This internal module contains data flow operations used by the rest of ida_kernelcache.
kernel:
This module provides the base
and prelink_info
global variables. base
is the base address of
the kernel image (the start of the kernel's Mach-O header). prelink_info
is the parsed
__PRELINK_INFO
dictionary.
kplist:
This module provides the kplist_parse
function to parse kernel-style plists.
metaclass:
This module provides the function initialize_metaclass_symbols
which adds a symbol for each
known OSMetaClass instance.
offset:
This module provides the functions initialize_data_offsets
and initialize_offset_symbols
. The
former scans through the segments looking for pointers which can be converted into offsets. The
latter symbolicates offsets in the __got
section of each kext if the target of the offset has a
symbol.
segment:
This module provides the function initialize_segments
to rename IDA's segments to be more useful.
By default, IDA seems to create the segment names by combining a guess of the bundle identifier
with the Mach-O section describing the region. initialize_segments
extracts the true bundle
identifier from the __PRELINK_INFO
dictionary and renames each segment to include the bundle
identifier, Mach-O segment, and Mach-O section. This makes it possible, for example, to distinguish
between __TEXT.__const
and __DATA_CONST.__const
. This module also provides the function
kernelcache_kext
(re-exported at the top level) to determine the kext containing the specified
address (only on the old iOS 11 split-kext kernelcache format).
stub:
Many kexts in the kernelcache contain stub functions in a __stubs
section that jump to functions
in the kernel proper. Unfortunately, these stubs provide a barrier for propagating cross references
and type information. This module doesn't solve these problems, but it does make looking at stubs a
bit easier by automatically renaming stub functions so that the target function name is visible.
Stubs and their targets are forcibly converted into functions in IDA, which helps make the
functions in IDA line up with the functions in the original source code.
tagged_pointers: The new iOS 12 merged kernelcache format has the upper 2 bytes of each pointer tagged with an offset in order to chain the pointers together in a list. This module contains functions for processing and restoring those tagged pointers.
vtable:
This module provides many useful functions for working with virtual method tables, including
vtable_length
, convert_vtable_to_offsets
, vtable_overrides
, initialize_vtable_symbols
, and
initialize_vtable_method_symbols
. vtable_length
checks whether the specified address could be a
vtable and returns the vtable length. The generator vtable_overrides
enumerates the virtual
methods in a class which override virtual methods used by the superclass. The function
initialize_vtable_symbols
adds a symbol for the start of each identified vtable.
initialize_vtable_method_symbols
iterates through the overridden methods in each vtable and
propagates symbols from the superclass to the subclass. This is possible because most of the base
classes in IOKit are defined in XNU with relatively complete symbol information. Each method
override in the vtable of a subclass must conform to the same interface as the method in the
superclass, which means we can generate a symbol for the override by substituting the subclass's
name for the superclass's name in the virtual method symbol in the superclass. For example, if we
have no name for the virtual method at index 7 in the AppleKeyStore
class, but we know that the
virtual method at index 7 in its superclass IOService
is called
__ZNK9IOService12getMetaClassEv
, then we can infer that index 7 should be called
__ZNK13AppleKeyStore12getMetaClassEv
in the subclass. This technique can be used to symbolicate
most virtual methods in most classes.
The ida_kernelcache_reload.py
script is identical to ida_kernelcache.py
, except it forces the
ida_kernelcache
module and all submodules to be reloaded. It is mostly useful for development.
The scripts
directory contains scripts that use ida_kernelcache to perform some sort of analysis.
These scripts are too specific to be part of the main ida_kernelcache module, but they are useful
when reverse engineering the kernelcache. They include:
find_virtual_method_overrides.py: A script to find descendants of a class that override a virtual method containing the specified string. Matching overrides are printed to the console.
populate_struct.py: Populate fields for a C++ class or C struct by performing data flow analysis starting at the current address.
process_external_methods.py:
Process an IOExternalMethod
or IOExternalMethodDispatch
array into a standard form for use by
fuzzing tools.
If you are using the Hex-Rays decompiler, one of the more interesting features of ida_kernelcache is the automatic C++ class reconstruction, which will use the OSMetaClass information and data flow analysis to create IDA structs to represent the classes found in the kernelcache. These representations can dramatically improve the readability of the pseudocode representation. To learn more, see the post Reconstructing C++ classes in the iOS kernelcache using IDA Pro.
With iOS 12, Apple introduced a new kernelcache format on some devices. Among the changes, this new
kernelcache's kernel pointers are tagged to link them in a list, presumably to allow iBoot to slide
the kernel without the _PrelinkLinkKASLROffsets
data in the prelink dictionary. Trying to analyze
a stock kernelcache using this format in IDA is difficult due to the missing cross-references. See
the article Analyzing the iOS 12 kernelcache's tagged pointers for details.
If you just want to untag the pointers in the kernelcache without performing any additional
processing, run kc.tagged_pointers.untag_pointers()
.
Some of this functionality likely applies more broadly than just to Apple kernelcaches (for
example, vtable analysis and symbol propagation, or most of the functions in ida_utilities.py
).
Nonetheless, I've limited the import scope to just the ida_kernelcache
module because I have not
tested any of this on other types of binaries.
ida_kernelcache is released under the MIT license.
Much of the functionality in ida_kernelcache is borrowed from memctl, which is also released under the MIT license. Other sources are noted in the comments in the corresponding files.
Brandon Azad