angr / cle

CLE Loads Everything (at least, many binary formats!)
BSD 2-Clause "Simplified" License

CLE fails to load Android apps if classes referenced in the manifest are not part of the classes.dex #344

Open Alexeyan opened 1 year ago

Alexeyan commented 1 year ago

Hi team.

I'm trying to load an Android app in angr and get the following error (CLE and angr version 9.2.18, the most recent on pip):

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/[FOOBAR]/angr_solver.py:59
     57 logging.getLogger("angr.factory").setLevel("DEBUG")
     58 logging.getLogger("angr.manager").setLevel("DEBUG")
---> 59 main()

File ~/[FOOBAR]/angr_solver.py:44, in main()
     41 print(f"Using Sample:\t\t{sample_path}")
     42 print(f"Using Entrypoint:\t{entry_point}")
---> 44 project = angr.Project(sample_path, main_opts=loading_opts, auto_load_libs=True)
     46 # project.hook(SootMethodDescriptor(class_name="java.lang.String", name="valueOf", params=('int',)).address(), Dummy_String_valueOf())
     48 entry = project.factory.entry_state()

File ~/.local/lib/python3.10/site-packages/angr/project.py:138, in Project.__init__(self, thing, default_analysis_mode, ignore_functions, use_sim_procedures, exclude_sim_procedures_func, exclude_sim_procedures_list, arch, simos, engine, load_options, translation_cache, support_selfmodifying_code, store_function, load_function, analyses_preset, concrete_target, eager_ifunc_resolution, **kwargs)
    136     l.info("Loading binary %s", thing)
    137     self.filename = str(thing)
--> 138     self.loader = cle.Loader(self.filename, concrete_target=concrete_target, **load_options)
    140 # Step 2: determine its CPU architecture, ideally falling back to CLE's guess
    141 if isinstance(arch, str):

File ~/.local/lib/python3.10/site-packages/cle/loader.py:133, in Loader.__init__(self, main_binary, auto_load_libs, concrete_target, force_load_libs, skip_libs, main_opts, lib_opts, ld_path, use_system_libs, ignore_import_version_numbers, case_insensitive, rebase_granularity, except_missing_libs, aslr, perform_relocations, load_debug_info, page_size, preload_libs, arch)
    131     self._main_opts.update({'arch': arch})
    132 self.preload_libs = []
--> 133 self.initial_load_objects = self._internal_load(main_binary, *preload_libs, *force_load_libs, preloading=(main_binary, *preload_libs))
    135 # cache
    136 self._last_object = None

File ~/.local/lib/python3.10/site-packages/cle/loader.py:689, in Loader._internal_load(self, preloading, *args)
    687     l.info("Skipping load request %s - already loaded", main_spec)
    688     continue
--> 689 obj = self._load_object_isolated(main_spec)
    690 objects.append(obj)
    691 objects.extend(obj.child_objects)

File ~/.local/lib/python3.10/site-packages/cle/loader.py:871, in Loader._load_object_isolated(self, spec)
    868 # STEP 4: LOAD!
    869 l.debug("... loading with %s", backend_cls)
--> 871 result = backend_cls(binary, binary_stream, is_main_bin=self.main_object is None, loader=self, **options)
    872 result.close()
    873 return result

File ~/.local/lib/python3.10/site-packages/cle/backends/java/apk.py:96, in Apk.__init__(self, apk_path, binary_stream, entry_point, entry_point_params, android_sdk, supported_jni_archs, jni_libs, jni_libs_ld_path, **options)
     94     self.components = {'activity': [], 'service': [], 'receiver': [], 'provider': []}
     95     self.callbacks = {'activity': [], 'service': [], 'receiver': [], 'provider': []}
---> 96     self._set_lifecycle(apk_parser)
     97 else:
     98     self.components = None

File ~/.local/lib/python3.10/site-packages/cle/backends/java/apk.py:116, in Apk._set_lifecycle(self, apk_parser)
    114 for key, getter in component_getter.items():
    115     class_names = getter()
--> 116     self.components[key], self.callbacks[key] = self._extract_lifecycle(class_names, key)

File ~/.local/lib/python3.10/site-packages/cle/backends/java/apk.py:133, in Apk._extract_lifecycle(self, cls_name, component_kind)
    130 callbacks = []
    132 for cls in cls_name:
--> 133     components.append(self.classes[cls])
    134     callbacks.extend(self.get_callbacks(cls, callback[component_kind]))
    136 return components, callbacks

KeyError: 'com.foo.bar.SomeActivity'

I think I know where the issue comes from: the class com.foo.bar.SomeActivity is referenced in the APK's manifest and is therefore returned by pyaxmlparser at https://github.com/angr/cle/blob/e2776162d623441f9d1e3070bb344cb148b0e32b/cle/backends/java/apk.py#L109, but it is not part of the app's classes.dex and is therefore not found in the list of classes at https://github.com/angr/cle/blob/e2776162d623441f9d1e3070bb344cb148b0e32b/cle/backends/java/apk.py#L133.

The missing class is part of a second-stage payload that gets decrypted at runtime. Since the app runs fine on devices and emulators, the Dalvik VM seems to support this, meaning it's not a malformed APK. I'm happy to assist in debugging and testing fixes, but I'm unable to share the APK. I'm also happy to write a PR, but I'd like an opinion on how this should be fixed first.
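
One possible direction, sketched here as a standalone function rather than as a patch to CLE: make the lookup from the traceback (`self.classes[cls]` in `Apk._extract_lifecycle`) tolerate manifest entries that have no matching class, and report them instead of raising `KeyError`. The function name, its parameters, and the extra `missing` return value are assumptions made for illustration; only the loop structure follows the traceback above.

```python
import logging

log = logging.getLogger(__name__)


def extract_lifecycle(classes, class_names, callback_names, get_callbacks):
    """Tolerant variant of the loop shown in the traceback (hypothetical helper).

    classes        -- mapping of class name -> loaded class object
    class_names    -- component class names declared in the manifest
    callback_names -- lifecycle callback names for this component kind
    get_callbacks  -- callable(class_name, callback_names) -> list of callbacks
    """
    components, callbacks, missing = [], [], []
    for name in class_names:
        cls = classes.get(name)
        if cls is None:
            # Declared in the manifest but absent from the loaded classes,
            # e.g. because it is unpacked or decrypted at runtime.
            log.warning("Manifest component %s not found in loaded classes", name)
            missing.append(name)
            continue
        components.append(cls)
        callbacks.extend(get_callbacks(name, callback_names))
    return components, callbacks, missing
```

Returning the unresolved names lets the caller decide whether to ignore them, warn, or try again later, instead of aborting the whole load.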

rhelmot commented 1 year ago

So I don't know anything about Android, and the people who do know about Android seem to have no desire to work on this.

That said! My understanding of the situation is that the metadata refers to an entity which is not populated until some code gets executed? If that's the case, we should probably re-architect the APK loader so that it does not try to dereference those references during the main loading sequence, and then provide methods to do the dereferencing that Project can call after it has done the necessary emulation.
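
A minimal sketch of that split, assuming nothing about CLE's real interfaces: load time only records the names declared in the manifest, and a separate method resolves them against whatever class table exists later. `DeferredComponents` and `resolve` are hypothetical names, and plain dicts stand in for the loader's class table.

```python
class DeferredComponents:
    """Hypothetical holder for manifest components that are resolved on demand."""

    def __init__(self, declared):
        # declared: component kind -> class names taken from the manifest.
        # Nothing is looked up here, so loading cannot fail on a missing class.
        self.declared = {kind: list(names) for kind, names in declared.items()}
        self.resolved = {kind: [] for kind in self.declared}

    def resolve(self, classes):
        """Resolve declared names against `classes`; return those still missing."""
        unresolved = {}
        for kind, names in self.declared.items():
            self.resolved[kind] = [classes[n] for n in names if n in classes]
            unresolved[kind] = [n for n in names if n not in classes]
        return unresolved


# Usage: the first call sees only classes.dex; the second runs after the
# (assumed) emulation step has made the second-stage class available.
components = DeferredComponents({"activity": ["com.foo.bar.SomeActivity"]})
still_missing = components.resolve({})
later_missing = components.resolve({"com.foo.bar.SomeActivity": "SomeActivityClass"})
```

Because `resolve` can be called more than once, a caller could invoke it again each time newly unpacked code extends the class table.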