mandiant / capa

The FLARE team's open-source tool to identify capabilities in executable files.
https://mandiant.github.io/capa/
Apache License 2.0

Crash when analyzing large file with binary ninja backend #2249

Open xusheng6 opened 1 month ago

xusheng6 commented 1 month ago

Stack trace:

 Traceback (most recent call last):
  File "/home/[REDACTED]/.local/bin/capa", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/[REDACTED]/App/capa/capa/main.py", line 860, in main
    capabilities, counts = find_capabilities(rules, extractor, disable_progress=args.quiet)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/[REDACTED]/App/capa/capa/capabilities/common.py", line 75, in find_capabilities
    return find_static_capabilities(ruleset, extractor, disable_progress=disable_progress, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/[REDACTED]/App/capa/capa/capabilities/static.py", line 183, in find_static_capabilities
    function_matches, bb_matches, insn_matches, feature_count = find_code_capabilities(
                                                                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/[REDACTED]/App/capa/capa/capabilities/static.py", line 128, in find_code_capabilities
    for feature, va in itertools.chain(extractor.extract_function_features(fh), extractor.extract_global_features()):
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/[REDACTED]/App/capa/capa/features/extractors/binja/extractor.py", line 52, in extract_function_features
    yield from capa.features.extractors.binja.function.extract_features(fh)
  File "/home/[REDACTED]/App/capa/capa/features/extractors/binja/function.py", line 100, in extract_features
    for feature, addr in func_handler(fh):
                         ^^^^^^^^^^^^^^^^
  File "/home/[REDACTED]/App/capa/capa/features/extractors/binja/function.py", line 27, in extract_function_calls_to
    llil = caller.llil
           ^^^^^^^^^^^
  File "/home/[REDACTED]/App/BinaryNinja/binaryninja/python/binaryninja/binaryview.py", line 125, in llil
    return self.function.get_low_level_il_at(self.address, self.arch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/[REDACTED]/App/BinaryNinja/binaryninja/python/binaryninja/function.py", line 1726, in get_low_level_il_at
    llil = self.llil
           ^^^^^^^^^
  File "/home/[REDACTED]/App/BinaryNinja/binaryninja/python/binaryninja/function.py", line 946, in llil
    raise ILException(f"Low level IL was not loaded for {self!r}")
binaryninja.exceptions.ILException: Low level IL was not loaded for <func: x86_64@0x23e750>

This happens because, when analyzing large files, Binary Ninja does not always generate the IL for every function. The code should account for this and access a function's IL only when it is available. There should also be an option to force Binary Ninja to generate the IL for all functions, at the cost of longer analysis time and higher RAM usage.
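A minimal sketch of the first suggestion (access the IL only if it is available), not the actual fix. It assumes that reading `caller.llil` raises `ILException` when the IL was not generated, as the trace above shows. So that it runs without Binary Ninja installed, `ILException` here is a local stand-in for `binaryninja.exceptions.ILException`, and `get_llil_or_none`, `iter_callers_with_llil`, and `FakeCaller` are hypothetical names that do not exist in capa or Binary Ninja.

```python
from typing import Any, Iterable, Iterator, Optional, Tuple


class ILException(Exception):
    """Stand-in for binaryninja.exceptions.ILException (assumption: same role)."""


def get_llil_or_none(caller: Any) -> Optional[Any]:
    """Return caller.llil, or None if the IL was not generated for it."""
    try:
        return caller.llil
    except ILException:
        return None


def iter_callers_with_llil(callers: Iterable[Any]) -> Iterator[Tuple[Any, Any]]:
    """Yield (caller, llil) pairs, skipping callers whose IL is unavailable."""
    for caller in callers:
        llil = get_llil_or_none(caller)
        if llil is None:
            # Skip instead of crashing; features from this caller are lost.
            continue
        yield caller, llil


class FakeCaller:
    """Hypothetical test double that mimics a caller reference with an llil property."""

    def __init__(self, name: str, has_il: bool) -> None:
        self.name = name
        self._has_il = has_il

    @property
    def llil(self) -> str:
        if not self._has_il:
            raise ILException(f"Low level IL was not loaded for {self.name}")
        return f"llil:{self.name}"
```

Skipping a caller trades completeness for robustness: any feature that depends on that caller's IL is silently missed, which is why an option to force IL generation would still be useful.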

xusheng6 commented 1 month ago

A fix for this is coming soon.