Analysis time of a function is only added up and never reset if the analysis update is not user update

xusheng6 commented 1 day ago

While working on https://github.com/mandiant/capa/issues/2406#issuecomment-2490171179, I noticed that the MLIL of function 0x467464 can normally be retrieved after a few seconds. But when binja was used in capa, it cannot retrieve the IL, thus leading to a crash.

Normal run:

>>> from binaryninja import *
>>> bv = binaryninja.load('/Users/xusheng/Downloads/112f9f0e8d349858a80dd8c14190e620.exe_.')
>>> func = bv.get_function_at(0x467464)
>>> func
<func: x86@0x467464>
>>> func.mlil
<MediumLevelILFunction: x86@0x467464>

It does take func.mlil a few seconds to run, but everything works as expected. We need to see why it is not working in the case of capa

xusheng6 commented 1 day ago

112f9f0e8d349858a80dd8c14190e620.exe_.zip

xusheng6 commented 1 day ago

Also see 0x8091b80 in b5f0524e69b3a3cf636c7ac366ca57bf5e3a8fdc8a9f01caf196c611a7918a87.elf_.zip

xusheng6 commented 1 day ago

capa stack trace:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/xusheng/capa/capa/main.py", line 1103, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/xusheng/capa/capa/main.py", line 994, in main
    capabilities, counts = find_capabilities(rules, extractor, disable_progress=args.quiet)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xusheng/capa/capa/capabilities/common.py", line 75, in find_capabilities
    return find_static_capabilities(ruleset, extractor, disable_progress=disable_progress, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xusheng/capa/capa/capabilities/static.py", line 168, in find_static_capabilities
    function_matches, bb_matches, insn_matches, feature_count = find_code_capabilities(ruleset, extractor, f)
                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xusheng/capa/capa/capabilities/static.py", line 114, in find_code_capabilities
    for bb in extractor.get_basic_blocks(fh):
  File "/Users/xusheng/capa/capa/features/extractors/binja/extractor.py", line 58, in get_basic_blocks
    for mlil_bb in f.mlil.basic_blocks:
                   ^^^^^^
  File "/Applications/Binary Ninja.app/Contents/Resources/python/binaryninja/function.py", line 1039, in mlil
    raise ILException(f"Medium level IL was not loaded for {self!r}")
binaryninja.exceptions.ILException: Medium level IL was not loaded for <func: x86@0x8082d40>

And the Python script that works:

from binaryninja import *
bv = binaryninja.load('/Users/xusheng/Downloads/b5f0524e69b3a3cf636c7ac366ca57bf5e3a8fdc8a9f01caf196c611a7918a87.elf_.bndb')
func = bv.get_function_at(0x8082d40)
print(func.mlil)

xusheng6 commented 1 day ago

I checked the analysis skip reason for the function is ExceedFunctionAnalysisTimeSkipReason

But it still does NOT look right, because the analysis of that function takes like 2 seconds, there is no way the 20 seconds default time can be exceeded.

Update: I printed the analysis time and it is indeed questionable:

analysis skipped: 3, analysis info: {'Total': 0.002902333}
analysis skipped: 3, analysis info: {'Total': 0.0081085}

The analysis time of the two skipped functions are all very short

For comparison, I wrote a small script that iterates over all functions and retrieve the MLIL of each function. It took roughly 0.7 seconds to generate the MLIL for function 0x8082d40

xusheng6 commented 1 day ago

I made some interesting discovery in https://github.com/mandiant/capa/issues/2402#issuecomment-2491164752. However, that still does not explain why the analysis could time-out on a not-so-challenging function at all

xusheng6 commented 13 hours ago

OK so I finally figured out what is happening here. So we are only resetting the function analysis time if the analysis is a user update. If it is always auto-update, then we never reset the timer and the time just gets added up, eventually exceeding the 20 seconds threshold. Here is a brief explanation of what happens:

The script retrieves the IL of function 0x8082d40
It took 0.7 seconds to update the analysis and generate the IL
The IL of function 0x8082d40 is cached
The script requests the IL of some other 60 functions, exceeding the number of analysis.limits.cacheSize
The cached IL of function 0x8082d40 is discarded
The script retrieves the IL of function 0x8082d40 again
It took another 0.7 seconds to update the analysis and generate the IL. The two times are added up
Repeat the process many times until the total analysis time of 0x8082d40 is larger than 20 seconds
The analysis time of 0x8082d40 exceeds analysis.limits.maxFunctionAnalysisTime
The analysis bails out with a reason ExceedFunctionAnalysisTimeSkipReason

I understand we are probably only resetting the total analysis time on user updates for a good reason, and this cached-related repetitive generation of the IL is probably never encountered before

Vector35 / binaryninja-api

Analysis time of a function is only added up and never reset if the analysis update is not user update #6171