Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

IL not available for mach-o binaries #5765

Closed rand-tech closed 1 month ago

rand-tech commented 1 month ago

Version and Platform (required):

Bug Description:

IL is not loaded (always).

Steps To Reproduce:

class ...:
    def __init__(self, binary_view: "BinaryView"):
        self.bv = binary_view
        # ...
    def wow(self, call: ReferenceSource):
        if not call.function.llil_if_available:
            logger.debug("@%#x LLIL not available", call.address)
            return []
        return str(call.function.llil) # example
        # ...

    def analyze(self, ...):
        # ....
        funcs: List["Function"] = self.get_funcs()
        for func in funcs:
            for ref in func.caller_sites:
                res = self.wow(ref)
                print(res)

Every ref results in "LLIL not available".

Expected Behavior: LLIL is available in most cases.

Binary:

The binary is a Mach-O universal binary. It does not contain objc info.

Additional Information:

This code worked completely fine before the update.

xusheng6 commented 1 month ago

Thanks for the bug report. Is it possible to provide us with a binary to make it easier to reproduce?

rand-tech commented 1 month ago

I don't want to share the raw contents here, so how about this?

  1. I encrypt the binary into a zip
  2. I share a link which will expire after 1 dl
  3. You confirm the hash
  4. I share the password once you have downloaded it.

That way, I don't need to share the binary, and only you can download the file.

If you have a better way to share files privately, please let me know.

xusheng6 commented 1 month ago

I think you can join our public Slack and share the file with me in private. You do not need to upload it here. Our public Slack is at https://slack.binary.ninja/. Alternatively, you can email it to us at binaryninja@vector35.com

negasora commented 1 month ago

How are you loading the binary_view that gets passed to that class?

rand-tech commented 1 month ago

with binaryninja.load(args.binary) as bv:
    bv.update_analysis_and_wait()
    analyzer = Analyzer(bv)
    analyzer.analyze()

rand-tech commented 1 month ago

FYI: I shared the files with xusheng6 on slack 20 minutes ago.

xusheng6 commented 1 month ago

V35 folks should search for Fallen-Spider-Rod3-Fresh-Various to find the relevant files

zznop commented 1 month ago

For larger binaries, Binja doesn't run full analysis on every function during load. When you're using the UI, you'll notice that analysis for certain functions is triggered as you navigate to different locations in the binary. If you open the binary that you shared with us (don't navigate anywhere) and run bv.get_function_at(0x2b42c).llil_if_available, you'll likely see it return None. If you run bv.get_function_at(0x2b42c).llil, it should return the LLIL. The latter triggers BN analysis of that function.

>>> print(bv.get_function_at(0x2b42c).llil_if_available)
None
>>> print(bv.get_function_at(0x2b42c).llil)
<LowLevelILFunction: x86_64@0x2b42c>

I recommend trying something like this:

try:
    call.function.llil
except ILException:
    logger.debug("@%#x failed to get LLIL", call.address)
    return []

It's interesting that you didn't run into this until 4.1. Perhaps you were seeing it previously, but just not with nearly as many functions?
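
As a minimal sketch, the pattern above could be wrapped in a helper. get_llil is a hypothetical name, not part of the Binary Ninja API, and the fallback ILException class is an assumption added only so the snippet also runs where Binary Ninja is not installed. It compares llil_if_available against None explicitly instead of relying on truthiness.

```python
try:
    from binaryninja.exceptions import ILException
except ImportError:
    # Assumption: stand-in so this sketch runs without Binary Ninja installed.
    class ILException(Exception):
        pass


def get_llil(func):
    """Return the LLIL for func, or None if Binary Ninja cannot produce it."""
    il = func.llil_if_available
    if il is not None:
        # IL was already generated; no extra analysis is needed.
        return il
    try:
        # Accessing .llil requests analysis of this function on demand.
        return func.llil
    except ILException:
        return None
```

With this, wow() could call get_llil(call.function) and return [] when the result is None.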

rand-tech commented 1 month ago

Thanks. Is there a way I can force analyze it?

rand-tech commented 1 month ago

btw, my current BN implementation only works with one of the two files I provided (IDA works fine with both files). Specifically, BN fails on x86_64, but works with aarch64.

zznop commented 1 month ago

I suppose if you wanted to force analysis of all functions you could do something like below (albeit it's kind of hacky). Depending on the size of the binary it could take a while and use a lot of memory. This is why it's recommended to query IL as needed.

for func in bv.functions:
    try:
        func.llil
    except ILException:
        log_warn(f"Couldn't get LLIL for {hex(func.start)}")
        pass
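
A variant of the loop above, as a rough sketch, that also records which functions failed so they can be inspected afterwards. analyze_all is a hypothetical helper name, and the fallback ILException class is only there so the snippet runs without Binary Ninja installed.

```python
try:
    from binaryninja.exceptions import ILException
except ImportError:
    # Assumption: stand-in so this sketch runs without Binary Ninja installed.
    class ILException(Exception):
        pass


def analyze_all(functions):
    """Request LLIL for every function; return start addresses that failed."""
    failed = []
    for func in functions:
        try:
            func.llil  # property access triggers on-demand analysis
        except ILException:
            failed.append(func.start)
    return failed
```

For example, failed = analyze_all(bv.functions) would give the start addresses that still raise ILException.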

rand-tech commented 1 month ago

I still get exceptions with the following code:

try:
    call.function.hlil
except ILException:
    logger.debug("Couldn't get LLIL for %#x", call.function.start)
    return []

❯ cat bn/parser_share.py.log|grep "Couldn't get LLIL for "|sort|uniq -c
 112 DEBUG:__main__:Couldn't get LLIL for 0x10000d...

I tried fixing it by increasing the maximum function size, but I still get the error. (For comparison, the default value is 65536.)

s = Settings()
s.set_integer("analysis.limits.maxFunctionSize", 600500000)

river-li commented 1 month ago

The first way @zznop mentioned should work.

Just replace your function from

    def wow(self, call: ReferenceSource):
        if not call.function.llil_if_available:
            logger.debug("@%#x LLIL not available", call.address)
            return []
        return str(call.function.llil) # example

to

    def wow(self, call: ReferenceSource):
        try:
            llil = call.function.llil
            return str(llil) # example
        except ILException:
            logger.debug("@%#x LLIL not available", call.address)
            return []

If you want to force analysis of a skipped function, you can set analysis_skipped to False:

if function.analysis_skipped:
    function.analysis_skipped = False

zznop commented 1 month ago

Couldn't get LLIL for 0x10000d...

0x10000d isn't in range of any of the segments for either of the two binaries you shared with us. Are you running your code against another binary?

rand-tech commented 1 month ago

Full address is 0x10000deb8. I redacted the last digits. Sorry for the confusion.

zznop commented 1 month ago

Full address is 0x10000deb8. I redacted the last digits. Sorry for the confusion.

Binja bails during analysis of the function because the time to analyze that function exceeds the default threshold of 20 seconds. If you increase the "Max Function Analysis Time" setting to something higher (like 60000 milliseconds) then Binja analyzes it fine. You can also force analysis by setting func.analysis_skipped to False.

To increase the timeout threshold for function analysis:

bv = BinaryView.load(filepath, options={"analysis.limits.maxFunctionAnalysisTime": 60000})

To force analysis of a skipped function:

func = bv.get_function_at(0x10000deb8)
if func.analysis_skipped:
    log_debug(f"Forcing analysis of skipped function at {hex(func.start)}")
    func.analysis_skipped = False

To summarize, it is expected that ILException will be raised when you query IL for certain functions. When this occurs, you can still force analysis using the method described above (in most cases).
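
Putting the pieces together, here is a hedged sketch of a lookup that retries once after clearing the skip flag. llil_with_retry is a hypothetical helper; calling bv.update_analysis_and_wait() after setting analysis_skipped is an assumption about what is needed for the override to take effect, and the fallback ILException class only exists so the snippet runs without Binary Ninja installed.

```python
try:
    from binaryninja.exceptions import ILException
except ImportError:
    # Assumption: stand-in so this sketch runs without Binary Ninja installed.
    class ILException(Exception):
        pass


def llil_with_retry(bv, func):
    """Return LLIL for func, forcing analysis once if it was skipped."""
    try:
        return func.llil
    except ILException:
        if not func.analysis_skipped:
            # Failure was not caused by skipped analysis; nothing to retry.
            return None
    # Override the skip decision, then let analysis run again (assumption).
    func.analysis_skipped = False
    bv.update_analysis_and_wait()
    try:
        return func.llil
    except ILException:
        return None
```

Functions that hit the analysis time limit may take a long time to come back from the retry, so this is probably best applied only to the functions you actually need.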

xusheng6 commented 1 month ago

According to the discussion, things are working as expected.