seclab-ucr / SyzGen_setup

MIT License

angr in SyzGen cannot trigger `mem_read` breakpoints in intel MacOS 11.5.2 #7

Closed Kiprey closed 2 years ago

Kiprey commented 2 years ago

The virtual machine OS that SyzGen runs on in the paper is macOS 10.15, and I have successfully reproduced that setup.

For some reason I have to migrate SyzGen from MacOS 10.15 to MacOS 11.5.2.

During SyzGen's type inference phase, a breakpoint is added on the mem_read event in InferenceExecutor.pre_execute:

# https://github.com/seclab-ucr/SyzGen_setup/blob/release/SyzGen/syzgen/analysis/infer.py#L630
state.inspect.b('mem_read', when=angr.BP_BEFORE, action=onSymbolicRead)

This logic is correct. If there is a memory read action, angr can call the onSymbolicRead function to record some information.

However, although SyzGen adds a breakpoint on the mem_read event, in the macOS 11.5.2 VM angr never triggers this breakpoint, even when multiple memory reads occur. As a result, all interface information inferred by SyzGen is wrong.

Interestingly, this also happens on macOS 10.15 when SyzGen infers the interfaces of some drivers, although most of them work fine.

Is there any useful information that can help me to solve this problem?

Thanks!

CvvT commented 2 years ago

Hi @Kiprey,

Thanks for your interest! I'm not sure what you meant by "angr cannot trigger this breakpoint". Is that an issue caused by angr? It is possible that symbolic execution never reaches any memory read and thus the callback is never triggered. You could print out the trace (i.e., addresses of each executed basic block) to confirm it.

Weiteng
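The suggestion above (print the address of each executed basic block) can be sketched as a small callback. This is a minimal sketch, not SyzGen code: `make_block_tracer` and `format_trace` are hypothetical helper names, and the stub state only stands in for an angr state, assuming that `state.addr` holds the block address when the callback fires. With angr, the callback would presumably be registered on the `irsb` inspection event in the same way SyzGen registers `onSymbolicRead` on `mem_read`.

```python
from types import SimpleNamespace


def make_block_tracer(trace):
    """Return a callback that appends the current block address to `trace`."""
    def on_block(state):
        # Assumption: state.addr is the address of the block about to run.
        trace.append(state.addr)
    return on_block


def format_trace(trace):
    """Render the recorded addresses as hex strings for printing."""
    return [hex(addr) for addr in trace]


# Hypothetical registration with angr (not executed here):
#   state.inspect.b('irsb', when=angr.BP_BEFORE, action=make_block_tracer(trace))

# Stand-in demonstration with stub states and made-up addresses.
demo_trace = []
tracer = make_block_tracer(demo_trace)
for addr in (0x1000, 0x1010, 0x1040):
    tracer(SimpleNamespace(addr=addr))
print(format_trace(demo_trace))
```

If the trace shows blocks containing memory reads but the read callback never logs anything, the reads are being reached and the problem lies in the callback path rather than in the exploration.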

Kiprey commented 2 years ago

Hello, thank you for your kind reply!

By "angr cannot trigger this breakpoint" I mean the following, taking IONetworkUserClient as an example:

  1. On macOS 10.15, when SyzGen infers IONetworkUserClient, every memory read in IONetworkUserClient::externalMethod triggers the onSymbolicRead function. That works as expected.
  2. However, on macOS 11.5.2, when SyzGen infers the same UserClient, none of the memory reads in IONetworkUserClient::externalMethod trigger the onSymbolicRead function. This really confuses me.

The function is very similar in both macOS versions, and both versions contain memory access instructions.

I'm not sure whether this issue is caused by angr, but I haven't seen reports elsewhere of mem_read breakpoints failing. Maybe this strange problem is caused by angr executing kernel-level code?

Thanks!

CvvT commented 2 years ago

It's hard for me to tell what could go wrong based on your limited information. I would suggest you check the executed blocks to ensure symbolic execution indeed reached any memory read. For angr, running kernel code is no different from userspace programs except for some privileged instructions that may not be handled properly.

Kiprey commented 2 years ago

I agree with you, but I have checked this many times. To better illustrate the problem I ran into, I reproduced it and prepared two screenshots.

The first was taken on macOS 10.15.4:

[Screenshot: macOS 10.15.4]

The left part is the output of SyzGen during type inference, and the right part is the IDA view of IONetworkUserClient::externalMethod. On the left, we can see that angr has captured mem_read and called the onSymbolicRead function, hence the "on Memory Read" message in the output. That works as expected.

However, look at the screenshot taken on macOS 11.5.2:

[Screenshot: macOS 11.5.2]

The output on the left contains nothing related to "on Memory Read", so angr never calls the onSymbolicRead function during type inference.

As you can see, the assembly code in both screenshots is almost identical.

I think these two screenshots make the whole issue clear. :)

This problem is really difficult to solve, so I would like to hear your advice and I will try it again.

Thanks for your help!

CvvT commented 2 years ago

I see. The issue is actually caused by the following line:

if state.addr > 0xffffff8000000000:
    return

For macOS 10.15.4, all driver code is loaded below 0xffffff8000000000, and I used this simple strategy to distinguish driver code from the core kernel. You have to come up with a new way to do that or just remove the check.

-Weiteng
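To make the effect of that check concrete, here is a minimal sketch that mirrors the comparison in isolation. The function name `is_skipped` and both example addresses are hypothetical and chosen only to fall on either side of the threshold; they are not taken from a real system.

```python
# Threshold copied from the check quoted above.
KERNEL_THRESHOLD = 0xffffff8000000000


def is_skipped(addr, threshold=KERNEL_THRESHOLD):
    """Mirror the guard: True means the callback returns before recording."""
    return addr > threshold


# Illustrative addresses only (assumptions, not real load addresses).
driver_addr_below = 0xffffff7f80a01234  # driver code loaded below the threshold
driver_addr_above = 0xffffff8001a01234  # driver code loaded above the threshold

print(is_skipped(driver_addr_below))
print(is_skipped(driver_addr_above))
```

When driver code is loaded above the threshold, every read is treated as core-kernel code and dropped silently, which matches the missing "on Memory Read" output described earlier in the thread.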

Kiprey commented 2 years ago

Thanks, it really works!

The address 0xffffff8000000000 is hardcoded in the code. How can I obtain this boundary address on other versions of macOS? I want to replace it globally to prevent strange errors.

I think we could get the addresses of all the drivers by using the lldb showallkmods command. However, I still cannot get the address that separates the drivers from the kernel in a KASLR environment.

The problem is almost solved, thanks for your kind help!

CvvT commented 2 years ago

If you can get the address ranges of all modules including the core kernel, you can just check if the target address is within the modules we want to analyze.
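A range-based check along those lines could look like the sketch below. It is a minimal illustration under stated assumptions, not SyzGen's implementation: `ModuleMap` and `should_analyze` are hypothetical names, the module list is made-up data, and the ranges are assumed to be non-overlapping runtime addresses (that is, already including any KASLR slide) collected from the debugger.

```python
from bisect import bisect_right


class ModuleMap:
    """Map addresses to module names, assuming non-overlapping ranges."""

    def __init__(self, modules):
        # modules: iterable of (name, start_address, size_in_bytes)
        self._ranges = sorted(
            (start, start + size, name) for name, start, size in modules
        )
        self._starts = [r[0] for r in self._ranges]

    def lookup(self, addr):
        """Return the name of the module containing addr, or None."""
        i = bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, end, name = self._ranges[i]
            if start <= addr < end:
                return name
        return None


def should_analyze(addr, module_map, targets):
    """True if addr falls inside one of the modules we want to analyze."""
    return module_map.lookup(addr) in targets


# Made-up layout for illustration; real values would come from the debugger.
module_map = ModuleMap([
    ("kernel", 0xffffff8000200000, 0x1000000),
    ("com.example.driver", 0xffffff8002000000, 0x30000),
])
targets = {"com.example.driver"}

print(should_analyze(0xffffff8002000100, module_map, targets))
print(should_analyze(0xffffff8000300000, module_map, targets))
```

The callback would then return early when `should_analyze` is false for the current address, instead of comparing against a fixed constant. Because the lookup uses the actual ranges, it does not depend on drivers being loaded below or above the core kernel.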

Kiprey commented 2 years ago

OK. This issue has been resolved. Thanks for your kind help!