OleksiiOleksenko / SpecFuzz

A tool for detecting Spectre vulnerabilities through fuzzing

Error when running SpecFuzz #7

Closed enlighten5 closed 4 years ago

enlighten5 commented 4 years ago

Hi Oleksii,

I got the following error when running SpecFuzz on a binary I built myself: [SF] Error: Signal handler called outside speculation. Do you have any idea what this error means?

Also, I was wondering:

  1. How can I save the generated test cases to disk? I tried the --save_all option with honggfuzz, but it does not work.
  2. If the test cases can be saved, is there a way to find the one that triggers a vulnerability reported in aggregated.json?

Thank you in advance! Zhenxiao

OleksiiOleksenko commented 4 years ago

Hi,

This error means that the program had an exception while executing a normal, non-speculative path. Simply put, it means something went wrong. In most cases, it's a bug in the compiler pass.

I'm going to need more information to help you. What was the program and how did you build it?

> How can I save the generated test cases to disk? I tried the --save_all option with honggfuzz, but it does not work.

They are supposed to be saved by default. Could it be that you're looking in the wrong directory?

> If the test cases can be saved, is there a way to find the one that triggers a vulnerability reported in aggregated.json?

Unfortunately, this feature is not implemented yet. The only way to find the necessary test case is to re-run the program with all the generated test cases until you find the one that triggers the vulnerability.
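The brute-force search described above can be sketched as a small Python script. This is only an illustration under stated assumptions: it assumes the instrumented binary reads the test case from stdin and that the reported address appears somewhere in its output. Neither is confirmed in this thread, and the function name and parameters are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def find_triggering_inputs(command, corpus_dir, needle, timeout=10):
    """Run `command` once per corpus file (fed on stdin) and return the
    files whose combined stdout/stderr contains the string `needle`."""
    hits = []
    for path in sorted(Path(corpus_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            result = subprocess.run(
                command,
                input=path.read_bytes(),
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # skip inputs that hang instead of aborting the search
        if needle.encode() in result.stdout:
            hits.append(path)
    return hits


# Hypothetical usage (binary name, directory, and address are placeholders):
#   find_triggering_inputs(["./fuzz_driver"], "corpus", "0x401a2c")
```

If the driver takes a file path instead of stdin, the same loop works with the path appended to the command rather than passed as input.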

enlighten5 commented 4 years ago

Hi,

I realized that error means something went wrong with my program, and I fixed it.

I compiled a jsmn fuzz driver and ran SpecFuzz with it. I ran it for 1 hour and for 8 hours, but it seems the test cases were not saved to disk. I assume the seeds will be saved in the current directory, right? By the way, the only modification I made was changing the decoding in analyzer.py from the default (utf-8) to ISO-8859-1, because utf-8 raises a Unicode decoding error. So I added this line of code in collect_data(): sys.stdin.reconfigure(encoding='ISO-8859-1').
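To illustrate why that change avoids the error, here is a minimal standalone sketch. It does not use analyzer.py itself; the sample bytes and variable names are made up, and an in-memory stream stands in for sys.stdin. ISO-8859-1 assigns a character to every byte value, so decoding cannot fail, whereas strict UTF-8 rejects invalid byte sequences.

```python
import io

raw = b'{"name": "caf\xe9"}\n'  # a lone 0xE9 byte is not valid UTF-8

# Strict UTF-8 decoding fails on this input.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ISO-8859-1 maps every byte 0x00-0xFF to a code point, so it never fails.
# TextIOWrapper.reconfigure is the same method that sys.stdin.reconfigure calls.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")
stream.reconfigure(encoding="ISO-8859-1")
latin1_line = stream.readline()

# Alternative: keep UTF-8 but substitute U+FFFD for undecodable bytes.
replaced = raw.decode("utf-8", errors="replace")
```

Either approach keeps the reader from crashing on binary fuzzer output. The ISO-8859-1 route preserves every byte, while errors="replace" loses the original value of the bad bytes.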

Then I used the JSON test cases from this test suite: https://github.com/nst/JSONTestSuite.git. I found that 17 addresses were marked as vulnerable in aggregated.json (controlled: true), 6 of them with a fault count above 200. However, among those 17 detected locations, some addresses actually point to a mov instruction. Should those be considered uncontrolled or false positives? I assumed the marked instruction should be a cmp.

Thank you so much for your time! Zhenxiao

OleksiiOleksenko commented 4 years ago

> I assume the seeds will be saved in the current directory, right?

Not necessarily - it depends on your HonggFuzz flags. If you pass it -f corpora_directory, then the test cases will be stored in this directory. See here for more details.

> By the way, the only modification I made was changing the decoding in analyzer.py from the default (utf-8) to ISO-8859-1, because utf-8 raises a Unicode decoding error. So I added this line of code in collect_data(): sys.stdin.reconfigure(encoding='ISO-8859-1').

Could you please open a separate issue for this? It's a bug in SpecFuzz that I need to fix.

> However, among those 17 detected locations, some addresses actually point to a mov instruction. Should those be considered uncontrolled or false positives? I assumed the marked instruction should be a cmp.

On the contrary, all of the reported instructions should in some way access memory. SpecFuzz detects speculative invalid memory accesses, and that's what it reports. More specifically, it reports an invalid memory access and a sequence of mispredicted branches that led to this access.
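As a rough sketch of how such a report could be read programmatically: the excerpt below is hypothetical. The key names (errors, controlled, fault_count, branch_sequences) and the nesting are assumptions based on the fields mentioned in this thread, not the confirmed aggregated.json schema, and the addresses are invented.

```python
import json

# Hypothetical excerpt: key names and nesting are assumptions for
# illustration, not the confirmed aggregated.json layout.
SAMPLE = """
{
  "errors": {
    "0x401a2c": {"controlled": true, "fault_count": 250,
                 "branch_sequences": [["0x4019f0", "0x401a10"]]},
    "0x401b40": {"controlled": false, "fault_count": 3,
                 "branch_sequences": [["0x401b00"]]}
  }
}
"""


def controlled_reports(data, min_faults=0):
    """Return (address, fault_count, branch_sequences) for every entry marked
    as controlled with at least `min_faults` faults, most frequent first."""
    reports = []
    for address, entry in data["errors"].items():
        if entry["controlled"] and entry["fault_count"] >= min_faults:
            reports.append(
                (address, entry["fault_count"], entry["branch_sequences"])
            )
    return sorted(reports, key=lambda report: report[1], reverse=True)


for address, faults, sequences in controlled_reports(json.loads(SAMPLE), 200):
    print(f"invalid memory access at {address} ({faults} faults)")
    for sequence in sequences:
        print("  reached after mispredicting:", " -> ".join(sequence))
```

Each reported address is the memory-accessing instruction, and each branch sequence lists the mispredicted branches that led to it, so the addresses in the sequences should disassemble to branches rather than to memory accesses.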

enlighten5 commented 4 years ago

Hi,

Thank you for your information.

I still have problems interpreting the results in aggregated.json. My understanding from the paper is that address is the speculative invalid memory access, and branch_sequences are the mispredicted branches. I checked the addresses that are marked as controlled and found that most of the addresses in branch_sequences are mov rsp, qword [current_rsp], which I think is instrumentation code. Could you please provide more insight into how to interpret the results?

Thank you!

OleksiiOleksenko commented 4 years ago

> My understanding from the paper is that address is the speculative invalid memory access, and branch_sequences are the mispredicted branches.

Yes, that's correct.

> I checked the addresses that are marked as controlled and found that most of the addresses in branch_sequences are mov rsp, qword [current_rsp], which I think is instrumentation code.

This should not happen; it's a bug. It looks like the address is off by a few instructions. Could you please provide more context, i.e., a snippet of assembly with one of these movs together with the 5 instructions above and below it?

enlighten5 commented 4 years ago

Hi,

Please find aggregated.json and the test driver at https://drive.google.com/drive/folders/1XIjIuduCVRJR1qQ8R54Ha42bK5eQX-sX?usp=sharing. I use Binary Ninja to open the binary. For example, the address at line 357 in aggregated.json is mov rsp, qword [current_rsp], which should be the mispredicted branch, and the address at line 284 is cmp dword [rbp-0x4], 0xffffffff, which should be the invalid memory access.

I think the address is off by a few instructions, which makes it hard to understand the results. I also find it hard to relate the assembly code to its source code, especially after adding ASan and the instrumentation. If possible, could you share some experience on how to locate the source code that a given piece of assembly was compiled from?

Thank you

OleksiiOleksenko commented 4 years ago

> the address at line 357 in aggregated.json is mov rsp, qword [current_rsp]

That's indeed an issue. The reported address was supposed to be a few instructions below it. This bug does not influence correctness - the whole block of instructions has the same debug symbol and maps to the same location in the C code. But I agree with you, when manually analysing the assembly it could be confusing. I'll fix it.

> the address at line 284 is cmp dword [rbp-0x4], 0xffffffff

This address is reported correctly. Here, you have a memory access [rbp-0x4] and ASan detected that it is out of bounds.

> I also find it hard to relate the assembly code to its source code, especially after adding ASan and the instrumentation. If possible, could you share some experience on how to locate the source code that a given piece of assembly was compiled from?

analyzer aggregate already does it for you. In aggregated.json you were supposed to have locations in the C sources instead of ??:0:0. Most likely, you do not have the locations because you compiled without debug symbols (-ggdb).
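A quick way to check whether a report carries usable source locations is sketched below. It assumes locations are strings of the form path:line:column, which is what the ??:0:0 placeholder suggests. The helper name and the sample path are hypothetical.

```python
def parse_location(location):
    """Split a 'path:line:column' string into (path, line, column).

    Returns None for the '??:0:0' placeholder, which indicates that the
    address could not be mapped to a source line (for example because the
    binary was built without debug symbols).
    """
    path, line, column = location.rsplit(":", 2)
    if path == "??" or line == "0":
        return None
    return path, int(line), int(column)


print(parse_location("??:0:0"))             # unresolved address
print(parse_location("/src/jsmn.c:142:9"))  # hypothetical resolved location
```

If every location in the report is unresolved, rebuilding the target with -ggdb and re-running the aggregation is the first thing to try.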

OleksiiOleksenko commented 4 years ago

> I realized that error means something went wrong with my program, and I fixed it.

Ok, I'm closing this issue then.