How to go from reduced bugs to understanding the underlying cause?

comsec-group / cascade-artifacts

Artifacts for Cascade: CPU Fuzzing via Intricate Program Generation (USENIX Security 2024)


How to go from reduced bugs to understanding the underlying cause? #14

Closed hakase56557 closed 3 months ago

hakase56557 commented 3 months ago

Hello! Apologies in advance, as these might be beginner questions. I was using Cascade to find a bug in our CPU (which is based on CVA6). The bug only seems to occur when booting Linux, not with our handwritten assembly test cases, which is how I found out about Cascade. I managed to add the design repo to Cascade, but I still have a couple of questions about tracking down the bug that I hoped you could help with:

About basic blocks: Cascade only seems to remove a couple of instructions in the smaller ELF. Is this expected behaviour?
I tried to compare the ELFs by manually running them on Spike, but it complained about the lack of tohost/fromhost symbols. Do we need to worry about those? (Or does the run duration depend on the SIMLEN variable?)

Thanks in advance!

flaviens commented 3 months ago

Hi @hakase56557, Thanks for reaching out!

flaviens commented 3 months ago

Using Cascade, did you find programs that run incorrectly on your CPU?
If yes, could you successfully run the reduction script of some such programs?

If you could run the reduction correctly (it is not guaranteed to work on all cases, it's still a research prototype 😅 ), then it produces two ELFs (the biggest correct and the smallest incorrect) and the corresponding assembly dumps. Looking at the difference between the two dumps is usually sufficient.
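For comparing the two dumps, a minimal Python sketch along these lines may help. It uses only the standard-library `difflib` and returns just the lines that differ. The function name `diff_dumps` and the two sample dumps are made up for illustration; they are not output from Cascade, and the real dumps may be formatted differently.

```python
import difflib

def diff_dumps(correct_dump: str, buggy_dump: str) -> list[str]:
    """Return only the removed ('-') and added ('+') lines between two dumps."""
    lines = list(
        difflib.unified_diff(
            correct_dump.splitlines(),
            buggy_dump.splitlines(),
            lineterm="",
            n=0,  # no context lines, only the differences
        )
    )
    # The first two entries (if any) are the '---'/'+++' file headers;
    # '@@' hunk headers are dropped by the prefix filter.
    return [line for line in lines[2:] if line.startswith(("+", "-"))]

# Hypothetical dump excerpts, invented for this example.
CORRECT = """80000100: addi a0, a0, 1
80000104: j 8000010c
80000108: addi a1, a1, 2"""

BUGGY = """80000100: addi a0, a0, 1
80000104: divuw a2, a0, a1
80000108: j 80000110"""

if __name__ == "__main__":
    for line in diff_dumps(CORRECT, BUGGY):
        print(line)
```

In practice the two strings would be read from the dump files written by the reduction script, e.g. with `pathlib.Path(...).read_text()`.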

Hope it helps! Flavien

hakase56557 commented 3 months ago

Thanks for the reply @flaviens!

I did manage to find programs that run incorrectly (most of them seem to time out).
I ran the reduction script on a couple of descriptors, and it only seemed to remove 1-2 instructions.

> If you could run the reduction correctly (it is not guaranteed to work on all cases, it's still a research prototype 😅 ), then it produces two ELFs (the biggest correct and the smallest incorrect) and the corresponding assembly dumps.

So as I understand it, do I need to do a bit of trial and error with multiple programs/descriptors?

> Looking at the difference between the two dumps is usually sufficient.

You're suggesting that the program should be reduced enough that, by looking at the difference, I can recognize the instruction I need to focus on in the trace (e.g., in the VCD). That would mean there's no need to compare against an ISS either, right?

flaviens commented 3 months ago

Hi @hakase56557, thanks for your reply! When you say it removes 1-2 instructions, do you refer to the difference of the two ELFs produced by the reduction script, correct? If yes, the diff is probably:

in the biggest working ELF, a jump followed by another instruction that does not matter;
in the smallest buggy ELF, some instruction (that I will refer to here as the buggy instruction) followed by a jump.

If that's the case, you have most likely found the buggy instruction. It will behave wrongly in the architectural context set up by the earlier part of the ELF. To learn this architectural context, you can, e.g., run Spike until it reaches this instruction (without executing the instruction, of course) and dump the registers, etc.
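As a rough sketch of that Spike step, the Python helper below builds the commands one could type at Spike's interactive debug prompt (started with `spike -d <elf>`). The helper name `spike_debug_commands` and the example address are hypothetical. The `until pc`, `reg`, and `quit` commands are, to my knowledge, part of Spike's interactive debug mode, but please check them against the `help` output of your Spike build.

```python
def spike_debug_commands(stop_pc: int, core: int = 0) -> list[str]:
    """Build debug-prompt commands that stop at `stop_pc` and dump registers.

    Assumption: Spike's debug prompt reads addresses as hexadecimal,
    so the PC is formatted as hex digits without a prefix.
    """
    return [
        # Run until the PC equals stop_pc, i.e. just before that
        # instruction executes.
        f"until pc {core} {stop_pc:x}",
        # Dump the integer registers of the selected hart.
        f"reg {core}",
        "quit",
    ]

if __name__ == "__main__":
    # 0x80000104 is a made-up address standing in for the buggy instruction.
    for command in spike_debug_commands(0x80000104):
        print(command)
```

The register values printed at that point describe the architectural context in which the buggy instruction runs, which can then be compared against the RTL trace.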

This should be enough to tell you what's wrong. Hope it helps!

hakase56557 commented 3 months ago

I think that makes sense.

> When you say it removes 1-2 instructions, do you refer to the difference of the two ELFs produced by the reduction script, correct?

Yep, that's what's happening.

I'll try proceeding as you suggest. Thanks for your help!