google / silifuzz

Apache License 2.0

Questions on Silifuzz measurement results on CloudLab #3

Open Maknee opened 1 year ago

Maknee commented 1 year ago

Hi Silifuzz,

We used SiliFuzz to run a large-scale measurement (10K hours in total) on 200 CloudLab machines, to understand more about SDC characteristics. The detailed setup is attached below.

Surprisingly, we didn’t observe any SDC besides a false positive (which I reported in October).

The Google and Meta papers report, respectively, "on the order of a few mercurial cores per several thousand machines" and an "SDC occurrence rate of one in thousand silicon devices".

I’m writing to ask whether you have any insights into our observation. Are the SDCs you observed specific to certain CPU families (e.g., Intel CPUs)? Or would you suggest an even larger-scale testbed than CloudLab?

Our measurement setup is:

We made a few changes (https://github.com/xlab-uiuc/SDCBench):

ksteuck commented 1 year ago

hi @Maknee

Thanks for taking the time to try the tool and provide feedback! I cannot go into a lot of details but let me provide some quotes that may give you insight regarding the scale of the fleet and the scanning infrastructure backing the findings reported in the papers.

ksteuck commented 1 year ago

One actionable thing you can do is try to find better proxies to further explore the "fuzzing by proxy" concept. Potential candidates include XED, Unicorn v2, Bochs and any other x86 emulators you may have access to.

tianyin commented 1 year ago

@ksteuck Thank you for the answer!

We did read all the papers you quoted ("Cores that don't count", the Meta ones, and the SiliFuzz paper) and that's exactly where our questions came from :)

Basically, we ran SiliFuzz on the largest fleet we could find in academia, i.e., CloudLab, but we were not able to observe SDCs.

We wonder whether this means we simply don't have a large enough fleet to measure/observe SDCs in an academic setting, whether we measured the wrong CPUs, or whether we are doing something wrong that fails to capture SDCs. We'd love to hear your thoughts!
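For a rough sense of scale, here is our own back-of-envelope calculation, assuming the ~1-in-1000 device rate quoted from the papers (the true per-core rate, and how it varies by CPU family, are unknown):

```python
import math

# Back-of-envelope: if roughly 1 in 1000 devices is "mercurial"
# (rate quoted from the Google/Meta papers; treated here as an
# assumption), how likely is a 200-machine fleet to contain at
# least one such device?
devices = 200
rate = 1 / 1000  # assumed defect rate per device

expected_defective = devices * rate  # expected mercurial devices in the fleet
# Poisson approximation for P(at least one defective device)
p_at_least_one = 1 - math.exp(-expected_defective)

print(f"expected mercurial devices: {expected_defective:.2f}")   # 0.20
print(f"P(fleet contains >= 1):     {p_at_least_one:.2%}")       # ~18%
```

Under that assumption, a 200-machine fleet would most often contain zero mercurial devices, so seeing no SDCs is not by itself surprising.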

Regarding your last point, we ran the open-source SiliFuzz (which I believe uses Unicorn). Are you hinting that using a different proxy (e.g., XED) could better expose SDCs?

ksteuck commented 1 year ago

We wonder whether this means we simply don't have a large enough fleet to measure/observe SDCs in an academic setting, whether we measured the wrong CPUs, or whether we are doing something wrong that fails to capture SDCs.

The answer may be "all of the above". I'm not familiar with the details of your setup but I can address the question of scale and content quality based on what has been published on the topic:

Regarding your last point, we ran the open-source SiliFuzz (which I believe uses Unicorn). Are you hinting that using a different proxy (e.g., XED) could better expose SDCs?

SiliFuzz itself is proxy-agnostic. We provide a sample Unicorn-based proxy but what I was suggesting is exploring other proxies to improve coverage. Better proxies should provide better coverage (or at least that's the heuristic behind SiliFuzz). Unicorn is a rather "weak" proxy.
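The proxy concept above can be sketched in a few lines. This is illustrative only (SiliFuzz's real snapshot format, runner, and state hashing differ; the emulator and cores below are hypothetical stand-ins): a trusted proxy records the expected end state for an instruction sequence, the runner replays it on real silicon, and any divergence is an SDC candidate.

```python
# Minimal sketch of "fuzzing by proxy": the proxy (e.g. a Unicorn- or
# Bochs-based emulator) records an expected end state per snapshot;
# replaying on a real core and comparing states flags SDC candidates.
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Snapshot:
    code: bytes           # instruction bytes to execute
    expected_state: int   # end-state hash recorded by the proxy

def make_snapshot(code: bytes, proxy: Callable[[bytes], int]) -> Snapshot:
    """Run the code under the (trusted) proxy to record the expected state."""
    return Snapshot(code=code, expected_state=proxy(code))

def replay_on_core(snapshot: Snapshot, core: Callable[[bytes], int]) -> bool:
    """Replay on a real core; a mismatch is an SDC candidate."""
    return core(snapshot.code) == snapshot.expected_state

# Hypothetical stand-ins: a deterministic "emulator" and a "core" with a
# stuck bit, standing in for a defective execution unit.
emulator = lambda code: sum(code) & 0xFF
healthy_core = emulator
buggy_core = lambda code: (sum(code) & 0xFF) ^ 0x04  # one flipped bit

snap = make_snapshot(b"\x48\x01\xd8", emulator)
print(replay_on_core(snap, healthy_core))  # True: states match
print(replay_on_core(snap, buggy_core))    # False: SDC candidate
```

The key point is that `make_snapshot` is the only proxy-dependent step: a stronger proxy yields snapshots that exercise more of the silicon, which is why swapping Unicorn for a richer emulator can improve coverage without changing the runner.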

Finally, we've recently published a corpus of about 300k snapshots based on Unicorn.