plast-lab / doop

The official repo of Doop, the declarative pointer analysis framework.
http://plast-lab.github.io/
Other
160 stars 21 forks source link

Detected non-deterministic results under various configurations #8

Open AnnabellaM opened 11 months ago

AnnabellaM commented 11 months ago

Hi, I have recently been using Doop for an empirical study to detect non-deterministic behaviors in static analyzers. The experiments resulted in discovering some nondeterministic analysis results across multiple runs under various configurations of Doop.

The details of the experimental setup are as below:

In the end, the experiments detected non-deterministic results on 6 programs. None of the programs were from CATS, and all 6 programs were from the DaCapo-2006. These results were detected when the configuration sets its analysis option to context-insensitive.

The attached data is the detected nondeterministic results from CATS and DaCapo-2006 and configuration files (note1: the configurations are hash-coded in the detected results, but the actual configuration options and values that each hash code stands for can be found in the attached configuration files.)

yanniss commented 11 months ago

Hello and thanks for your work. It's not surprising that Doop exhibits non-determinism. There are several sources of non-determinism. First, even the input facts of the analysis are non-deterministic, based on the Soot framework, whose output depends on the order of finding classes. Second, some algorithms in our reflection analysis (at least) are randomized: they elect a representative from an equivalence class of classes and process the classes based on the representatives.

The hope is that the very high-level outputs will be mostly deterministic. But if one wants strict determinism, they need to at least use a previously produced set of input facts, instead of re-deriving them. See doop --help fact-generation, probably the --facts-only flag and later the --input-id flag, over the previously-produced facts.

Even this may not be enough, depending on the reflection setting, but it will remove the main source of non-determinism.