ExpoSEJS / ExpoSE

A Dynamic Symbolic Execution (DSE) engine for JavaScript. ExpoSE is highly scalable, compatible with recent JavaScript standards, and supports symbolic modelling of strings and regular expressions.
MIT License
185 stars 36 forks source link

Documentation #53

Closed lf94 closed 5 years ago

lf94 commented 5 years ago

For now I'm going through the examples and other source material, but having some documentation bundled would be really nice. I'm looking at using ASTs and symbolic execution in general to help with verifying work and compare structures of code.

From some playing around I've deduced:

Still not sure what _bound means.

jawline commented 5 years ago

Sorry, I only just saw the issue. Yes, at the moment proper documentation is pretty sparse.

DSE works by executing the program concretely with fixed inputs, but tracking any operation that occurs to those fixed inputs as the program runs. Once the program has finished executing an SMT solver is used to decide whether there is any variation of that input that would have followed a different control flow path through the code. If the SMT solver tells us that there is an alternate input then we re-execute the program concretely with that new input. The process is repeated until all control flows are exhausted (full coverage) or, more realistically, some reasonable time limit has passed (many programs have an infinite number of possible walks through them).

The seed variables are just values used to drive this initial input. For example, if we start with a seed of x = 4 on the program

var x = S$.symbol('x', 4);
if (x < 4) { 
 throw 'Bug';
}

then the seed path will not throw, the SMT solver will tell us that modifying x such that it is less than 4 is possible, and we will retry with some x < 4, which will throw.

ExpoSE uses SAGE style symbolic execution (https://patricegodefroid.github.io/public_psfiles/ndss2008.pdf), in this style of symbolic execution the _bound variable is used to avoid repeating paths that have already been covered. The algorithm is explained in Figure 3 of the paper.

At some point in the future I want to make the path conditions more human-readable (though there will always be an element of crazyness to them, especially when expanding symbolic objects, due to the necessity for automated naming).

Does this help at all?