"Measure" difficulty of formulas

One of the main questions of the professors was: "Why are the dots so spread for larger formulas?"

We should be able to answer that question in a "scientific" way, i.e. something like this: "The formula on the top of the plot is very difficult to solve, the dpll algorithm needs to make 25 branching steps until it is able to solve the formula. On the other hand the formula on the bottom is very easy to solve, the dpll algorithm doesn't need to branch at all".

We should come up with a numerical representation of the difficulty of a formula, one suggestion (used above) is to use the maximal branching depth until a solution is found or unsat is proven. Note that that does not mean we can just count the number of "branches" taken during an execution, we need to associate true and false branches on the same variable and count them as one. Effectively we need to find the depth of the tree. But counting the number of branches in total would also be a metric, it would tell us how "bad" the decisions that our algorithm made are. I would suggest we do both...

A good starting point is to change the stderr output in dpll.cpp such that we record the necessary information. We could then create a python script that reads the stderr output and comes up with the branching depth and the overall number of taken branches.

I suggest that we just do it on the sequential version, as it will be identical to the parallel one (this is a property of the cnf formula).

limo1996 / SAT-Solver

"Measure" difficulty of formulas #19