MIT IEEE URTC 2024. GSET 2024. Repository for the "MBASED: Practical Simplifications of Mixed Boolean-Arithmetic Obfuscation". A Binary Ninja decompiler plugin taking ideas from compiler construction to simplify obfuscated boolean expressions.
We will be benchmarking two potential ways to simplify boolean statements for our solver. This first technique leverages the SMT solver, z3. The goal of this issue is to write an optimization pass for our solver to leverage which uses the z3 backend, an SMT solver.
Action Items
[x] Create a file called solver/passes/z3_pass.py. Follow the format shown in template_pass.py as this is the required format for our solver module.
[x] Write a visitor called Z3MappingVisitor whose job is to visit all nodes and when a Var is reached, the goal is to initialize a z3.Bool object which represents the Boolean symbol and add it to a dict[str, z3.Bool]. This will allow us to quickly reference the same boolean objects / symbols when we translate the AST into a z3 equation. This should inherit from the Visitor class. A similar example is shown in the template_pass.py file. For example, a mapping of A | B will have { "A" : Bool("A"), "B" : Bool("B") }.
[x] Write a visitor called TranslateToZ3 which converts the AST to a sympy data structure. The goal of this is to represent the AST in a data structure that can be simplified by the sympy backend. For example, for A | B the data structure will be Or(Bool("A"), Bool("B"). There are a few ways to do it but one way is to inherit from RetVisitor and override all methods on the visitor to return part of the data structure. The visitVarRet should use the dictionary / mapping we created from the Z3MappingVisitor since we want to reuse the same instance of a variable rather than create a new instance every time. One quick note is that for identifying the And or Or, you may need to use isinstance Python function to see the the Expr node is an instance of an AndExpr or an OrExpr when trying to translate this inside the visit function for VarExpr. An example of on to use `RetVisitor is shown in this example pass.
[ ] After building the equation, call the simplify function from the z3 library. There are some parameters you can play around with on this library so if you have time, try and see if we can maybe optimize further. An example of how to call this library is shown here.
[x] Get the string representation of the simplified boolean expression and use the Parser to build this back into an AST to be returned.
Here is the general flow of our pass.
# Z3MappingVisitor produces a mapping of the string representation of `Var`s to the `z3.Bool` instances
# TranslateToZ3 produces the z3 representation of the boolean expression
# `simplify` is called on the boolean expression to produced the simplified output.
# pass the string representation of the simplified output back into the `Parser` so we can return the simplified AST.
We will be benchmarking two potential ways to simplify boolean statements for our solver. This first technique leverages the SMT solver, z3. The goal of this issue is to write an optimization pass for our solver to leverage which uses the z3 backend, an SMT solver.
Action Items
solver/passes/z3_pass.py
. Follow the format shown intemplate_pass.py
as this is the required format for our solver module.Z3MappingVisitor
whose job is to visit all nodes and when aVar
is reached, the goal is to initialize az3.Bool
object which represents the Boolean symbol and add it to adict[str, z3.Bool]
. This will allow us to quickly reference the same boolean objects / symbols when we translate the AST into a z3 equation. This should inherit from theVisitor
class. A similar example is shown in thetemplate_pass.py
file. For example, a mapping ofA | B
will have{ "A" : Bool("A"), "B" : Bool("B") }
.TranslateToZ3
which converts the AST to a sympy data structure. The goal of this is to represent the AST in a data structure that can be simplified by thesympy
backend. For example, forA | B
the data structure will beOr(Bool("A"), Bool("B")
. There are a few ways to do it but one way is to inherit fromRetVisitor
and override all methods on the visitor to return part of the data structure. ThevisitVarRet
should use the dictionary / mapping we created from theZ3MappingVisitor
since we want to reuse the same instance of a variable rather than create a new instance every time. One quick note is that for identifying theAnd
orOr
, you may need to useisinstance
Python function to see the theExpr
node is an instance of anAndExpr
or anOrExpr
when trying to translate this inside the visit function forVarExpr
. An example of on to use `RetVisitor is shown in this example pass.simplify
function from thez3
library. There are some parameters you can play around with on this library so if you have time, try and see if we can maybe optimize further. An example of how to call this library is shown here.Parser
to build this back into an AST to be returned.Here is the general flow of our pass.
Resources
Specifically, check out the "Boolean Logic" section of this tutorial. It shows all of the relevant z3 functions and classes we will be using.