SVF-tools / SVF

Static Value-Flow Analysis Framework for Source Code
http://svf-tools.github.io/SVF/
Other
1.43k stars 435 forks source link

taint analysis #116

Open here4thee opened 5 years ago

here4thee commented 5 years ago

Hi,

I have some questions about using your tool for taint analysis. I want to annotate all function arguments (user inputs) as sources, and annotate left operands of add/sub/mul operations in LLVM as sinks. Then I need to check the reachability from sources to sinks.

  1. How can I get names or nodeIDs of function arguments from symbol table?
  2. How can I filter out add/sub/mul operations (LLVM statements) and get the names or nodeIDs of the left operands?
  3. If reachable from source to sink, can I solve the value of the source to reach sink based on constraint graph?

For example, in the following LLVM assembly code, %"assign_value.1" is the left operand of add operation, and %".15" propogates data flow to it. We can see that %".15" loads from %"_value_transfer1Function" which gets value from %".1" (user input). So we annotate %".1" as source and %"assign_value.1" as sink. Apparently, the reachability is yes. Next, we want to solve the value of %".1" because we need the code flows into "endRequire" block instesd of "sanityCheckUnsatisfied". So we want to know what value should users feed to %".1" when this statement

br i1 %"notNull", label %"endRequire", label %"sanityCheckUnsatisfied"

satisfies the %"notNull" condition and branch to label %"endRequire".

Here is the LLVM code example,

define i32 @"transfer1Function"(i32 %".1") 

{

entry:

  %"_value_transfer1Function" = alloca i32

  store i32 %".1", i32* %"_value_transfer1Function"

  %"binary_result" = alloca i32

  br label %"requireStart"

requireStart:

  %".5" = load i32, i32* @"balanceOfSender_global"

  %".6" = load i32, i32* %"_value_transfer1Function"

  %"cmpOP" = icmp uge i32 %".5", %".6"

  %".7" = zext i1 %"cmpOP" to i32

  store i32 %".7", i32* %"binary_result"

  %".9" = load i32, i32* %"binary_result"

  %"notNull" = icmp ne i32 %".9", 0

  br i1 %"notNull", label %"endRequire", label %"sanityCheckUnsatisfied"

sanityCheckUnsatisfied:

  ret i32 -1

endRequire:

  %".12" = load i32, i32* %"_value_transfer1Function"

  %".13" = load i32, i32* @"balanceOfSender_global"

  %"assign_value" = sub i32 %".13", %".12"

  store i32 %"assign_value", i32* @"balanceOfSender_global"

  %".15" = load i32, i32* %"_value_transfer1Function"

  %".16" = load i32, i32* @"balanceOfReceiver_global"

  %"assign_value.1" = add i32 %".16", %".15"

  store i32 %"assign_value.1", i32* @"balanceOfReceiver_global"

  ret i32 0

}

Thanks

yuleisui commented 5 years ago

Please see my inline reply below:

Hi,

I have some questions about using your tool for taint analysis. I want to annotate all function arguments (user inputs) as sources, and annotate left operands of add/sub/mul operations in LLVM as sinks. Then I need to check the reachability from sources to sinks.

1. How can I get names or nodeIDs of function arguments from symbol table?

You can use pag->getValueNode(llvm_value);

2. How can I filter out add/sub/mul operations (LLVM statements) and get the names or nodeIDs of the left operands?

This is LLVM APIs. Is getOperand(0) you want?

3. If reachable from source to sink, can I solve the value of the source to reach sink based on constraint graph?

You can, but constraint graph is flow-insensitive. I suggest you take a look at SVFG and Saber.

For example, in the following LLVM assembly code, %"assign_value.1" is the left operand of add operation, and %".15" propogates data flow to it. We can see that %".15" loads from %"_value_transfer1Function" which gets value from %".1" (user input). So we annotate %".1" as source and %"assign_value.1" as sink. Apparently, the reachability is yes. Next, we want to solve the value of %".1" because we need the code flows into "endRequire" block instesd of "sanityCheckUnsatisfied". So we want to know what value should users feed to %".1" when this statement

For current SVFG, the nodes on it represent pointers rather than scalars. Please see a thread here: https://github.com/SVF-tools/SVF/issues/104

You may possibly seek help from there.

br i1 %"notNull", label %"endRequire", label %"sanityCheckUnsatisfied"

satisfies the %"notNull" condition and branch to label %"endRequire".

Here is the LLVM code example,

define i32 @"transfer1Function"(i32 %".1") 

{

entry:

  %"_value_transfer1Function" = alloca i32

  store i32 %".1", i32* %"_value_transfer1Function"

  %"binary_result" = alloca i32

  br label %"requireStart"

requireStart:

  %".5" = load i32, i32* @"balanceOfSender_global"

  %".6" = load i32, i32* %"_value_transfer1Function"

  %"cmpOP" = icmp uge i32 %".5", %".6"

  %".7" = zext i1 %"cmpOP" to i32

  store i32 %".7", i32* %"binary_result"

  %".9" = load i32, i32* %"binary_result"

  %"notNull" = icmp ne i32 %".9", 0

  br i1 %"notNull", label %"endRequire", label %"sanityCheckUnsatisfied"

sanityCheckUnsatisfied:

  ret i32 -1

endRequire:

  %".12" = load i32, i32* %"_value_transfer1Function"

  %".13" = load i32, i32* @"balanceOfSender_global"

  %"assign_value" = sub i32 %".13", %".12"

  store i32 %"assign_value", i32* @"balanceOfSender_global"

  %".15" = load i32, i32* %"_value_transfer1Function"

  %".16" = load i32, i32* @"balanceOfReceiver_global"

  %"assign_value.1" = add i32 %".16", %".15"

  store i32 %"assign_value.1", i32* @"balanceOfReceiver_global"

  ret i32 0

}

Thanks