cisco-open / llvm-crash-analyzer

llvm crash analysis
Apache License 2.0
40 stars 17 forks source link

Starting Taint for simple register operands #35

Closed niktesic closed 1 year ago

niktesic commented 1 year ago

Test source code:

void fun(int* ptr, int a)
{
    *ptr = a;
}

int main(){
  int *p = 0;
  fun(p,4);
  return 0;
}

Compiled with: $ clang -g -O0 m.c -o m

Consider following MIR instruction, which is at the crash point in the presented test. crash-start MOV32mr $rax, 1, $noreg, 0, $noreg, $ecx

Currently, in the startTaint(), we try to insert to the Taint List all of the tree possible locations:

  1. Memory location at the address $rax + 0 (destination operand) - inserted
  2. Register $rax (destination operand base register) - not inserted, since it is treated equal to the location 1.
  3. Simple register location $ecx (source operand) - inserted

Is it valid to taint a simple register location, if it is not a base register of a memory operand? This is done in TaintAnalysis::startTaint().

In the presented case, the MIR level data flow graph looks like: dfg

From the DFG, is it obvious that path, which starts from register location $ecx leads to the wrong blame instruction. Based on the constructed DFG, llvm-crash-analyzer cannot decide which leaf node is the blame one.

ananthakri commented 1 year ago

In my opinion, the initial taint list should include the destination memory operand. By the semantics of the crashing instruction and SEGV violation, it is clear that the memory operand is tainted. The path on the right hand side in above DFG is not needed.

niktesic commented 1 year ago

In my opinion, the initial taint list should include the destination memory operand. By the semantics of the crashing instruction and SEGV violation, it is clear that the memory operand is tainted. The path on the right hand side in above DFG is not needed.

To my understanding, source operand of the crash instruction should be in the taint list if it is a memory operand. For example, in $rax = MOV $rbx, 8, invalid value of $rbx + 8 could cause crash in general case. In most cases, I've seen, the value of the base register $rbx is responsible for the crash. Anyways, any memory operand could cause crash and I don't think it matters if it is a destination or a source.

On the other hand, I think that simple register location, like $rax in the example above, cannot cause crash in any case, so we should not add it to the Taint List.

Thanks, Nikola