cisco-open / llvm-crash-analyzer

llvm crash analysis
Apache License 2.0
40 stars 17 forks source link

[TA] Fix startTaint to consider only memory operands #45

Closed niktesic closed 1 year ago

niktesic commented 1 year ago

Description (part 1)

This PR fixes https://github.com/cisco-open/llvm-crash-analyzer/issues/35.

As explained in https://github.com/cisco-open/llvm-crash-analyzer/issues/35 (#issue35), it is important to start taint for all memory operands of the crash-start instruction, since memory access can cause crash (Segmentation Fault). On the other hand, simple register operands and immediate operands cannot cause crash, since they are not used for memory access in the crash-start instruction.

Let's consider following expression (memory operand), which describes address in x86_64 architecture:

Address = base-register + (scale + index-register) + offset

In the LLVM MIR, each one of these entities (base-reg, scale, index-reg, offset) is a separate MachineOperand of the MachineInstr. It is not necessary to use each one of these MachineOperands in the address expression. Usually, memory address is expressed only as base-register + offset, where base-reg could be $noreg in some cases (ex. global variable absolute address). Currently, we don't support tracking memory addresses with "scaled-index" (where index-reg is used).

Introduced changes 1 (commit 1)

To match the described address format, this PR performs a small change of DestSourcePair implementation in TargetInstrInfo. Additionally, to be able to detect memory operands in DestSourcePair, the offset for corresponding base-reg (Destination, Source or Source2) is set by default for each memory operand (is not equal to Optional::None).

Description (part 2)

From described memory operand, we should consider as potentially broken, the following locations:

  1. whole memory address (currently, only for base-reg + offset memory locations)
  2. base register (simple register location base-reg)
  3. index register (simple register location index-reg)

Introduced changes 2 (commit 1)

This PR, also, changes the implementation of TaintAnalysis::startTaint, in a way that it tries to create a TaintInfo and add it to the TaintList, only if it comes from a memory operand and it represents a whole memory address, a base register or an index register.

Test example from https://github.com/cisco-open/llvm-crash-analyzer/issues/35

Before: image

After: new

The branch in the TaintDFG which comes from incorrectly tainted simple register operand $ecx is cut off by this PR.

Additional changes (commit 2)

This PR also adds support for creation of DestSourcePair for some instructions and to skip analysis of RET instructions when encountered during Backwards traversal. Also, when scale or index-reg MachineOperand is $noreg, corresponding field of DestSourcePair is set to nullptr.