secure-software-engineering / phasar

A LLVM-based static analysis framework.
Other
933 stars 140 forks source link

IDE Solver Strategy #669

Open fabianbs96 opened 1 year ago

fabianbs96 commented 1 year ago

Currently, the IDESolver propagates data-flow information to the successor instruction of the current instruction. While this makes the implementation easier, for users this is sometimes unintuitive, mostly because the Dragonbook defines data-flow propagations differently: The effects of an instruction to the data flows should be visible at that same instruction already. In addition, some information at branching statements may have already been merged with the current approach; therefore, it may not always be possible to determine the origin basicblock of a particular data-flow fact (see LLVM's invoke instruction).

This PR addresses above issues by adding the following contributions:

yuffon commented 11 months ago

Hi @fabianbs96, I have an issue with the PropagateOnto strategy. If I use the default PropagateOver strategy, my typestate code works well. But when using PropagateOnto strategy, some dataflow facts vanish.

The analyzed code is

#include <stdio.h>
#include <malloc.h>

int use_after_free()
{
    int *p = (int *)malloc(16);
    *p = 1;
    free(p);
    *p = 2;
    return 0;
}

int main(void)
{

    int (* p)() = & use_after_free; 
    int i = p();

    return 0;
}

The IR file is

; ModuleID = 'func-pointer-use-after-free.c'
source_filename = "func-pointer-use-after-free.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @use_after_free() #0 {
entry:
  %p = alloca i32*, align 8
  %call = call noalias i8* @malloc(i64 noundef 16) #2
  %0 = bitcast i8* %call to i32*
  store i32* %0, i32** %p, align 8
  %1 = load i32*, i32** %p, align 8
  store i32 1, i32* %1, align 4
  %2 = load i32*, i32** %p, align 8
  %3 = bitcast i32* %2 to i8*
  call void @free(i8* noundef %3) #2
  %4 = load i32*, i32** %p, align 8
  store i32 2, i32* %4, align 4
  ret i32 0
}

; Function Attrs: nounwind
declare dso_local noalias i8* @malloc(i64 noundef) #1

; Function Attrs: nounwind
declare dso_local void @free(i8* noundef) #1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %p = alloca i32 (...)*, align 8
  %i = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  store i32 (...)* bitcast (i32 ()* @use_after_free to i32 (...)*), i32 (...)** %p, align 8
  %0 = load i32 (...)*, i32 (...)** %p, align 8
  %call = call i32 (...) %0()
  store i32 %call, i32* %i, align 4
  ret i32 0
}

attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #1 = { nounwind "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #2 = { nounwind }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{!"clang version 14.0.6 (https://github.com/llvm/llvm-project.git f28c006a5895fc0e329fe15fead81e37457cb1d1)"}

The dumped results of using PropagateOver strategy is

N: %p = alloca i32*, align 8, !psr.id !4 | ID: 0
------------------------------------------------
    D: Fact :{ Obj:11,state:0 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: %call = call noalias i8* @malloc(i64 noundef 16) #2, !psr.id !5 | ID: 1
--------------------------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:0 } | V: BOTTOM

N: %0 = bitcast i8* %call to i32*, !psr.id !6 | ID: 2
-----------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: store i32* %0, i32** %p, align 8, !psr.id !7 | ID: 3
-------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: %1 = load i32*, i32** %p, align 8, !psr.id !8 | ID: 4
--------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: store i32 1, i32* %1, align 4, !psr.id !9 | ID: 5
----------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: %2 = load i32*, i32** %p, align 8, !psr.id !10 | ID: 6
---------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: %3 = bitcast i32* %2 to i8*, !psr.id !11 | ID: 7
---------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: call void @free(i8* noundef %3) #2, !psr.id !12 | ID: 8
----------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: %4 = load i32*, i32** %p, align 8, !psr.id !13 | ID: 9
---------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:2 } | V: BOTTOM

N: store i32 2, i32* %4, align 4, !psr.id !14 | ID: 10
------------------------------------------------------
    D: Fact :{ Obj:11,state:2 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: ret i32 0, !psr.id !15 | ID: 11
----------------------------------
    D: Fact :{ Obj:11,state:3 } | V: BOTTOM
    D: Λ | V: BOTTOM

============ Results for function 'main' ============

N: %p = alloca i32 (...)*, align 8, !psr.id !17 | ID: 13
--------------------------------------------------------
    D: Fact :{ Obj:11,state:0 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: %i = alloca i32, align 4, !psr.id !18 | ID: 14
-------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:0 } | V: TOP

N: store i32 0, i32* %retval, align 4, !psr.id !19 | ID: 15
-----------------------------------------------------------
    D: Fact :{ Obj:11,state:0 } | V: TOP
    D: Λ | V: TOP

N: store i32 (...)* bitcast (i32 ()* @use_after_free to i32 (...)*), i32 (...)** %p, align 8, !psr.id !20 | ID: 16
------------------------------------------------------------------------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:0 } | V: TOP

N: %0 = load i32 (...)*, i32 (...)** %p, align 8, !psr.id !21 | ID: 17
----------------------------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:0 } | V: TOP

N: %call = call i32 (...) %0(), !psr.id !22 | ID: 18
----------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:0 } | V: BOTTOM

N: store i32 %call, i32* %i, align 4, !psr.id !23 | ID: 19
----------------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:3 } | V: TOP

N: ret i32 0, !psr.id !24 | ID: 20
----------------------------------
    D: Fact :{ Obj:11,state:3 } | V: TOP
    D: Λ | V: TOP

Everything is OK up to now. But with the propagateOnto strategy, the dumped results are


N: %p = alloca i32*, align 8, !psr.id !4 | ID: 0
------------------------------------------------
    D: Fact :{ Obj:11,state:0 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: %call = call noalias i8* @malloc(i64 noundef 16) #2, !psr.id !5 | ID: 1
--------------------------------------------------------------------------
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: %0 = bitcast i8* %call to i32*, !psr.id !6 | ID: 2
-----------------------------------------------------
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: store i32* %0, i32** %p, align 8, !psr.id !7 | ID: 3
-------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: %1 = load i32*, i32** %p, align 8, !psr.id !8 | ID: 4
--------------------------------------------------------
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: store i32 1, i32* %1, align 4, !psr.id !9 | ID: 5
----------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM

N: %2 = load i32*, i32** %p, align 8, !psr.id !10 | ID: 6
---------------------------------------------------------
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: %3 = bitcast i32* %2 to i8*, !psr.id !11 | ID: 7
---------------------------------------------------
    D: Fact :{ Obj:11,state:1 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: call void @free(i8* noundef %3) #2, !psr.id !12 | ID: 8
----------------------------------------------------------
    D: Λ | V: BOTTOM
    D: Fact :{ Obj:11,state:2 } | V: BOTTOM

N: %4 = load i32*, i32** %p, align 8, !psr.id !13 | ID: 9
---------------------------------------------------------
    D: Fact :{ Obj:11,state:2 } | V: BOTTOM
    D: Λ | V: BOTTOM

N: store i32 2, i32* %4, align 4, !psr.id !14 | ID: 10
------------------------------------------------------
    D: Fact :{ Obj:11,state:3 } | V: BOTTOM
    D: Λ | V: BOTTOM

============ Results for function 'main' ============

N: %p = alloca i32 (...)*, align 8, !psr.id !17 | ID: 13
--------------------------------------------------------
    D: Fact :{ Obj:11,state:0 } | V: TOP
    D: Λ | V: TOP

N: %i = alloca i32, align 4, !psr.id !18 | ID: 14
-------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:0 } | V: TOP

N: store i32 0, i32* %retval, align 4, !psr.id !19 | ID: 15
-----------------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:0 } | V: TOP

N: store i32 (...)* bitcast (i32 ()* @use_after_free to i32 (...)*), i32 (...)** %p, align 8, !psr.id !20 | ID: 16
------------------------------------------------------------------------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:0 } | V: TOP

N: %0 = load i32 (...)*, i32 (...)** %p, align 8, !psr.id !21 | ID: 17
----------------------------------------------------------------------
    D: Λ | V: TOP
    D: Fact :{ Obj:11,state:0 } | V: TOP

N: %call = call i32 (...) %0(), !psr.id !22 | ID: 18
----------------------------------------------------
    D: Λ | V: TOP

N: store i32 %call, i32* %i, align 4, !psr.id !23 | ID: 19
----------------------------------------------------------
    D: Λ | V: TOP

As shown above, the last two instructions after calling the function have no dataflow facts. It seems that dataflow facts do not come out from the invoked function. In fact, I use identity flow function for all return edges. Besides, the default strategy works well, so it may not be the fault of flow functions.

fabianbs96 commented 9 months ago

Hi @yuffon,

I have debugged your issue. There was indeed a bug in the PropagateOntoStrategy implementation. Can you check, whether the current version works?

In fact, I use identity flow function for all return edges.

Besides that, it is not a good idea to skip fact-mapping in the call-flow-function and in the ret-flow-function. This way you will not get context-sensitive results at the end

yuffon commented 9 months ago

It works on my problem now. Great. Thanks.

At 2023-12-14 02:34:19, "Fabian Schiebel" @.***> wrote:

Hi @yuffon,

I have debugged your issue. There was indeed a bug in the PropagateOntoStrategy implementation. Can you check, whether the current version works?

In fact, I use identity flow function for all return edges.

Besides that, it is not a good idea to skip fact-mapping in the call-flow-function and in the ret-flow-function. This way wou will not get context-sensitive results at the end

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

yuffon commented 3 months ago

Hi @yuffon,

I have debugged your issue. There was indeed a bug in the PropagateOntoStrategy implementation. Can you check, whether the current version works?

In fact, I use identity flow function for all return edges.

Besides that, it is not a good idea to skip fact-mapping in the call-flow-function and in the ret-flow-function. This way you will not get context-sensitive results at the end

Hi Fabian, Recently, I meet one issue when using PropagateOntoStrategy for backward analysis.

My backward dataflow analysis is a specialized typestate analysis. Flow functions are as follows.

call-to-return: killall function call flow: identity function ret flow: identity function summary and normal flow: advance automata.

The program structure is like the following.


//point 1
if(!callSomeFunction()){
    ......
}

void callSomeFunction(){
    //point 2
}

I have checked the dataflow results. At point 2, data flow facts are correct. But point 1 has no data flow facts. Since this is backward analysis. It seems like getRetFlowFunction or getCallFlowFunction kills all facts. But I use identity function for both getRetFlowFunction or getCallFlowFunction. I don't know whether this is a bug. By the way, it seems that f-IDESolverStrategy has not been merged in to main branch.