secure-software-engineering / phasar

An LLVM-based static analysis framework.

Phasar does not support exit() as exit points when calling getAllExitPoints() #728

Open yuffon opened 6 days ago

yuffon commented 6 days ago

I am using an old branch f-IDESolverStrategy because I need PropagateOntoStrategy in my project. (https://github.com/secure-software-engineering/phasar/tree/f-IDESolverStrategy)

I recently found that Phasar does not treat a call to exit() as an exit point of a function.

For example, given this source code:


#include <stdlib.h>
#include <stdio.h>

int main(){
    int a = 0;
    a ++;
    printf("exit\n");

    exit(0);
}

The corresponding LLVM IR is:

; ModuleID = 'main-exit.c'
source_filename = "main-exit.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@.str = private unnamed_addr constant [6 x i8] c"exit\0A\00", align 1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  store i32 0, i32* %2, align 4
  %3 = load i32, i32* %2, align 4
  %4 = add nsw i32 %3, 1
  store i32 %4, i32* %2, align 4
  %5 = call i32 (i8*, ...) @printf(i8* noundef getelementptr inbounds ([6 x i8], [6 x i8]* @.str, i64 0, i64 0))
  call void @exit(i32 noundef 0) #3
  unreachable
}

declare dso_local i32 @printf(i8* noundef, ...) #1

; Function Attrs: noreturn nounwind
declare dso_local void @exit(i32 noundef) #2

attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #1 = { "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #2 = { noreturn nounwind "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #3 = { noreturn nounwind }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{!"clang version 14.0.6"}

The code using Phasar is as follows:

  // Analyze the module given on the command line, starting at main.
  std::vector EntryPoints = {"main"s};

  HelperAnalyses HA(Argv[1], EntryPoints);
  if (!HA.getProjectIRDB().isValid()) {
    return 1;
  }

  if (const auto *F = HA.getProjectIRDB().getFunctionDefinition("main")) {
    llvm::outs() << F->getName().str();
    // Expected to visit the call to exit(), but the loop body never runs.
    for (auto inst : getAllExitPoints(F)) {
      llvm::outs() << inst->getName().str();
    }
  }

getAllExitPoints() returns nothing. As a workaround, I currently scan all instructions in the function and tag every call to exit() as an exit point.

fabianbs96 commented 4 days ago

Hi @yuffon, this is actually intended behavior. We use getAllExitPoints (and isExitInst) to find the locations from which data flows have to be mapped back to the callers, which is the case for ret and resume instructions. For exit(), abort(), and similar functions, the program ends more or less immediately, so data flows never reach the callers from there. I see that the naming is confusing. Maybe we can add another function that covers these cases as well.
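
To make the distinction concrete, here is a minimal, self-contained sketch. It does not use the PhASAR or LLVM APIs: Op, Inst, Function, isReturnLikeExit, isTerminationPoint, collect, and makeIssueMain are all hypothetical names made up for illustration. The first predicate models the behavior described above (only ret and resume count), and the second models one possible "covers these cases as well" variant that also accepts calls to noreturn functions. In real LLVM code, the noreturn check would roughly correspond to llvm::CallBase::doesNotReturn().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an LLVM instruction. NOT a PhASAR or LLVM type.
enum class Op { Ret, Resume, Call, Unreachable, Other };

struct Inst {
  Op Opcode;
  std::string Callee;  // only meaningful for Op::Call
  bool CalleeNoReturn; // models the `noreturn` attribute on the callee
};

using Function = std::vector<Inst>; // instructions in program order

// Models the described behavior: only ret and resume hand control
// (and therefore data flows) back to a caller.
bool isReturnLikeExit(const Inst &I) {
  return I.Opcode == Op::Ret || I.Opcode == Op::Resume;
}

// Hypothetical wider notion: additionally accept calls to noreturn
// functions such as exit() or abort(), where the program ends.
bool isTerminationPoint(const Inst &I) {
  return isReturnLikeExit(I) || (I.Opcode == Op::Call && I.CalleeNoReturn);
}

// Collects pointers to all instructions of F that satisfy Pred.
std::vector<const Inst *> collect(const Function &F,
                                  bool (*Pred)(const Inst &)) {
  assert(Pred != nullptr && "a predicate is required");
  std::vector<const Inst *> Out;
  for (const Inst &I : F) {
    if (Pred(I)) {
      Out.push_back(&I);
    }
  }
  return Out;
}

// Reduced model of the main() from the issue: some ordinary
// instructions, a call to printf, a call to exit, then unreachable.
Function makeIssueMain() {
  return {
      {Op::Other, "", false},
      {Op::Other, "", false},
      {Op::Call, "printf", false},
      {Op::Call, "exit", true},
      {Op::Unreachable, "", false},
  };
}
```

On this model, collect(makeIssueMain(), isReturnLikeExit) is empty, which matches the report, while collect(makeIssueMain(), isTerminationPoint) yields the single call to exit. Whether such a wider function should also include unreachable instructions is a design choice this sketch leaves open.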

yuffon commented 2 days ago

OK. Thanks. I need all exit points of the main function to use as seed facts in a backward data-flow analysis.