secure-software-engineering / phasar

An LLVM-based static analysis framework.

Handling getCalledFunction() method #631

Closed BowenZhang-UST closed 1 month ago

BowenZhang-UST commented 1 year ago

Bug description

Dear developers, in lib/PhasarLLVM/DataFlow/IfdsIde/Problems/IDESecureHeapPropagation.cpp, line 88, the return value of getCalledFunction() is not checked for null. This may cause a crash when the call site is an inline assembly call or a call through a function pointer, because getCalledFunction() returns null for such calls.

Could you have a look, thanks!

Context (Environment)

The commit I use is the latest one on the default branch.

Possible solution

Change Line 89 to:

  // getName() returns an llvm::StringRef, which cannot be assigned to an llvm::StringLiteral.
  llvm::StringRef FName;
  if (const auto *Callee = CS->getCalledFunction()) {
    FName = Callee->getName();
  }
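
The null check proposed above can be tried in isolation. The following is a minimal sketch, not the PhASAR code: `Function` and `CallBase` are hypothetical stand-ins for the LLVM classes of the same name, so that the snippet compiles without LLVM, and `calleeNameOrEmpty` is an illustrative helper name. It only models the fact that getCalledFunction() may return null.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-ins for llvm::Function and llvm::CallBase.
// Only the "callee may be null" behaviour is modelled here.
struct Function {
  std::string Name;
  std::string_view getName() const { return Name; }
};

struct CallBase {
  // nullptr models an indirect call or an inline assembly call.
  const Function *Callee = nullptr;
  const Function *getCalledFunction() const { return Callee; }
};

// Returns the callee's name, or an empty view if no direct callee is known.
std::string_view calleeNameOrEmpty(const CallBase &CS) {
  if (const Function *Callee = CS.getCalledFunction()) {
    return Callee->getName();
  }
  return {};
}
```

With this shape, a call site without a direct callee yields an empty name instead of dereferencing a null pointer, so later name comparisons simply do not match.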
fabianbs96 commented 1 year ago

Hi @BowenZhang-UST, thank you for reporting this issue. You are right, this part of the code is indeed erroneous. We will fix it when we find the time; in the meantime, you are welcome to submit a pull request.

yuffon commented 8 months ago

Hi @fabianbs96, how does PhASAR deal with function pointers? I see that some frameworks, such as SVF, can run a points-to analysis and refine the ICFG with the resolved function pointers. Does PhASAR simply ignore call edges through function pointers?

fabianbs96 commented 8 months ago

Hi @yuffon, PhASAR supports indirect calls via function pointers and vtables. The IDESolver automatically provides the resolved callees/callee-candidates as additional parameters to the respective functions. So, the functions getCallToRetFlowFunction and getCallToRetEdgeFunction receive an additional llvm::ArrayRef<f_t> containing all functions that may be called at the given call-site. Also, the functions getCallFlowFunction and getCallEdgeFunction receive a single callee as a parameter and are called automatically (potentially multiple times), once for each callee candidate.
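
To illustrate the calling pattern described above, here is a toy sketch. It is not the IDESolver: `ToyProblem` and `processCall` are hypothetical names, `f_t` is modelled as a plain string, and a `std::vector` stands in for `llvm::ArrayRef<f_t>`. The real functions take further parameters and return flow functions; only the "once per callee" versus "all callees at once" distinction is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: in PhASAR, f_t is the analysis' function type.
using f_t = std::string;

struct ToyProblem {
  std::vector<f_t> CallFlowSeen;        // callees received one at a time
  std::size_t CallToRetCalleeCount = 0; // size of the list received at once

  // Same shape as getCallFlowFunction: a single callee per invocation.
  void getCallFlowFunction(const f_t &Callee) {
    CallFlowSeen.push_back(Callee);
  }

  // Same shape as getCallToRetFlowFunction: all callee candidates together.
  void getCallToRetFlowFunction(const std::vector<f_t> &Callees) {
    CallToRetCalleeCount = Callees.size();
  }
};

// Toy driver standing in for the solver's handling of one call site.
void processCall(ToyProblem &P, const std::vector<f_t> &ResolvedCallees) {
  for (const f_t &Callee : ResolvedCallees) {
    P.getCallFlowFunction(Callee); // invoked once per callee candidate
  }
  P.getCallToRetFlowFunction(ResolvedCallees); // invoked once with the full list
}
```

For an indirect call site with two candidates, the call flow function runs twice while the call-to-return function runs once and sees both candidates.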

How these indirect calls are resolved depends on the ICFG implementation. Currently, the LLVMBasedICFG supports CHA, RTA, or OTF (which uses alias information) to resolve such indirect call sites. You can control this by setting the CallGraphAnalysisType ctor parameter of LLVMBasedICFG; the default is OTF. The algorithm for determining alias information in OTF mode depends on the LLVMAliasInfoRef that you can pass as an additional parameter to the ctor; the implementation we currently provide, LLVMAliasSet, builds upon LLVM's built-in alias analyses.

Does this answer your question?