secure-software-engineering / phasar

A LLVM-based static analysis framework.
Other
933 stars 140 forks source link

Dataflow facts disappear when using PropagateOntoStrategy in backward analysis #729

Open yuffon opened 3 months ago

yuffon commented 3 months ago

Hi, Recently, I meet one issue when using PropagateOntoStrategy (branch f-IDESolverStrategy) for backward analysis.

My backward dataflow analysis is a specialized typestate analysis. Flow functions are as follows.

call-to-return: killall function call flow: identity function ret flow: identity function summary and normal flow: advance automata.

The program structure is like the following.


//point 1
if(!callSomeFunction()){
    ......
}

void callSomeFunction(){
    //point 2
}

I have checked the dataflow results. At point 2, data flow facts are correct. But point 1 has no data flow facts. Since this is backward analysis. It seems like getCallFlowFunction kills all facts. But I use identity function for both getRetFlowFunction or getCallFlowFunction. I don't know whether this is a bug.

By the way, it seems that f-IDESolverStrategy has not been merged in to main branch.

yuffon commented 2 months ago

Hi, I have debugged Phasar line by line and found why. In my target program, the main function has exit(0) as exit points. I firstly found that Phasar does not support exit() as exit points when calling getAllExitPoints(). So in my backward analysis, I set all exit() statements as initial seeds. This gives a good start for dataflow analysis. Everything works well until the Phase II(ii) of IDE algorithm. At line 1154 of IDESolverImpl.h,

const auto &AllNonCallStartNodes = self().getAllValueComputationNodes();
self().valueComputationTask(AllNonCallStartNodes);

Phasar computes values for all non call or start nodes. AllNonCallStartNodes also includes all exit() instructions in main. But in valueComputationTask(), line 1239, ICF->getStartPointsOf(ICF->getFunctionOf(n) gets nothing when n is exit() instruction. So, only treating exit() as seed for backward analysis is not enough.

For backward analysis, we should tag all exit() as start points of main function when exit() is used as seed.

fabianbs96 commented 2 months ago

Hi @yuffon, that is indeed a problem, but I am not sure that it is a good idea to treat exit() as a start point of a function in the backwards icfg: If entering a function from a call-site, we will never reach this stmt. Can you try setting all exit()-calls as seeds for your analysis?

yuffon commented 1 month ago

Hi @yuffon, that is indeed a problem, but I am not sure that it is a good idea to treat exit() as a start point of a function in the backwards icfg: If entering a function from a call-site, we will never reach this stmt.

I mean, it looks like that we can treat exit() as a start point just for main function. For other functions called in ICFG, it is no need to do this because Phase II(ii) of IDE algorithm would compute edge functions for these functions.