Open yuffon opened 1 year ago
Hi @yuffon, yes that sounds like a bug. Can you provide the analysis code, such that we can debug it?
My project relies on other components. I am constructing a simple case and will report it later.
Hi @fabianbs96, I am isolating my typestate analysis from the other parts of my project so that it can run independently. In the meantime, I found something that may be useful.
My flow functions are designed as follows:
- Normal flow function: sensitive events -> push the finite state automaton; other cases -> identity flow function.
- Call flow function: sensitive events (function calls that are themselves sensitive events) -> kill all facts; other cases -> identity (the data flow enters normal functions).
- Call-to-ret flow function: sensitive events -> push the automaton; other cases -> kill all facts (the data flows are processed by the function body).
- Return flow function: identity flow function.
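The design listed above can be restated as a small framework-independent sketch. This is not PhASAR code: the `Fact` type, the event names, the `SENSITIVE` set, and the transition table are all hypothetical and exist only to make the four cases concrete. The zero fact Λ is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Hypothetical data-flow fact: a tracked object and its automaton state."""
    obj: int
    state: int


# Assumed for illustration only: (state, event) -> next state.
TRANSITIONS = {(0, "free"): 1, (1, "use"): 2}
SENSITIVE = {"free", "use"}


def identity(facts):
    return set(facts)


def kill_all(facts):
    return set()


def push_automaton(event):
    """Advance every fact along the automaton; unknown pairs keep their state."""
    def flow(facts):
        return {Fact(f.obj, TRANSITIONS.get((f.state, event), f.state))
                for f in facts}
    return flow


def normal_flow(event):
    return push_automaton(event) if event in SENSITIVE else identity


def call_flow(callee):
    # Sensitive calls are modeled at the call site, so nothing enters them.
    return kill_all if callee in SENSITIVE else identity


def call_to_ret_flow(callee):
    # Non-sensitive calls: facts are expected to travel through the callee body.
    return push_automaton(callee) if callee in SENSITIVE else kill_all


def return_flow(callee):
    return identity
```

With this design, a fact survives an ordinary call only if it travels through the callee and back out via the return flow function, because `call_to_ret_flow` kills it at the call site.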
Currently, these flow functions work well with the PropagateOver strategy. But with the PropagateOnto strategy, the data-flow facts vanish after the function call in the above example.
However, if I change the call-to-ret flow function as follows:
- Call-to-ret flow function: sensitive events -> push the automaton; other cases -> identity (changed here, from kill-all to identity).
In the above example, the dataflow results after calling the use_after_free function include two facts:
N: store i32 0, i32* %retval, align 4, !psr.id !19 | ID: 15
-----------------------------------------------------------
D: Fact :{ Obj:11,state:0 } | V: TOP
D: Λ | V: TOP
N: %call = call i32 @use_after_free(), !psr.id !20 | ID: 16
-----------------------------------------------------------
D: Λ | V: TOP
D: Fact :{ Obj:11,state:0 } | V: TOP
D: Fact :{ Obj:11,state:3 } | V: TOP
N: store i32 %call, i32* %i, align 4, !psr.id !21 | ID: 17
----------------------------------------------------------
D: Λ | V: TOP
D: Fact :{ Obj:11,state:0 } | V: TOP
D: Fact :{ Obj:11,state:3 } | V: TOP
These results conform to the flow functions.
Hi @yuffon, thanks for the details. Based on your description, I have some comments on the handling of function calls (not sure whether you already handle it like this):
- The call-flow function is responsible for mapping the arguments from the call-site to the formal parameters of the called function. Globals are usually passed as identity and all other facts are killed.
- The return-flow function basically computes the inverse of the call-flow function: It maps the returned value to the call-site and maps all pointer-parameters (that may have changed within the callee) back to the respective arguments at the call-site. All other facts are killed.
- The call-to-ret flow function then handles the propagation of all facts that are unrelated to the call (and all non-pointer arguments due to call-by-value) via identity and kills all others. Sometimes, you also perform special computations here, but usually that's done by the summary-flow function.
- The summary-flow function is used to model the effects of special functions on the dataflow facts. This can be used for handling source/sink functions in a taint analysis, or modeled API functions in a typestate analysis (to advance in the automaton). Note that the call-flow function will not be called if you return a non-null flow function from getSummaryFlowFunction, and the analysis of the called function will be skipped; in such cases, this is usually what you want.
From your description, I cannot see whether you do the parameter mapping correctly in your flow functions. Otherwise, your approach sounds reasonable.
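The parameter mapping described in the list above can be sketched in a framework-independent way. This is a toy model, not PhASAR's API: facts are plain value names, every function name and parameter below is hypothetical, and routing globals through the callee (rather than through call-to-ret) is an assumption made here so that they are neither lost nor duplicated.

```python
def call_flow(args, params, global_names):
    """Caller -> callee: map actual arguments to formals, keep globals."""
    def flow(fact):
        if fact in global_names:
            return {fact}
        # An actual may be passed for several formals, e.g. f(p, p).
        return {formal for actual, formal in zip(args, params)
                if actual == fact}
    return flow


def return_flow(args, params, pointer_params, global_names,
                returned, call_result):
    """Callee -> caller: roughly the inverse of call_flow."""
    def flow(fact):
        out = set()
        if fact in global_names:        # assumption: globals return via callee
            out.add(fact)
        if fact == returned:            # returned value -> call-site result
            out.add(call_result)
        # Only pointer parameters can carry changes back to the caller.
        out |= {actual for actual, formal in zip(args, params)
                if formal == fact and formal in pointer_params}
        return out
    return flow


def call_to_ret_flow(pointer_args, global_names):
    """Pass facts the callee cannot change; kill those it handles itself."""
    def flow(fact):
        if fact in pointer_args or fact in global_names:
            return set()
        return {fact}                   # unrelated facts, by-value arguments
    return flow
```

For a call such as `%call = call @f(%p, %n)` with a pointer formal `%ptr` and a by-value formal `%len`, a fact on `%p` enters the callee as `%ptr` and is mapped back to `%p` on return, while a fact on `%n` bypasses the callee through the call-to-ret edge.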
Thanks a lot for your detailed comments. I used a simple and lazy implementation. I will revise it and check again.
I have an issue with the PropagateOnto strategy. If I use the default PropagateOver strategy, my typestate code works well. But when using the PropagateOnto strategy, some dataflow facts vanish after invoking a function.
Basically, I use a finite state machine that describes that pointers must not be used after being freed. The analyzed code is
The IR file is
The dumped results with the PropagateOver strategy are
Everything is OK up to this point. But with the PropagateOnto strategy, the dumped results are
As shown above, the last two instructions after the function call have no dataflow facts. It seems that the dataflow facts do not come back out of the invoked function. In fact, I use the identity flow function for all return edges. Besides, the default PropagateOver strategy works well, so the flow functions may not be at fault.
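For readers without the original attachments, a use-after-free automaton of the kind mentioned above might look like the following sketch. The state names, their numbering, and the event names are hypothetical and do not correspond to the `state:0` / `state:3` values in the dumps of this thread.

```python
from enum import Enum


class State(Enum):
    ALLOCATED = 0
    FREED = 1
    ERROR = 2    # use after free detected


# Transition function; any pair not listed leaves the state unchanged,
# which also makes ERROR absorbing.
DELTA = {
    (State.ALLOCATED, "free"): State.FREED,
    (State.FREED, "use"): State.ERROR,
}


def step(state, event):
    return DELTA.get((state, event), state)


def run(events, start=State.ALLOCATED):
    """Feed a sequence of events through the automaton."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

In a typestate analysis, each data-flow fact pairs a tracked object with one such state, and a flow function for a sensitive event applies `step` to the facts it receives.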