Open Symen opened 7 years ago
Hi Symen,
Good question!
The differences between your two tests are due to our inter-procedural Mod-Ref analysis. Please refer to line 346-369 in MemRegion.cpp (https://github.com/unsw-corg/SVF/blob/master/lib/MSSA/MemRegion.cpp).
Any global objects and the objects they point to inside a callee function, say F, will be promoted as a side-effect to the callsites that call F (transitively along the call chain).
We treat global objects (e.g., "global_p1" ) and objects (e.g., "local_x") on its Andersen's points-to graph conservatively in a context-insensitive manner, so in your case, "local_x" will be put as a side-effect before the two "init" callsites.
You may wish to take a look at the "SVFG_before_opt" file to see those side-effect objects inside the yellow rectangle boxes.
Thanks
Thanks for the explanation!
Would a context-sensitive SVFG be possible or is there another data-structure in order to check, if such a path (like in the example above) is possible?
The current Mod-Ref analysis supports partial context-sensitivity. Full context-sensitive Mod-Ref analysis, including handling globals, requires a context-sensitive pointer analysis. SVF does not support it yet. You may wish to write one by leveraging the existing code in MemRegion.cpp.
Another way is to write a pass to refine the SVFG e.g., implementing an optimization pass like SVFGOPT.
We have a recent paper describing how to refine the spurious value-flows. Please refer to http://www.cse.unsw.edu.au/~corg/supa/.
I am not very experienced with the implementation of static analyses, but I will give it a try. Which of the two approaches would you prefer? My guess would be that the leverage of the MemRegion.cpp might be a bit easier. Or am I wrong?
Great paper by the way! This should be helpful.
Either writing a pass before (MemRegion.cpp) or after (SVFGOPT.cpp) creating SVFG is doable.
If you have already taken a look at the Mod-Ref analysis, MemRegion.cpp is a good starting point.
If you still want to optimize the graph once it is built, such as removing edges and nodes, then SVFGOPT.cpp is also a good example to follow.
Good luck!
Unfortunately, I had no time for it the last weeks, but I tried to understand the code in detail. As you mentioned, the framework requires a context-sensitive pointer analysis. I found the CondPTAImpl class (PointerAnalysis.h) which seems to be an approach of a context-sensitive analysis. Wouldn't it be necessary to leverage the Andersen's pointer analysis as you described in this paper Making context-sensitive inclusion-based pointer analysis practical for compilers using parameterised summarisation?
I also was wondering whether SVF allows the implementation of sound analyses, or if there is some implementation detail, that causes analyses to be unsound in principle?
Thanks again
On context-sensitivity
Yes, CondPTAImpl is designed for context- or path-sensitive analysis. Every pointer or abstract object is qualified by a condition e.g., (call stack or bdd for path-sensivitiy). To understand more about how to use the data structures in CondPTAImpl, you may wish to refer to “DDA/ContextDDA.h/cpp” in SUPA’s implementation.
Our whole-program context-sensitive inclusion-based pointer analysis was originally implemented in Open64 compiler. We don’t have time to migrate it to SVF:) But the SPE paper gives all the algorithms and evaluation details of the analysis. Yes, it is definitely possible to implement one in SVF.
On soundness
Short answer: Yes, SVF allows sound analysis if every LLVM instruction (relevant to pointer analysis) is modeled in a conservative way.
Long story: soundness in static analysis generally means over-approximations of a program’s behavior. Many research papers claim their approaches are sound by modeling and analyzing the programs (e.g., in the form of inference rules) written in their self-defined tiny languages. However, under real-world setting, a tiny language may be only a subset of a modern intermediate representation, like LLVM IR.
To soundly model the program behavior, SVF need to handle as many instructions as possible (refer to https://github.com/unsw-corg/SVF/wiki/Technical-documentation#123-pagbuilder) to generate high-level representations (e.g., graphs). Another challenge for sound analysis is to model the side-effects of library calls (refer to https://github.com/unsw-corg/SVF/blob/master/lib/Util/ExtAPI.cpp). Of course, C programs which are not ANSI-compliant also pose challenge to soundness (e.g., pointer arithmetic underflow/overflow).
Hi, I'm currently working with your awesome SVF analysis. During some tests with the SVFG, I found a simple test-case that doesn't make sense to me.
Why does the SVFG in Test1 has a path from "local_x" into the second "init" call?
These are the result I get with: saber -leak -dump-svfg main.bc SVFG of Test1:
SVFG of Test2: