SVF-tools / SVF

Static Value-Flow Analysis Framework for Source Code
http://svf-tools.github.io/SVF/

Can we avoid node from collapsing to field-insensitive #60

Open YueCao opened 6 years ago

YueCao commented 6 years ago

Hi, I'm using the default AndersenWaveDiff solver to run pointer analysis on a C program, but the result is imprecise because many nodes collapse to field-insensitive. From the code, this seems to be related to PWC nodes. Could you tell me more about PWC nodes, e.g., when one is created? Also, can we prevent nodes from collapsing to field-insensitive, or is there another way to make the result more precise? Thanks!

YueCao commented 6 years ago

Hi, please correct me if I'm wrong. From reading the code, I found two cases in which a node collapses to field-insensitive:

  1. variable GEP edge
  2. if it's a PWC Node

The first case is self-explanatory, but I still don't understand the second. For example, if there are GEP edges inside an SCC, why should the points-to set be collapsed to field-insensitive? It would be great if you could give me an example. I'm not sure I understand PWC correctly, since I could not find an explanation of it online. Thanks in advance.

yuleisui commented 6 years ago

Hi Yue,

It is good that you read the code in depth.

Field-sensitive analysis of a struct object may become field-insensitive due to constraint cycles (PWC, positive weight cycles).

For example:

    p = q
    q = p gep 1

The LLVM IR above has a constraint cycle with one copy edge from q to p and one gep edge from p to q.

To trade precision for efficiency and avoid infinite derivations, we set the objects that p points to as field-insensitive. This is a standard way to handle field-sensitivity.
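To make the "infinite derivations" point concrete, here is a minimal toy sketch in C. It is not SVF code: the bitmask representation, the function names, and the round limit are all assumptions made for illustration. It solves the two constraints p = q and q = p gep 1 twice, once keeping fields apart and once treating the object as a single collapsed location.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical, not SVF code): a points-to set over the fields of
 * ONE struct object, stored as a bitmask where bit i means "obj.f_i". */
typedef uint64_t FieldSet;

/* Field-sensitive gep: shift every field index up by `offset`. */
static FieldSet gep_sensitive(FieldSet s, unsigned offset) {
    return s << offset;
}

/* Solve  p = q;  q = p gep 1  field-sensitively, starting from p -> obj.f0.
 * Returns the round in which a fixpoint was detected, or -1 if none was
 * reached within max_rounds. The final set for p is written to *p_out. */
static int solve_field_sensitive(int max_rounds, FieldSet *p_out) {
    assert(max_rounds > 0 && max_rounds < 63); /* keep shifts inside 64 bits */
    FieldSet p = 1, q = 0;
    int result = -1;
    for (int round = 1; round <= max_rounds; ++round) {
        FieldSet new_q = q | gep_sensitive(p, 1); /* q = p gep 1 */
        FieldSet new_p = p | new_q;               /* p = q       */
        if (new_p == p && new_q == q) { result = round; break; }
        p = new_p;
        q = new_q;
    }
    *p_out = p;
    return result;
}

/* Same constraints after collapsing: the object is one abstract location
 * (value 1), and a gep on it yields the same location again. */
static int solve_field_insensitive(int max_rounds, int *p_out) {
    assert(max_rounds > 0);
    int p = 1, q = 0;
    int result = -1;
    for (int round = 1; round <= max_rounds; ++round) {
        int new_q = q | p;     /* gep on a collapsed object: no new target */
        int new_p = p | new_q;
        if (new_p == p && new_q == q) { result = round; break; }
        p = new_p;
        q = new_q;
    }
    *p_out = p;
    return result;
}
```

In this toy model, calling solve_field_sensitive with any round limit never reports a fixpoint, because every trip around the cycle derives one more field, while solve_field_insensitive stabilises on its second round. A real solver differs in many details; the sketch only shows why a gep edge with a positive offset inside a cycle keeps producing new targets unless the object is collapsed.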

The two cases you point out are two scenarios when a node becomes a PWC node in a constraint cycle.

A PWC node is field-insensitive in our case. Any precision improvement you can think of?

YueCao commented 6 years ago

Thanks for your explanation, Yulei. I understand your IR example, but it's a little hard to reproduce from source code, since the IR is in SSA form. If it's convenient, could you also show me a source code example? Thanks very much.

yuleisui commented 6 years ago

You may wish to take a look at the micro-benchmarks. We had a few cycle examples.

Hope this one can help you understand. https://github.com/SVF-tools/PTABen/blob/master/basic_c_tests/constraint-cycle-pwc.c
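As a self-contained starting point alongside the benchmark, the following is a hypothetical C function whose pointer assignments have the same shape as p = q and q = p gep 1. It is not the PTABen test, the names Node and walk are invented here, and whether SVF reports it as a PWC has not been checked.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
};

/* Hypothetical example (not the PTABen test). Inside the loop:
 *   q = &((struct Node *)p)->next   is a gep from p with field offset 1
 *   p = q                           is a copy from q back to p
 * so the loop closes a cycle that contains a positive-offset gep.
 * At run time only n <= 1 is meaningful: a second iteration would step
 * past the object. The function exists to be analysed, not to do work. */
void *walk(struct Node *head, int n) {
    assert(head != NULL);
    void *p = head;
    for (int i = 0; i < n; ++i) {
        struct Node **q = &((struct Node *)p)->next;
        p = q;
    }
    return p;
}
```

The cast through void * is what lets the address of a field flow back into the pointer the field was taken from; without it the types would not line up and the cycle would be harder to write in plain C.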

YueCao commented 6 years ago

Hi Yulei,

To improve precision with respect to the field-collapsing issue, one simple idea is to use type information to filter out unrelated results during alias analysis. I guess this is what the AndersenWaveDiffWithType solver implements, right? But you mentioned in #48 that the AndersenWaveDiffWithType solver may not be conservative for C programs. One possible reason is that type information cannot handle list retrieval well. Are there any other such cases? Please correct me if I have misunderstood anything. Thanks!

BTW, when I run your micro-benchmark with SVF, the statistics show 2 NumOfSCCDetect, 1 TotalCycleNum, but 0 TotalPWCCycleNum. Is this result right?

yuleisui commented 6 years ago

"AndersenWaveDiffWithType" is an improvement over "AndersenWaveDiff" to increase precision. Its implementation is relatively easy. It is conservative for C but may not be C++. You may wish to continue to improve "AndersenWaveDiffWithType" if you like.

For the small PWC case, I think you should not optimise the bc (don't use -mem2reg) before analyzing it; otherwise the PWC will be optimised away. Following our case and concept, you can also write your own tests for PWC.

YueCao commented 6 years ago

I see. From checking the code, the AndersenWaveDiffWithType solver has nothing to do with the field-collapsing issue. Do you think it would be a good idea (i.e., still conservative) to keep separate type information for each field? This type information would not be merged during field collapsing, so it could be used to filter out objects of different types that are introduced by the other fields of the same struct.
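The proposal can be sketched roughly as follows. Everything here is an assumption made for illustration: the FieldType tags, the FieldTarget record, and filter_by_type are invented names and do not correspond to SVF's PTAType interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-field type tags that survive field collapsing. */
enum FieldType { TY_INT, TY_DATA_PTR, TY_FUNC_PTR };

/* One target in the merged points-to set of a collapsed object, still
 * carrying the type of the field it originally came from. */
struct FieldTarget {
    int obj_id;
    enum FieldType type;
};

/* Copies into `out` the targets whose tag equals `wanted` and returns how
 * many were kept. `out` must have room for `n` entries. */
size_t filter_by_type(const struct FieldTarget *in, size_t n,
                      enum FieldType wanted, struct FieldTarget *out) {
    assert(in != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; ++i) {
        if (in[i].type == wanted) {
            out[kept++] = in[i];
        }
    }
    return kept;
}
```

For example, filtering a merged set that holds an int field, a function-pointer field, and a data-pointer field with TY_FUNC_PTR would keep only the function-pointer target. Whether such a filter stays conservative is exactly the open question in this thread, since C casts can store a value under a type different from the field's declared one.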

yuleisui commented 6 years ago

It is doable. You may wish to take a look at PTAType for some extension if you like: https://github.com/SVF-tools/SVF/blob/3038078e90eb037ea43aa28ff3a28c05d631be5d/include/MemoryModel/PTAType.h