Closed tkremenek closed 6 years ago
We're doing pretty well here; I guess the CFG was fixed by Manuel Klimek in 2014.
I attached the current CFG.svg for easier reading.
The CFG contains all three temporary destructors and one automatic destructor. There's an extra correlated branch that controls conditional execution of ~B(), which makes the CFG look like a triple diamond instead of a double diamond. The analyzer understands such branching by remembering which temporaries were created in a special path-sensitive state trait.
Additionally, I made sure that copy elision is marked up in the CFG: the construct-expression at [B5.2] contains a forward reference to the future actual constructor at [B5.8], along with a heads-up about the future bind-temporary at [B5.5] and materialize-temporary at [B5.7]. The other constructors are not elidable, so they only have forward references to the bind/materialize-temporary (for A(x) and B(y)) and to the variable declaration (for 'C b'). The analyzer was taught to make use of this information.
There are other problems of this sort that were not addressed, e.g. the GNU binary ?: operator is broken, but && and || seem fine.
Additionally, the size of the CFG seems to grow linearly, and every temporary destructor appears in the CFG exactly once; I didn't look deeply into this, but it seems to be the case. The automatic destructor for 'C b' in our example appears twice (depending on which branch of the if-statement is taken), but this doesn't seem to be exploitable. I'm attaching a CFG-chained-destructors.svg in which the initializer for 'C b' is replaced with (A(1) && A(2)) || (A(3) && A(4)) || (A(5) && A(6)) || (A(7) && A(8)).
==================
Raw dump of the current CFG for easier comparison:
int test(int x, int y)
 [B8 (ENTRY)]
   Succs (1): B7

 [B1]
   1: [B5.9].~C() (Implicit destructor)
   2: bar
   3: [B1.2] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
   4: [B1.3]()
   5: return [B1.4];
   Preds (1): B3
   Succs (1): B0

 [B2]
   1: foo
   2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
   3: [B2.2]()
   4: return [B2.3];
   5: [B5.9].~C() (Implicit destructor)
   Preds (1): B3
   Succs (1): B0

 [B3]
   1: ~A() (Temporary object destructor)
   2: b
   3: [B3.2].operator bool
   4: [B3.2]
   5: [B3.4] (ImplicitCastExpr, UserDefinedConversion, _Bool)
   T: if [B3.5]
   Preds (2): B4 B5
   Succs (2): B2 B1

 [B4]
   1: ~B() (Temporary object destructor)
   Preds (1): B5
   Succs (1): B3

 [B5]
   1: [B7.9] || [B6.9]
   2: [B5.1] (ImplicitCastExpr, IntegralCast, int)
   3: [B5.2] (CXXConstructExpr, [B5.5], [B5.7], [B5.8], class C)
   4: [B5.3] (ImplicitCastExpr, ConstructorConversion, class C)
   5: [B5.4] (BindTemporary)
   6: [B5.5] (ImplicitCastExpr, NoOp, const class C)
   7: [B5.6]
   8: [B5.7] (CXXConstructExpr, [B5.9], class C)
   9: C b = A(x) || B(y);
   10: ~C() (Temporary object destructor)
   T: (Temp Dtor) [B6.4]
   Preds (2): B6 B7
   Succs (2): B4 B3

 [B6]
   1: y
   2: [B6.1] (ImplicitCastExpr, LValueToRValue, int)
   3: [B6.2] (CXXConstructExpr, [B6.4], [B6.6], class B)
   4: [B6.3] (BindTemporary)
   5: B([B6.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
   6: [B6.5]
   7: [B6.6].operator bool
   8: [B6.6]
   9: [B6.8] (ImplicitCastExpr, UserDefinedConversion, _Bool)
   Preds (1): B7
   Succs (1): B5

 [B7]
   1: x
   2: [B7.1] (ImplicitCastExpr, LValueToRValue, int)
   3: [B7.2] (CXXConstructExpr, [B7.4], [B7.6], class A)
   4: [B7.3] (BindTemporary)
   5: A([B7.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
   6: [B7.5]
   7: [B7.6].operator bool
   8: [B7.6]
   9: [B7.8] (ImplicitCastExpr, UserDefinedConversion, _Bool)
   T: [B7.9] || ...
   Preds (1): B8
   Succs (2): B5 B6

 [B0 (EXIT)]
   Preds (2): B1 B2
My back-of-the-napkin calculations show that the amount of duplication of each destructor set should be linear if we want to expand all control flow. Consider:
e_1 && e_2 && ... && e_N
For each expression, e_i, we have a set of associated destructors, Destructors(e_i). The question is how many times will Destructors(e_i) be duplicated in the CFG?
Suppose we are looking at e_i, and we can take either the false branch or the true branch. If we take the true branch, we move on to evaluate e_{i+1}. If we take the false branch, we need to evaluate Destructors(e_1), Destructors(e_2), ..., Destructors(e_i), in reverse order.
Inductively, Destructors(e_1) will be evaluated in N places (or N+1, depending on how we want to handle the last expression, e_N). Similarly, Destructors(e_2) will be evaluated in N-1 places, Destructors(e_3) in N-2 places, etc. Assuming the cost of evaluating each destructor set is the same, we're looking at the arithmetic series:
N + (N-1) + (N-2) + ... + 1
or:
N(N+1)/2
which is O(N^2).
Thus we are looking at quadratic complexity, in the worst case.
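To make the preceding argument concrete, here is a minimal sketch; struct A is a stand-in with a destructor and a bool conversion, in the spirit of the classes in the original test case at the bottom of this report:

// e_1 && e_2 && e_3, where each temporary A(i) contributes one
// destructor to Destructors(e_i). Temporaries die at the end of the
// full-expression, in reverse construction order:
//   false exit after A(1): run ~A for A(1)
//   false exit after A(2): run ~A for A(2), then A(1)
//   path through A(3):     run ~A for A(3), A(2), A(1)
// Fully expanding these paths duplicates Destructors(e_1) in N places,
// Destructors(e_2) in N-1 places, etc.; that is the series above.
struct A {
  explicit A(int x) : v(x) {}
  ~A() {}
  operator bool() const { return v > 0; }
  int v;
};

bool test_chain() { return A(1) && A(2) && A(3); }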
I do not believe this is true. I think you can easily get exponential growth if you alternate &&'s and ||'s a bit.
Suppose I just take two expressions of the form e1 && e2 && ... && en and connect them with an ||. You get O(n) ways for the first sub-expression to evaluate to false, and another O(n) for the second one. This results in O(n^2) code paths, and each one contains O(n) destructors, so the CFG size already reaches O(n^3). If I put more of these subexpressions in the ||, I can easily reach any exponent I want...
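For a concrete sketch of the composition (same stand-in struct A as in the sketch above):

// Two '&&' groups joined by '||'. Each group has O(n) distinct false
// exits. The '||' multiplies them: O(n) ways to fall out of the first
// group times O(n) ways through the second gives O(n^2) paths, each
// carrying O(n) destructors, i.e. an O(n^3) fully expanded CFG. Each
// additional '||' group multiplies the path count by another O(n).
bool test_mixed() {
  return (A(1) && A(2) && A(3)) || (A(4) && A(5) && A(6));
}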
You've convinced me. That's a valid argument. Thus the only way to model this efficiently would be to have separate branches for the destructors. Ideally groups of destructors that would be guaranteed to be executed together would be guarded under the same branch (if a branch is needed).
It seems to me that the right solution is to model the destructor "branches" in the CFG; analyses that care will need to track predicates for which constructors were called in order to properly match the feasibility of those branches.
This is something that will naturally be handled by the static analyzer, just like other predicates.
For analyses in the frontend, we should perhaps consider a generalized dataflow worklist that keeps predicates for some dataflow facts, for the analyses that need a small amount of path-sensitivity (e.g., -Wuninitialized).
Quadratic complexity is nothing to sneeze at, but it isn't prohibitive if N is small. For example, N == 5 leads to 5(5 + 1)/2 = 15. That's probably pretty reasonable. Larger N starts to get more questionable.
We could possibly take a hybrid approach, expanding for small N and having a cleanup condition value for larger N. That would likely give higher precision to 90% of the cases.
Looks reasonable.
I'm not fully convinced: the drawback of a hybrid approach is that algorithms that use the CFG will now have to cope with two possible graphs, and are thereby not simplified at all. Also, if N == 5 is reasonable, then I think that would cover a lot more than 90% of cases, but that's my subjective experience.
I don't see how there are 2 possible graphs that an algorithm would need to cope with. Essentially I see two constructs:
(1) A terminator (indicating to take a branch to conditionally execute a destructor) and a target basic block with a destructor call.
(2) Unrolled code where there is no terminator and there is a basic block with a destructor call.
Consider that (2) is just a special case of (1) where the branch is eliminated. I don't see how this complicates algorithms at all. Algorithms will need to support (1) because that will be our fallback in the general case. (2) is just a case where the CFG is simplified/refined to not contain the branch.
My proposal of a hybrid approach doesn't complicate analysis algorithms at all. It just potentially makes them a bit smarter by having the CFG incorporate a bit of path-sensitivity where it isn't too costly. I think this would directly benefit algorithms such as -Wuninitialized, etc., which themselves shouldn't be engineered to understand predicated logic if we can help it.
Jordan convinced me in person that duplicating the CFG is a bad idea, at least for tackling this problem. In the worst case we start to get exponential duplication of the CFG, as destructors can be separated from their constructors by large amounts of code. Consider:
const A &a = flag ? existing : A();
// ... lots of stuff with control flow ...
// destructor gets called here, conditionally, for the A() temporary
We will need to model in the CFG whether or not to call the destructor for the temporary. That's unavoidable. But duplicating everything between the constructor and destructor to elide that branch is infeasible in practice.
To clarify, I was saying that the fully expanded approach looks reasonable. I don't like the hybrid approach.
That notwithstanding, the hard part is implementation. On my first try, I ran into roughly three problems. (Consider x = ex1 op ex2 op ... op exn.)
- We don't know the number of branches the BinaryOperators would end up with.
- Each component of the full expression should be unconditionally appended to each of the branches, and we need to mark the subexpression whose value should be taken as the outermost BinaryOperator's.
- Different destructors are to be appended to different branches.
One approach is to build the CFG as we currently do, and when that is done, "unfold" as many paths as wanted/needed. Another approach is to build the CFG recursively, and use a continuation-like object to specify what to generate for the successor blocks of an expression.
If it can be represented accurately in the LLVM IR at -O0, I think there shouldn't be any scalability concerns about representing this well in the CFG.
Clang CodeGen does what the current CFG does. It uses a "cleanup dest slot" or "cleanup condition value" to remember the branch that was taken, and switch on that value when it is time to do cleanups. So there is no control flow expansion in the LLVM IR.
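As a C++-level analogy of that strategy (a sketch only: real CodeGen emits LLVM IR, and the flag is an implicit slot rather than a std::optional), the conditional temporary from the test case could be lowered roughly as follows, assuming classes A and B from the test case at the bottom:

#include <optional>

bool logical_or_with_cleanups(int x, int y) {
  A a(x);               // LHS temporary: always constructed
  std::optional<B> b;   // RHS temporary: conditionally constructed
  bool result = static_cast<bool>(a);
  if (!result) {
    b.emplace(y);       // optional's engaged flag plays the role of
                        // the "cleanup condition value"
    result = static_cast<bool>(*b);
  }
  return result;
  // Single cleanup point: ~optional runs ~B only if engaged, then ~A
  // runs; no destructor blocks are duplicated.
}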
Interesting. Good to know.
Although we agreed that having multiple occurrences of an Expr in the CFG is not a blocking problem, I found a few other concerns.
First, having all control paths of a '&&' or '||' BinaryOperator explicitly generated in the CFG might not be scalable. The number of branches would grow in proportion to the number of '&&' or '||' operators embedded in a full expression. (I haven't checked whether it is exponential.)
Second, with the generation of multiple branches for a BinaryOperator, we have to append different destructors to each branch after the full expression is created in the CFG. This is not an easy problem to solve, as I found when I tried. Again, we don't know how many branches a full expression with BinaryOperators may end up with.
That said, I still would like to have correct destructors in the CFG that are not dependent on the value of some flow-control slot.
If it can be represented accurately in the LLVM IR at -O0, I think there shouldn't be any scalability concerns about representing this well in the CFG. The approach I would take is to draw this out, such as on graph paper or a whiteboard, and see where we can ideally optimize the expansion of a complex ||/&& expression. My gut is that this is going to be tricky, but that it still can be done. It seemed very tractable to reduce the size of the CFG for short-circuit operations, and make the CFG more accurate, when one doesn't consider destructors. I think destructors add a bit of extra complexity, but I don't see how, in the worst case, they wouldn't need to be duplicated more than once.
A question for my understanding: should a ProgramPoint only get a CFGStmt, or also a Stmt or Expr*? For example, CheckObjCMessageContext passes an Expr to ProgramPoint::getProgramPoint(..), which can easily be a sub-expression. Or is each (sub-)expression already a separate CFGStmt in the CFG?
For the static analyzer, the CFG is linearized. This means that every Expr* (except a few things like ParenExprs) appears in the CFG as a CFGStmt.
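For instance, for a statement like "int z = x + y;" the linearized CFG lists each subexpression as its own element, roughly (illustrative numbering):

[B1]
  1: x
  2: [B1.1] (ImplicitCastExpr, LValueToRValue, int)
  3: y
  4: [B1.3] (ImplicitCastExpr, LValueToRValue, int)
  5: [B1.2] + [B1.4]
  6: int z = [B1.5];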
If we made the change I proposed, we would likely need to move the frontend to using a linearized CFG as well. We need to evaluate the potential impact of this change to the frontend analyses.
When (and how) would you like to evaluate that? Or rephrasing: how would you like to proceed?
I am asking because to understand the impact of the changes you mentioned in comment #29, I did the changes to the ProgramPoint. However, to get it compiled, changes cascade down the whole list you mention in item (1).
Changing the frontend to use a linearized CFG is completely orthogonal to item (1). ProgramPoint is also only used by the analyzer. What I am proposing is:
(a) Make the changes to ProgramPoint, and also make the changes to the analyzer needed to pick up those changes. This is a big project, but it only impacts the static analyzer.
(b) Evaluate the compile-time cost of always linearizing the CFG in AnalysisBasedWarnings.cpp. If the cost can be made small, switch to always using a linearized CFG. Currently we only create a linearized CFG for the static analyzer, but not frontend warnings.
(c) Once we are using a linearized CFG, and the analyzer has been changed to use the updated ProgramPoint, we can consider modifying the CFG. (a) and (b) do not include any CFG changes at all.
These steps look very reasonable.
I agree this is the goal, but this is only the next step if we don't care about modeling destructors correctly. The dependencies I see are:
(1) Convert the static analyzer to use CFGStmt instead of Stmt for ProgramPoints. This requires:
(a) updating ProgramPoint
(b) updating ExprEngine in every place it creates a node
(c) updating CheckerContext to create nodes from CFGStmt
(d) finally eliminating all recursive visitation in ExprEngine. With the introduction of the linearized CFG, we no longer needed to do recursive walking of the AST to generate nodes for subexpressions (since such nodes will already be visited). Most of this should be done, but some code will need to be killed off. This is a prerequisite because recursive visitation of subexpressions won't even be an option anymore once we move to using CFGStmts for ProgramPoints.
(2) Updating all analyses that use ProgramPoints to use CFGStmt*'s instead (e.g., LiveVariables).
Once we change the high-level interface to the CFG without changing the core CFG internals, I am fine with taking a scalpel to the current representation of the CFG as we have been discussing in this PR, but not before. It's just too big of a change to not decouple into smaller pieces.
The main change we would need to make to the static analyzer is to have StmtPoint (subclass of ProgramPoint) use CFGStmt instead of Stmt. This has a domino effect of API changes. It means every time you create an ExplodedNode you need to have a CFGStmt instead of a Stmt to capture that extra bit of context of "where you are in the CFG". In the worst case, this means passing a CFGStmt (in addition to a specific Expr) to every place where a Stmt* is currently used to create an ExplodedNode (e.g., consider checker callbacks).
This one is true. Expr* itself cannot mark a unique position on the CFG.
This has a whole trickle of implications. We will need to file blocking PRs for those issues before making any CFG changes.
"This can be seen by observing that there is a path in the CFG from the entry block to the exit block where destructor ~B is called but the constructor B() is NOT called."
This is absolutely bogus. We should never construct a CFG where this is possible. This is really important for having dataflow analyses in the frontend do the right thing.
I agree. To solve this problem, we will have multiple CFGElements pointing to the same Stmt. But I don't see this as a problem. What problem would we have with multiple CFGElements pointing to the same Stmt?
There were two problems I thought of, but after more thinking I don't believe one of them is an actual problem, and the other isn't a problem either, beyond requiring good API design.
The first problem was: do we need to change Environment::getSVal()? For some reason, I thought we needed to change Environment to map from CFGStmts to SVals, instead of Stmts to SVals. If we did that, it would be problematic going from CFGStmts to "referenced" CFGStmts (e.g., child expressions) in order to query the value of subexpressions, etc. I then realized that this change is not necessary. If an Expr appears multiple times in a CFG, it just means that we can have multiple places in the CFG where an Expr's binding in the Environment can get set. That's no big deal. Also, analyses such as LiveVariables, UninitializedValues, etc., wouldn't need to be changed. An expression could just have multiple definitions for the associated values, but that wouldn't break any assumptions.
Yes. As long as we don't need to have multiple definitions for an Expr in a single "state", it is fine to have an Expr appear multiple times in the CFG.
I guess CFGStmts with the same Stmt will appear in "parallel" CFG blocks, but not in "sequential" blocks. That means the value for the same Stmt would get merged when the control flow merges.
If they appeared in "sequential" blocks, we might need multiple mappings for them in a single state. In that case, we would need to change the mapping key to CFGStmt. But I haven't thought of a case where this could happen.
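A simplified sketch of the invariant being discussed, with hypothetical types (the analyzer's real Environment and SVal are more involved):

#include <map>

class Expr;              // clang AST node, opaque here
struct SVal { int v; };  // stand-in for the analyzer's symbolic value

// The binding is keyed by Expr*, not by CFG position. If the same
// Expr* occurs in two "parallel" blocks, each path's state holds
// exactly one binding for it, and states merge at the join like any
// other fact. Only "sequential" re-bindings that must coexist in one
// state would force the key to become a CFGStmt (Stmt + CFG slot).
using Environment = std::map<const Expr *, SVal>;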
I'm excited about this direction, however. It really unlocks a ton of doors, and lets us do creative things in the CFG to solve a lot of correctness issues, as well as possibly make basic flow-sensitive analyses smarter by having ways to enhance the base CFG with more path-sensitivity.
Yes! The next step is, for this code:
if (A && B) foo(); else bar();
we should have a CFG like:
B1: A      T: A && ...   succ: B2, B3
B2: B      T: ... && B   succ: B4, B3
B3: bar()                succ: B5
B4: foo()                succ: B5
B5: EXIT
We need a bit in CFGTerminator to indicate which sub-expression's value the terminator should take. In this case we don't have the same Expr* appear in multiple CFGElements, only in CFG terminators. For C++, we would have that case happen.
We could also make it explicit that certain blocks are "clean-up blocks" and should only get executed when another preceding block was executed. This could be done by either marking the block, or marking the predecessor/successor edges. In the example with "if (C b = A(x) || B(y))" with destructors that would result in B3 and B4 being marked as clean-up, and the terminator for B5 would need to be changed to something like a "clean-up condition". That condition doesn't need to refer to [B6.8], because a path-sensitive pass will "know" if it executed B6 (well, it always will) and B7. I did not explore this very far, but my guess is that this might also help a lot with EH.
I disagree. The property I think we should aim for in the CFG is that cleanup block (if a separate block is needed) should unconditionally follow code that represents the logic that needs to be cleaned up. While we shouldn't aim to unroll all control-dependencies into the CFG, I think this one is fairly fundamental. If we make the cleanup blocks predicated, then that means every consumer of the CFG needs to understand that those blocks are predicated on some other block being "analyzed" (whatever that means for the particular analysis). At that point, we've pushed the complexity from the CFG construction to the individual analyses.
Scoped cleanup is a pretty fundamental concept that I just think we should "get right" in the CFG. I want the CFG to be useful to flow-sensitive analyses, not just path-sensitive ones.
Now I would think that the CFG implicitly captures the dependency: the AST Stmt pointers for both CFGElements would be the same, right?
Yes. We would have multiple CFGElements that reference the same Stmt. This would currently break an invariant that is used in the static analyzer. That said, we can fix that. Instead of using Stmt for analysis locations, we use CFGStmt (with the pointer being a slot in a CFGBlock). This would essentially be a (Stmt + context) pair that would be needed to distinguish multiple references to the same Stmt* in a CFG.
After posting this comment, in the background my mind started unwinding all the implications of such a change. While the changes to the CFG wouldn't be conceptually hard, the changes to clients would be huge. For example, in the static analyzer, how would Environment::getSVal() work? It could use a CFGStmt, but then how would we get a "child" CFGStmt from another one?
So say we introduce a CFGStmt, and multiple CFGStmt can hold the same Stmt. Then the context is the place of the CFGStmt in the CFG.
Yes, essentially. We already have a CFGStmt. It's just a pointer to an element in a CFGBlock. The idea is that we would use CFGStmt in ProgramPoint instead of Stmt*.
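A minimal sketch of that identity, with hypothetical names (the real clang CFGStmt differs in its details):

class Stmt;      // clang AST node, opaque here
class CFGBlock;  // clang CFG basic block, opaque here

// A program point identified by the slot in the CFG rather than the
// bare Stmt*, so two occurrences of the same Stmt* in different
// blocks (or slots) compare unequal.
struct CFGStmtLoc {
  const CFGBlock *Block;  // which basic block
  unsigned Index;         // which element within the block
  const Stmt *S;          // the AST node stored at that slot
};

inline bool operator==(const CFGStmtLoc &L, const CFGStmtLoc &R) {
  return L.Block == R.Block && L.Index == R.Index;
}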
Would it then work for Environment::getSVal() if EnvironmentEntry were changed to have a CFGStmt instead of a Stmt? (I do not have any deep knowledge of the static analyzer, so I can very well be wrong.)
This is where I got a bit confused, but I think we would just leave EnvironmentEntry using Stmt instead of CFGStmt. I think that's completely fine. It just means that an expression can have its value bound at multiple places in the CFG. If we think of Stmt's as just being "variable names" in the Environment, then we are just reusing the same variables to bind new values at different points in the CFG. I think this would work because there is never a case where we need to distinguish in the Environment a value bound to a Stmt at location X in the CFG from a value bound at location Y in the CFG.
"This can be seen by observing that there is a path in the CFG from the entry block to the exit block where destructor ~B is called but the constructor B() is NOT called."
This is absolutely bogus. We should never construct a CFG where this is possible. This is really important for having dataflow analyses in the frontend do the right thing.
I agree. To solve this problem, we will have multiple CFGElements pointing to the same Stmt. But I don't see this as a problem. What problem would we have when multiple CFGElement pointing to the same Stmt?
There were two problems I thought of, but after more thinking I don't think 1 of them is an actual problem, and the other isn't a problem either other than it requires good API design.
The first problem was do we need to change Environment::getSVal()? For some reason, I thought we needed to change Environment to map from CFGStmts to SVals, instead of Stmts to SVals. If we did that, it would be problematic going from CFGStmts to "referenced" CFGStmts (e.g., child expressions) in order to query the value of subexpressions, etc. I then realized that this change is not necessary. If an Expr appears multiple times in a CFG, it just means that we can have multiple places in the CFG where an Expr's binding in the Environment can get set. That's no big deal. Also, analyses such as LiveVariables, UninitializedValues, etc., wouldn't need to be changed. An expression could just have multiple definitions for the associated values, but that wouldn't break any assumptions.
The main change we would need to make to the static analyzer is to have StmtPoint (subclass of ProgramPoint) use CFGStmt instead of Stmt. This has a domino effect of API changes. It means every time you create an ExplodedNode you need to have a CFGStmt instead of a Stmt to capture that extra bit of context of "where you are in the CFG". In the worst case, this means passing a CFGStmt (in addition to a specific Expr) to every place where a Stmt* is currently used to create an ExplodedNode (e.g., consider checker callbacks).
I'm excited about this direction, however. It really unlocks a ton of doors, and let's us do creative things in the CFG to solve a lot of correctness issues as well as possibly make basic flow-sensitive analyses smarter by having ways to enhance the base CFG with more path-sensitivity.
Now I would think that the CFG implicitly captures the dependency: the AST Stmt pointers for both CFGElements would be the same, right?
Yes. We would have multiple CFGElements that reference the same Stmt. This would currently break an invariant that is used in the static analyzer. That said, we can fix that. Instead of using Stmt for analysis locations, we use CFGStmt (with the pointer being a slot in a CFGBlock). This would essentially be a (Stmt + context) that would be needed to distinguish multiple references of the a Stmt* in a CFG.
After posting this comment, in the background my mind started unwinding all the implications of such a change. While the changes to the CFG wouldn't be conceptually hard, the changes to clients would be huge. For example, in the static analyzer, how would Environment::getSVal() work? It could use a CFGStmt, but then how would we get a "child" CFGStmt from another one?
So say we introduce a CFGStmt, and multiple CFGStmt can hold the same Stmt. Then the context is the place of the CFGStmt in the CFG. Would it then work for Environment::getSVal() when EnvironmentEntry is changed to have a CFGStmt instead of a Stmt? (I do not have any deep knowledge of the static analyser, so I can very well be wrong..)
We could also make it explicit that certain blocks are "clean-up blocks" and should only get executed when another preceding block was executed. This could be done by either marking the block, or marking the predecessor/successor edges. In the example with "if (C b = A(x) || B(y))" with destructors that would result in B3 and B4 being marked as clean-up, and the terminator for B5 would need to be changed to something like a "clean-up condition". That condition doesn't need to refer to [B6.8], because a path-sensitive pass will "know" if it executed B6 (well, it always will) and B7. I did not explore this very far, but my guess is that this might also help a lot with EH.
In CodeGen, the Exprs are visited from outside in, and the copy constructor is elided. The temporary object is constructed directly into the declared variable. So there is only one object actually constructed, and only one destructor needs to be called.
From the point of view of the AST, I'd like to interpret it as follows. There are two points of object construction: the DeclStmt and the MaterializeTemporaryExpr. So two destructor calls are correct.
In static analysis, things get more complicated, since we are evaluating AST Exprs inside out. Each time we see a non-elidable CXXConstructExpr, we conjure an object region. So for a MaterializeTemporaryExpr we just propagate the sub-expression's value to it. And it does materialize an object. This is different from CodeGen, where the destination object is passed in from outside. And this behavior is consistent with the behavior of CFG building, where each CXXBindTemporaryExpr introduces a temporary destructor call.
I'll need time to digest this, but I have two points.
First, the CFG is still not correct. In my above comment:
"This can be seen by observing that there is a path in the CFG from the entry block to the exit block where destructor ~B is called but the constructor B() is NOT called."
This is absolutely bogus. We should never construct a CFG where this is possible. This is really important for having dataflow analyses in the frontend do the right thing.
Second, we should not be beholden to the AST's exact representation of things to get good semantics in the CFG. The CFG should reflect the semantics we want, structured with control flow. Using Stmt*'s for CFGElements is one way to do this, but there is no reason we cannot create new CFGElements to more accurately reflect the semantics for constructors, etc.
I still think it is worth considering how we can generalize the CFG infrastructure to come up with optimal solutions to problems like these.
Making this change would involve changing a fair amount of code, but that change would be largely mechanical, and likely no more than a few hours of work. Doing so opens some doors with the CFG. The restriction that a Stmt* cannot appear more than once in a CFG limits us in many ways. For instance, it falsely limits us in considering possible solutions to the problem in this PR. Second, it prevents us from doing some intelligent transformations to a CFG, e.g., creating a "predicated CFG", where we introduce duplication in the CFG in order to provide some basic path-sensitivity for ordinary flow-sensitive analyses. For example, imagine that -Wuninitialized could use a "predicated CFG" that has just the basic path-sensitive modifications that it needs to do a good job, and -Wuninitialized (and other frontend warnings) can remain with simple implementations that implement essentially the textbook definition of those analyses instead of something that is overly complicated (for instance, the implementation of -Wuninitialized tries to retrofit some path-sensitivity into the analysis, but it doesn't do a very good job and that hack is local to that analysis).
Anyway, eliminating this terminator duplication would lead to CFGElement duplication: the construction of C would be duplicated in every path. This can make other types of analyses (e.g. inlining constructors) quite expensive. I also think that merging the paths at some point (e.g. for code generation) will also be expensive.
We would never inline constructors into the CFG. There is never a reason to do that. Analyses that would want to simulate inlining would just need to construct multiple CFGs and use them together.
You are right that some duplication may occur, but that is no worse than what we already do in the LLVM IR.
It's worth noting that the CFGBuilder also visits expressions from outside in, just like CodeGen. This means that the CFGBuilder can capture additional context needed to properly render semantics into the CFG (e.g., eliding the copy constructor). We used to pass more information down in the visitor when the lvalue-to-rvalue semantics were not properly captured in the AST. When we introduced ImplicitCastExprs for lvalue-to-rvalue conversions, we were able to remove the need to pass this context around, but I see no reason why the CFGBuilder can't just "do the right thing" when building the CFG in the first place.
As for the static analyzer, it isn't strictly evaluating Exprs from inside out; it's evaluating them according to how they were ordered in the CFG. We are free to introduce new constructs in the CFG to make the analysis semantics easier to reason about.
The CFG should not be viewed strictly as a shim on top of the AST. It should be viewed as a high-level IR, which consists of AST nodes as well as other constructs, to faithfully represent the control-flow of the semantics of a function.
Going back to the terminators for B5 and B6:
Notice that blocks B6 and B5 have the same terminator, which itself is suspect. The rationale is that the temporary for 'B(y)' should only be destructed when we evaluated the RHS of the logical operation. This is good enough for the static analyzer, but the CFG itself doesn't explicitly capture this dependency. This can be seen by observing that there is a path in the CFG from the entry block to the exit block where destructor ~B is called but the constructor B() is NOT called. While the static analyzer will likely get this right and recover that information (since it does enough value flow analysis to understand control-dependent branches), this unnecessarily drops a bunch of control-dependencies from the CFG which will degrade the quality of CFG-based analyses in the frontend.
I get your point in why this would be problematic if the result of the condition is not taken into account. Now I would think that the CFG implicitly captures the dependency: the AST Stmt pointers for both CFGElements would be the same, right?
So I'm not sure if we should eliminate the duplicate terminator.
(IfStmt 0x39d87c0 <line:26:3, line:27:16>
  (DeclStmt 0x39d87f0 <line:26:7, col:24>
    0x39d7ba0 "C b =
      (ExprWithCleanups 0x39d85b8 <col:9, col:24> 'class C'
        (CXXConstructExpr 0x39d8580 <col:9, col:24> 'class C' 'void (const class C &) throw()' elidable
          (MaterializeTemporaryExpr 0x39d8568 <col:13, col:24> 'const class C' lvalue
            (ImplicitCastExpr 0x39d8550 <col:13, col:24> 'const class C'
Theoretically, there are two objects constructed during the evaluation of the DeclStmt for "b". One is the declared variable; the other is the temporary object of the MaterializeTemporaryExpr, which is copied into the declared variable.
For the elidable CXXConstructExpr, we do not conjure an object. Then when we see the DeclStmt, we bind the only conjured object to it.
Together with the implicit destructor for the declared variable "b", we have two destructor calls. This is consistent with the number of objects we created.
If the copy constructor is not elidable, we would have an extra object created. I don't have a suitable way to deal with that case yet.
So my point is that the current CFG is correct with respect to the number of destructors. The linearized CFG does pose difficulties for CXXConstructExpr evaluation, but the problem is not that serious; we just have extra objects constructed.
Created attachment 7881 [details] Put destructor calls before return.
As posted on cfe-commits: the patch puts the destructor calls before the return statement.
This looks good to me, but I'd be more comfortable if the test case was expanded to include a case where the return expression creates a temporary with a destructor, e.g.:
return A();
My concern is that maybe the CFG is already "correct", and we need to adjust our interpretation: "return" isn't necessarily the last thing in a CFGBlock.
Yeah the CFG is already correct, and the ReturnStmt is not treated as a terminator in the CFG.
Just to be clear, the CFG isn't correct in its entirety; you are just referring to the part with the destructors after the ReturnStmt?
Yes. I am referring to this part.
Which part do you think is still not correct? Are you referring to removing the extra join point for "&&" and "||" operators in the CFG? Do we need to open another bugzilla entry for that?
I am referring specifically to this part:
"The second, and more glaring problem, is that ~C is called twice along a path, both as an implicit destructor (in B1 and B2 respectively) and a destructor for a temporary (in B5)."
This comment also points out an imprecision that is bad for analyses in the frontend, and thus should be considered a correctness issue:
"This is good enough for the static analyzer, but the CFG itself doesn't explicitly capture this dependency."
Also, look at where the destructors are called. In block B1, ~C is correctly called before the call to bar(), but in block B2, the destructor appears after the return statement. Instead, it should appear immediately after the call to foo().
Following on this original point, I think this part of the CFG could be interpreted as being correct already (I'm countering what I said before).
Exactly. If we treat an occurrence of "return" in the CFG as meaning "bind an expression result for the return value" and not as a transfer of control, then it is fine for the destructors to appear after the "return". From this view, the transfer back to the caller happens when we hit the Exit block.
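Under that reading, a block that ends like B2 in the dump at the top of this thread is fine as-is; illustratively:

[B2]
  ...
  4: return [B2.3];                      // binds the return value
  5: [B5.9].~C() (Implicit destructor)   // still runs before EXIT
  Succs (1): B0                          // actual transfer back to the caller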
The patch looks good to me.
After this patch, from the point of view of the analyzer, the CFG is "correct" now. But removing the extra join points is still the way to go.
The second, and more glaring problem, is that ~C is called twice along a path, both as an implicit destructor (in B1 and B2 respectively) and a destructor for a temporary (in B5). I may be missing something here, but this doesn't show up in the LLVM IR (which clearly shows a single destructor call to ~C).
The constructor for the declared "C b" is elidable, so that the temporary is constructed directly in "b". I believe there should be only one ~C() call in the CFG along any path.
On second thought, it might be correct to have destructors for elidable CXXConstructExprs appear in the CFG, and let the clients decide how to deal with them.
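For reference, a minimal illustration of the elision in question (hand-written, reusing class C from the test case):

  C b = C(1);  // the copy is elidable: C(1) is constructed directly into 'b'
  // With elision there is exactly one C object, so only b.~C() ever runs.
  // Without elision the semantics would be roughly:
  //   C tmp(1);  // temporary
  //   C b(tmp);  // copy construction
  //   tmp.~C();  // temporary destructor -- the extra ~C() the CFG records
  // Whether the temporary destructor is "real" therefore depends on whether
  // the client assumes the elision is performed.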
"My motivation for looking at this CFG was to further optimize this so that block B5 was seen as dominating block B2."
Do you mean "block B5 was seen as dominating block B1"?
Extended Description
The source-level CFG currently is very imprecise for modeling C++ destructors and '||'/'&&' short-circuit operations.
Consider:
$ cat /tmp/test.cpp
int foo();
int bar();

class A {
public:
  A(int x);
  ~A();
  operator bool();
};

class B {
public:
  B(int x);
  ~B();
  operator bool();
};

class C {
public:
  C(int x);
  ~C();
  operator bool();
};

int test(int x, int y) {
  if (C b = A(x) || B(y))
    return foo();
  return bar();
}
Let's look at the CFG without destructors:
$ clang -cc1 -analyze -analyzer-checker=debug.DumpCFG /tmp/test.cpp

[B6 (ENTRY)]
  Succs (1): B4

[B1]
  1: bar
  2: [B1.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
  3: [B1.2]()
  4: return [B1.3];
  Preds (1): B3
  Succs (1): B0

[B2]
  1: foo
  2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
  3: [B2.2]()
  4: return [B2.3];
  Preds (1): B3
  Succs (1): B0

[B3]
  1: [B4.8] || [B5.8]
  2: [B3.1] (ImplicitCastExpr, IntegralCast, int)
  3: [B3.2] (CXXConstructExpr, class C)
  4: [B3.3] (ImplicitCastExpr, ConstructorConversion, class C)
  5: [B3.4] (BindTemporary)
  6: [B3.5] (ImplicitCastExpr, NoOp, const class C)
  7: [B3.6]
  8: C b = A(x).operator _Bool() || B(y).operator _Bool();
  9: b
  10: [B3.9].operator _Bool
  11: [B3.10]()
  12: [B3.11] (ImplicitCastExpr, UserDefinedConversion, _Bool)
  T: if [B3.12]
  Preds (2): B5 B4
  Succs (2): B2 B1

[B4]
  1: x
  2: [B4.1] (ImplicitCastExpr, LValueToRValue, int)
  3: [B4.2] (CXXConstructExpr, class A)
  4: [B4.3] (BindTemporary)
  5: A([B4.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
  6: [B4.5].operator _Bool
  7: [B4.6]()
  8: [B4.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
  T: [B4.8] || ...
  Preds (1): B6
  Succs (2): B3 B5

[B5]
  1: y
  2: [B5.1] (ImplicitCastExpr, LValueToRValue, int)
  3: [B5.2] (CXXConstructExpr, class B)
  4: [B5.3] (BindTemporary)
  5: B([B5.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
  6: [B5.5].operator _Bool
  7: [B5.6]()
  8: [B5.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
  Preds (1): B4
  Succs (1): B3

[B0 (EXIT)]
  Preds (2): B1 B2
This CFG behaves as expected. We have a single block with the '||' operation as a terminator (block B4), and a single block that merges the values from the LHS and RHS expressions (block B3), which is then terminated with the 'if' branch terminator. My motivation for looking at this CFG was to further optimize it so that block B5 was seen as dominating block B2. This would improve the quality of some flow-sensitive (but path-insensitive) analyses in the frontend and also simplify the CFG.
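To illustrate why that dominance information matters (a made-up example, not from the report): a flow-sensitive frontend check such as an uninitialized-value warning can prove 'z' initialized below only if the CFG makes explicit that the false branch is reachable solely through the RHS block:

  int g(int x, int y) {  // hypothetical function, reusing classes A and B
    int z;
    if (A(x) || (z = y, B(z)))
      return 0;
    // Reached only when both operands were false, which implies the RHS ran
    // and 'z' was assigned; an analysis can conclude that only if the RHS
    // block dominates this block in the CFG.
    return z;
  }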
The CFG gets significantly more complicated (and wrong) when we add implicit destructors:
$ clang -cc1 -analyze -cfg-add-implicit-dtors -analyzer-checker=debug.DumpCFG /tmp/test.cpp
[B8 (ENTRY)]
  Succs (1): B6

[B1]
  1: [B3.2].~C() (Implicit destructor)
  2: bar
  3: [B1.2] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
  4: [B1.3]()
  5: return [B1.4];
  Preds (1): B3
  Succs (1): B0

[B2]
  1: foo
  2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
  3: [B2.2]()
  4: return [B2.3];
  5: [B3.2].~C() (Implicit destructor)
  Preds (1): B3
  Succs (1): B0

[B3]
  1: ~A() (Temporary object destructor)
  2: C b = A(x).operator _Bool() || B(y).operator _Bool();
  3: b
  4: [B3.3].operator _Bool
  5: [B3.4]()
  6: [B3.5] (ImplicitCastExpr, UserDefinedConversion, _Bool)
  T: if [B3.6]
  Preds (2): B4 B5
  Succs (2): B2 B1

[B4]
  1: ~B() (Temporary object destructor)
  Preds (1): B5
  Succs (1): B3

[B5]
  1: [B6.8] || [B7.8]
  2: [B5.1] (ImplicitCastExpr, IntegralCast, int)
  3: [B5.2] (CXXConstructExpr, class C)
  4: [B5.3] (ImplicitCastExpr, ConstructorConversion, class C)
  5: [B5.4] (BindTemporary)
  6: [B5.5] (ImplicitCastExpr, NoOp, const class C)
  7: [B5.6]
  8: ~C() (Temporary object destructor)
  T: [B6.8] || ...
  Preds (2): B7 B6
  Succs (2): B3 B4

[B6]
  1: x
  2: [B6.1] (ImplicitCastExpr, LValueToRValue, int)
  3: [B6.2] (CXXConstructExpr, class A)
  4: [B6.3] (BindTemporary)
  5: A([B6.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
  6: [B6.5].operator _Bool
  7: [B6.6]()
  8: [B6.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
  T: [B6.8] || ...
  Preds (1): B8
  Succs (2): B5 B7

[B7]
  1: y
  2: [B7.1] (ImplicitCastExpr, LValueToRValue, int)
  3: [B7.2] (CXXConstructExpr, class B)
  4: [B7.3] (BindTemporary)
  5: B([B7.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
  6: [B7.5].operator _Bool
  7: [B7.6]()
  8: [B7.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
  Preds (1): B6
  Succs (1): B5

[B0 (EXIT)]
  Preds (2): B1 B2
Notice that blocks B6 and B5 have the same terminator, which is itself suspect. The rationale is that the temporary for 'B(y)' should only be destructed when we have evaluated the RHS of the logical operation. This is good enough for the static analyzer, but the CFG itself doesn't explicitly capture this dependency. This can be seen by observing that there is a path in the CFG from the entry block to the exit block where the destructor ~B is called but the constructor B() is NOT called. While the static analyzer will likely get this right and recover that information (since it does enough value-flow analysis to understand control-dependent branches), this unnecessarily drops a bunch of control dependencies from the CFG, which will degrade the quality of CFG-based analyses in the frontend.
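To make the dropped dependency concrete, here is one such path, traced by hand through the dump above (not tool output):

  B8 (ENTRY) -> B6   A(x) is constructed and converts to true
             -> B5   terminator re-tests [B6.8] || ...; the false edge is graph-legal here
             -> B4   ~B() (Temporary object destructor) runs, yet B(y) was never constructed on this path
             -> B3 -> B1/B2 -> B0 (EXIT)

Nothing in the graph structure ties the B5 -> B4 edge to whether B7 was visited; that is exactly the missing control dependency.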
The second, and more glaring problem, is that ~C is called twice along a path, both as an implicit destructor (in B1 and B2 respectively) and a destructor for a temporary (in B5). I may be missing something here, but this doesn't show up in the LLVM IR (which clearly shows a single destructor call to ~C).
Also, look at where the destructors are called. In block B1, ~C is correctly called before the call to bar(), but in block B2, the destructor appears after the return statement. Instead, it should appear immediately after the call to foo().
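For comparison, a hand-written sketch (not tool output) of how B2 would read with the destructor placed as in B1, which is what the patch discussed above does:

  [B2]
    1: foo
    2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
    3: [B2.2]()
    4: [B3.2].~C() (Implicit destructor)
    5: return [B2.3];
    Preds (1): B3
    Succs (1): B0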