Open quasilyte opened 6 years ago
Don't know who to CC. Maybe @josharian? (Just for issue validation; perhaps this is a known and accepted limitation of the esc.go explainer.)
@dr2chase is primary for escape analysis (AFAIK); he wrote the escape analysis explainer. The other person that comes to mind is @cherrymui.
It seems to me a deeper issue is that the analysis of data flow is on the node level, where all occurrences of x are represented with the same node. For example, if I remove the second sink=x assignment, i.e.
package example

var sink *int

func fn() {
	var x *int
	x = new(int)
	sink = x
	x = new(int)
	_ = x
}
The second new(int) does not necessarily escape. With the current analysis, it does escape, because the new(int) flows to x, and x, at some point, flows to heap. In the original program, it doesn't really matter when/where x flows to heap; it just picks one.
For this, I think that the analysis would need to work on a level that tracks the order of data flow (probably some form of SSA). A more accurate location report would then follow naturally.
The second new(int) does not necessarily escape. With the current analysis, it does escape, because the new(int) flows to x, and x, at some point, flows to heap. In the original program, it doesn't really matter when/where x flows to heap; it just picks one.
I actually tried to do exactly this, to teach escape analysis to recognize that the second new(int) is unnecessary, but encountered unexpected debug output that made me wonder.
Given this code:
(Code annotated with line number for convenience, minimal example outlined after the issue description.)
Execute the command
$ go tool compile -m=2 example.go
The output on tip is (important bits are in bold):
The expected output would not report the second sink at the same location.
This is a consequence of how the graph is constructed and printed.
Simplified example:
And the output:
Expected output:
For that example, we have something like this:
The sink pseudo node has 2 source edges; both come from the x node.
The x node has 2 source edges; they come from the two different new(int) allocations.
During traversal in escwalkBody, both new(int) paths are printed using the first sink destination as a parent, so we get two identical paths. For the second destination nothing is printed, because of the osrcesc variable check that is used to avoid duplicated messages.
If osrcesc is removed, both paths are printed twice (so, 4 messages instead of 2, but 2 of them are correct). Currently, osrcesc leads to 2 messages, 1 of which is incorrect.
It's not enough to just check whether the destination node is located before the actual starting flow point, because of recursive functions:
Here the return y is a destination endpoint, and it comes before the tracked &x.
I have no good ideas on how to fix this one. Hopefully, the insights above can help someone work out a solution.