Open EstebanOlmedo opened 1 year ago
We use proposals for API or language changes. I don't see a reason for this to be a proposal, so taking it out of the proposal process.
CC @thanm
@EstebanOlmedo it is an interesting idea, although potentially tricky to implement. FYI there is a research paper that gives an algorithm for doing this in a systematic way.
I am curious as to how much of the overhead from a "go build -cover" run comes from the instrumentation code (e.g. counter updates) and how much is from the code that writes out the files (covmeta and covcounters) for your use case.
The current coverage build works by rewriting the source files to add instrumentation statements, which are later used to determine which lines of code were executed at runtime. The following code snippets show an example of this workflow.
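As a rough stand-in for the kind of rewritten source being described, here is a minimal sketch. The counter array x_0 and its indices mirror the names used in this issue, but the sample function sign, its body, and the exact placement of the counter updates are assumptions made for illustration, not the output of the real cover tool. The branching lives in a helper rather than in main so that it can be exercised with different inputs.

```go
package main

import "fmt"

// x_0 is a hypothetical counter array standing in for the one the
// cover tool generates; the indices mirror those used in the text.
var x_0 [6]uint32

// sign is a hypothetical sample function with three coverable units.
func sign(n int) string {
	x_0[3] = 1 // unit 1: function entry, always runs first
	if n > 0 {
		x_0[4] = 1 // unit 2: the "then" branch
		return "positive"
	}
	x_0[5] = 1 // unit 3: the fall-through path
	return "non-positive"
}

func main() {
	fmt.Println(sign(7))                // prints "positive"
	fmt.Println(x_0[3], x_0[4], x_0[5]) // prints "1 1 0"
}
```

Every path through sign sets x_0[3] before it can set x_0[4] or x_0[5], which is the property the rest of the issue relies on.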
There are 3 coverable units in the example above, but only two of them are needed to cover the whole main function in our sample program. By knowing the state of x_0[4] and x_0[5], one can infer the state of x_0[3]: if x_0[4] has a value of 1, then x_0[3] must also have it, because it's impossible to reach the block of code that x_0[4] covers without first passing through the lines that x_0[3] covers. The same reasoning applies to x_0[5].

Following this reasoning, the values of some cover statements can be inferred from the values of others. Whether a unit is inferable can be determined by generating the dominator tree from the flow graph of the function (there is a one-to-one relationship between the coverable units and the nodes of the graph): if the unit's corresponding node isn't a leaf, then the unit is inferable. The figure below shows the flow graph for the sample code (first image) and its dominator tree (second image).
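The inference rule above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the flow graph is given as a plain adjacency list with node 0 as the entry, every node is reachable from the entry, and dominators are computed with the simple iterative set-based method rather than whatever algorithm a real implementation would choose. The function names dominators, inferable and infer are hypothetical.

```go
package main

import "fmt"

// dominators returns dom where dom[v][d] reports whether d dominates v.
// succ is the adjacency list of the flow graph; node 0 is the entry.
// Assumes every node is reachable from the entry.
func dominators(succ [][]int) [][]bool {
	n := len(succ)
	preds := make([][]int, n)
	for u, vs := range succ {
		for _, v := range vs {
			preds[v] = append(preds[v], u)
		}
	}
	dom := make([][]bool, n)
	for i := range dom {
		dom[i] = make([]bool, n)
		for j := range dom[i] {
			// Entry is dominated only by itself; others start with all nodes.
			dom[i][j] = i != 0 || j == 0
		}
	}
	for changed := true; changed; {
		changed = false
		for v := 1; v < n; v++ {
			for d := 0; d < n; d++ {
				if !dom[v][d] || d == v {
					continue
				}
				// d stays only if it dominates every predecessor of v.
				for _, p := range preds[v] {
					if !dom[p][d] {
						dom[v][d] = false
						changed = true
						break
					}
				}
			}
		}
	}
	return dom
}

// inferable reports, per node, whether it is a non-leaf in the dominator
// tree, i.e. whether it strictly dominates at least one other node.
func inferable(dom [][]bool) []bool {
	out := make([]bool, len(dom))
	for v := range dom {
		for d := range dom[v] {
			if d != v && dom[v][d] {
				out[d] = true
			}
		}
	}
	return out
}

// infer reconstructs hit flags for all nodes from leaf counters only:
// every dominator of a hit leaf must have been hit as well.
func infer(dom [][]bool, hit []bool) []bool {
	skip := inferable(dom)
	out := make([]bool, len(hit))
	for v := range dom {
		if skip[v] || !hit[v] {
			continue // only leaf counters exist in the slimmed binary
		}
		for d := range dom[v] {
			if dom[v][d] {
				out[d] = true
			}
		}
	}
	return out
}

func main() {
	// Node 0 ~ x_0[3], node 1 ~ x_0[4], node 2 ~ x_0[5].
	dom := dominators([][]int{{1, 2}, {}, {}})
	fmt.Println(inferable(dom))                        // prints "[true false false]"
	fmt.Println(infer(dom, []bool{false, true, false})) // prints "[true true false]"
}
```

Note that when every leaf counter is zero, infer reports the dominating node as not hit as well, even though its code may have run if execution stopped before reaching any dominated block.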
Limitations
With this proposal a new issue arises when a non-instrumented package calls os.Exit(). Because some cover statements are erased based on the flow of execution of the instrumented function, there may be cases where the dumped information is incomplete and there is no way to recover it. The following code snippets show an example of this behavior: after building the main package with inferable units erased and executing the program, the result will report that no unit was hit, when that's not the case.

Motivation
One of the purposes of this proposal is to reduce the overhead of the runtime instrumentation used to detect dead code across codebases: instrumented binaries are run for a period of time, and a report of the executed lines of code is then gathered.