Closed by kmnls 1 year ago
Nice catch! We should definitely fix this.
Since we don't just apply a limited set of hooks before performing coverage instrumentation but also let the user add custom hooks, patching an exhaustive list into JaCoCo doesn't seem feasible.
Could you check whether making coverage instrumentation the first rather than the last pass in https://github.com/CodeIntelligenceTesting/jazzer/blob/0420f57d2b03eb93fa52eea1aa306e33c68288e8/src/main/java/com/code_intelligence/jazzer/agent/RuntimeInstrumentor.kt#L221-L232 fixes the problem for your reproducer?
Background: Our coverage instrumentation used to emit relatively complicated byte code that would trigger the tracing instrumentation, but this is no longer the case - we emit a single method call that shouldn't trigger any of the other instrumentation passes.
It seems to me that all user-defined custom hooks intercept method calls: BEFORE/AFTER/REPLACE are all tightly bound to INVOKE instructions. The only different ones are the 'trace' callbacks, which are bound to various other opcodes. Thus, there is no reason to expect the list to grow.
I'll check reversing the order of hook and coverage instrumentation and let you know.
In my configuration for this exact project, at first glance, the proposed change of instrumentation order has the same effect as the proposed patch. I have tried:
private fun instrument(internalClassName: String, bytecode: ByteArray, fullInstrumentation: Boolean): ByteArray {
    val classWithHooksEnabledField = if (Opt.conditionalHooks) {
        // Let the hook instrumentation emit additional logic that checks the value of the
        // hooksEnabled field on this class and skips the hook if it is false.
        "com/code_intelligence/jazzer/runtime/JazzerInternal"
    } else {
        null
    }
    return ClassInstrumentor(internalClassName, bytecode).run {
        if (fullInstrumentation) {
            coverageIdSynchronizer.withIdForClass(internalClassName) { firstId ->
                coverage(firstId).also { actualNumEdgeIds ->
                    CoverageRecorder.recordInstrumentedClass(
                        internalClassName,
                        bytecode,
                        firstId,
                        actualNumEdgeIds,
                    )
                }
            }
            // Hook instrumentation must be performed after data flow tracing as the injected
            // bytecode would trigger the GEP callbacks for byte[]. Coverage instrumentation
            // must be performed after hook instrumentation as the injected bytecode would
            // trigger the GEP callbacks for ByteBuffer.
            traceDataFlow(instrumentationTypes)
            hooks(includedHooks + customHooks, classWithHooksEnabledField)
        } else {
            hooks(customHooks, classWithHooksEnabledField)
        }
        instrumentedBytecode
    }
}
@kmnls That looks promising. Could you submit a PR with this alternative approach? It does seem simpler, and not having to patch JaCoCo more than absolutely necessary will help us maintain our dependencies.
See #711
Reason: JaCoCo reports wrong coverage because the coverage instrumentation happens after the hooks are applied. The instrumentation is supposed to follow the JaCoCo algorithm, but in fact the two produce different results. Both algorithms inject a control point at the end of each basic block (BB) that contains an INVOKE; BBs without an INVOKE are not marked. Consider a BB that originally has no INVOKE but into which the hooks inject a traceCmp callback. CoverageRecorder happily adds a control point to such a BB, but JaCoCo never learns about it. As a result, all control points after that one are misplaced in the JaCoCo coverage report.
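To make the shift concrete, here is a minimal toy model of the numbering problem. It is only a sketch under the simplified rule stated above (a basic block gets a control point only if it contains an INVOKE) and is not how Jazzer or JaCoCo are actually implemented. The names Block, assignProbes and applyHooks are hypothetical, and basic blocks are modelled as plain lists of opcode names.

```kotlin
// Toy model: a method is a list of basic blocks, each a list of opcode names.
typealias Block = List<String>

// Simplified rule from the description above (an assumption, not the real
// JaCoCo algorithm): a block gets a control point only if it contains an
// INVOKE*. IDs are handed out in block order. Returns block index -> probe ID.
fun assignProbes(blocks: List<Block>): Map<Int, Int> {
    var nextId = 0
    val probes = mutableMapOf<Int, Int>()
    blocks.forEachIndexed { index, block ->
        if (block.any { it.startsWith("INVOKE") }) probes[index] = nextId++
    }
    return probes
}

// Hypothetical stand-in for the hook pass: insert a traceCmp call in front
// of every integer comparison.
fun applyHooks(blocks: List<Block>): List<Block> =
    blocks.map { block ->
        block.flatMap { insn ->
            if (insn.startsWith("IF_ICMP")) listOf("INVOKESTATIC traceCmp", insn) else listOf(insn)
        }
    }

fun main() {
    val original = listOf(
        listOf("ILOAD", "ILOAD", "IF_ICMPNE"),   // block 0: a comparison, no INVOKE
        listOf("INVOKESTATIC foo", "RETURN"),    // block 1
        listOf("INVOKESTATIC bar", "RETURN"),    // block 2
    )

    // Numbering a report derived from the original class file would expect.
    val reportView = assignProbes(original)
    // Numbering when the coverage pass runs after the hooks were injected.
    val hooksFirst = assignProbes(applyHooks(original))
    // Numbering when the coverage pass runs first; hooks added later do not renumber.
    val coverageFirst = assignProbes(original)

    println("report view:    $reportView")    // {1=0, 2=1}
    println("hooks first:    $hooksFirst")    // {0=0, 1=1, 2=2}
    println("coverage first: $coverageFirst") // {1=0, 2=1}
}
```

In this model, with hooks applied first, block 0 gains a control point that the report side does not know about, so every later ID is off by one. With the coverage pass first, both sides agree.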
Sample for test:
All the lines after if (earlyReturn) will be colored wrongly because this BB will have an additional coverage point after traceCmp.