Open MrAlias opened 1 month ago
@MrAlias What do you think should be included in the prototype?
Ideally, something like this:
package main
import (
"fmt"
"time"
)
var Flag bool
func main() {
for {
fmt.Printf("Flag: %t", Flag)
time.Sleep(time.Second)
}
// Output: Flag: false
// Auto-instrumentation added
// Output: Flag: true
}
We may also need to look into having the global package call something like:
func (t *tracer) start(eBPFFlag *bool, ctx context.Context, name string, opts ...trace.SpanStartOption) (context.Context, trace.Span) {
/* ... */
}
and instrument that instead of Start
so we can find the flag location.
@MrAlias I did a POC for the second option we talked about. It is in https://github.com/open-telemetry/opentelemetry-go-instrumentation/compare/main...RonFed:opentelemetry-go-instrumentation_fork:flag_POC
Here are some logs from the app being instrumented and the agent:
rolldice-1 | 2024-08-09T15:42:14.291Z INFO app/main.go:64 starting http server {"port": ":8080"} rolldice-1 | 2024-08-09T15:42:14.291Z INFO app/main.go:52 probed function called with flag unset go-auto-1 | {"level":"info","ts":1723218134.4202611,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:86","msg":"building OpenTelemetry Go instrumentation ...","globalImpl":false} rolldice-1 | 2024-08-09T15:42:15.295Z INFO app/main.go:52 probed function called with flag unset rolldice-1 | 2024-08-09T15:42:16.297Z INFO app/main.go:52 probed function called with flag unset go-auto-1 | {"level":"info","ts":1723218136.4233193,"logger":"Instrumentation.Analyzer","caller":"process/discover.go:67","msg":"found process","pid":12030} go-auto-1 | {"level":"info","ts":1723218136.4287593,"logger":"Instrumentation.Allocate","caller":"process/allocate.go:73","msg":"Detaching from process","pid":12030} go-auto-1 | {"level":"info","ts":1723218136.4289303,"logger":"Instrumentation","caller":"app/instrumentation.go:144","msg":"target process analysis completed","pid":12030,"go_version":"1.22.5","dependencies":{"go.uber.org/multierr":"1.10.0","go.uber.org/zap":"1.27.0","std":"1.22.5"},"total_functions_found":3} go-auto-1 | {"level":"info","ts":1723218136.428993,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:119","msg":"starting instrumentation..."} go-auto-1 | {"level":"info","ts":1723218136.4319375,"logger":"Instrumentation.Manager","caller":"instrumentation/manager.go:198","msg":"loading probe","name":"net/http/server"} go-auto-1 | {"level":"info","ts":1723218136.4832294,"logger":"Instrumentation.Manager","caller":"instrumentation/manager.go:198","msg":"loading probe","name":"main/client"} go-auto-1 | {"level":"info","ts":1723218136.4842162,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:115","msg":"instrumentation loaded successfully"} rolldice-1 | 2024-08-09T15:42:17.299Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:18.300Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:19.301Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:20.302Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:21.303Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:22.304Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:23.305Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:24.307Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:25.308Z INFO app/main.go:50 probed function called with flag set rolldice-1 | 2024-08-09T15:42:26.309Z INFO app/main.go:50 probed function called with flag set
Awesome! Let me see about getting an issue created in https://github.com/open-telemetry/opentelemetry-go to make the proposal based on these findings.
opentelemetry-go proposal: https://github.com/open-telemetry/opentelemetry-go/issues/5702
@RonFed PTAL
@MrAlias LGTM,
In the suggested change, do you think we should probe the tracer.newSpan
(which has the flag pointer) or the interop.newSpan
(which has the configuration ready)?
Ideally, we should build it in a way that we can handle both in a single eBPF program if it is possible.
@MrAlias LGTM, In the suggested change, do you think we should probe the
tracer.newSpan
(which has the flag pointer) or theinterop.newSpan
(which has the configuration ready)? Ideally, we should build it in a way that we can handle both in a single eBPF program if it is possible.
I was thinking tracer.newSpan
as that would allow us to set the bool and trace in one probe. Right?
I was thinking tracer.newSpan as that would allow us to set the bool and trace in one probe. Right?
Yea, that sounds good, we'll need to have a return probe for that function as well simillar to what we have today - in order to save the returned span pointer in a map.
I was thinking tracer.newSpan as that would allow us to set the bool and trace in one probe. Right?
Yea, that sounds good, we'll need to have a return probe for that function as well simillar to what we have today - in order to save the returned span pointer in a map.
Correct.
With the custom sdk we're providing, do we still need the probes for the global instrumentation that we already have? Or could we just implement those functions in our sdk (Start, End, SetAttributes, etc)?
In that case we would only need to auto-instrument the custom sdk's ended
function so that the object gets passed to the auto-agent's export pipeline.
With the custom sdk we're providing, do we still need the probes for the global instrumentation that we already have? Or could we just implement those functions in our sdk (Start, End, SetAttributes, etc)?
In that case we would only need to auto-instrument the custom sdk's
ended
function so that the object gets passed to the auto-agent's export pipeline.
Correct. After we have opentelemetry-go use our SDK we won't need to instrument the nonRecordingSpan
or tracer
from go.openetelemetry.io/otel/internal/global
.
We will still need to instrument the initial call to tracer.Start
to set the bool
there stating the auto-instrumentation is registered.
We will still need to instrument the initial call to
tracer.Start
to set thebool
there stating the auto-instrumentation is registered.
Hmm, we could also look into just instrumenting a bool
in our SDK that the global SDK would look up via our API. Might be something we should look into.
Initial support for interoperability with manual instrumentation was added in https://github.com/open-telemetry/opentelemetry-go-instrumentation/pull/523. This approach has issues when trying to implement methods like
IsRecording
,RecordError
, andAddEvent
, as well as the inability to capture theSpanStartOptions
which contain important definitions about the span.Provide our own SDK
An alternate solution for this interoperability was proposed in https://github.com/open-telemetry/opentelemetry-go-instrumentation/issues/352#issuecomment-1811166718:
Providing an SDK implementation like this would allow us to design minimal overhead in the Go implementation to resolve all the issues found with the current approach. Namely:
IsRecording
, the span implementation could read its value from the spanstruct
field that could be set in the eBPF spaceRecordError
could call a method like `func (s span) recordError(errMsg, errName string, cfg trace.EventConfig). This method would have all the needed information for the probe already resolved.RecordError
,AddEvent
could call aaddEvent
method designed to have all the information for the probe already resolved in the call arguments.Interoperability
The remaining issues with providing a custom SDK, is how this would be integrated into users code?
The original idea was to offer this SDK as a package under
go.opentelemetry.io/auto
. This could still be provided, but if this is the only way this solution is integrated it means user code needs to be updated to inter-op with auto-instrumentation.Requiring user code to be updated to work with auto-instrumentation is not desired. The whole point of auto-instrumentation is this kind of action would not be required.
Instead, we would like to have the default global
otel
implementation inter-op with this new SDK:var
) would be addedgo.opentelemetry.io/otel/internal/global
.false
by defaulttrue
Tracer
implementation creates a new span it will check this added flag (i.e. here), and iftrue
will return our custom SDK implementation of theSpan
instead of thenonRecordingSpan
Action Items