open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/
Apache License 2.0
5.14k stars 1.04k forks source link

trace: Add custom span generator. #5725

Open tttoad opened 3 weeks ago

tttoad commented 3 weeks ago

I would like to support reuse of Span to reduce memory usage.

Support for setting up span generators.

type SpanInfo struct {
    Parent      trace.SpanContext
    SpanContext trace.SpanContext
    SpanKind    trace.SpanKind
    Name        string
    StartTime   time.Time
    SpanLimit   SpanLimits
    Tracer      trace.Tracer
}

type SpanGenerator interface {
    New(SpanInfo) ReadWriteSpan
}

func WithSpanGenerator(s SpanGenerator) TracerProviderOption {
    return traceProviderOptionFunc(func(cfg tracerProviderConfig) tracerProviderConfig {
        if s != nil {
            cfg.spanGenerator = s
        }
        return cfg
    })
}

func (s *RecordingSpan) ResetRecordingSpan(SpanInfo) *RecordingSpan {
    return &RecordingSpan{
        // ....
    }
}

type CustomSpan struct {
    trace.RecordingSpan
}

func (c *CustomSpan) End(opts ...tt.SpanEndOption) {
    c.RecordingSpan.End(opts...)
    P.Put(c)
}

func GetSpan(s trace.SpanInfo) trace.ReadWriteSpan {
    span := P.Get().(*CustomSpan)
    span.ResetRecordingSpan(s)
    return span
}

var P = sync.Pool{
    New: func() interface{} {
        return &CustomSpan{}
    },
}

For asynchronous tracking, new ParentSpan needs to be copied in CopyCtx(). In addition, Span's life cycle should end after the Span.End() method call. Exporter uses a snapshot of the span.

dmathieu commented 3 weeks ago

You should write a specification change before this can be part of the stable API.

MrAlias commented 3 weeks ago

For asynchronous tracking, new ParentSpan needs to be copied in CopyCtx(). In addition, Span's life cycle should end after the Span.End() method call. Exporter uses a snapshot of the span.

There is no guarantee that a span reference is released when it is ended. Your pool members are leaking.

If the core of this proposal is to use a pool of spans, this will be flawed.

tttoad commented 3 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

MrAlias commented 3 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

How does the user ensure things like the BatchSpanProcessor no longer holds a reference to the span?

tttoad commented 3 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

How does the user ensure things like the BatchSpanProcessor no longer holds a reference to the span?

In function span.End(), passed to BatchSpanProcessor is a snapshot of the span. https://github.com/open-telemetry/opentelemetry-go/blob/002c0a4c0352a56ebebc13f3ec20f73c23b348f6/sdk/trace/span.go#L432

MrAlias commented 3 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

How does the user ensure things like the BatchSpanProcessor no longer holds a reference to the span?

In function span.End(), passed to BatchSpanProcessor is a snapshot of the span.

https://github.com/open-telemetry/opentelemetry-go/blob/002c0a4c0352a56ebebc13f3ec20f73c23b348f6/sdk/trace/span.go#L432

How is allocating a snapshot in addition to the Span implementation more efficient?

MrAlias commented 3 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

How do you account for the cases where you do not manage the span? For example third-party instrumentation?

tttoad commented 3 weeks ago

@MrAlias保证 span 释放由用户实现。用户可以保证在 span.end() 之后不使用 span,他们可以注入 span 生成器来实现 span 池。本期只是想支持注入 span 生成器并提供默认 span 生成。

用户如何确保诸如BatchSpanProcessor不再持有对跨度的引用?

在函数中span.End(),传递给的BatchSpanProcessor是跨度的快照。 https://github.com/open-telemetry/opentelemetry-go/blob/002c0a4c0352a56ebebc13f3ec20f73c23b348f6/sdk/trace/span.go#L432

除了实现方式之外,如何分配快照才能Span更加高效?

We can also use pools for snapshots as well. Recycle the snapshots in the ExportSpans function.

type CustomExport struct {
    *stdouttrace.Exporter
}

// ExportSpans implements trace.SpanExporter.
func (c *CustomExport) ExportSpans(ctx context.Context, spans []trace.ReadOnlySpan) error {
    if err := c.Exporter.ExportSpans(ctx, spans); err != nil {
        return err
    }
    for _, span := range spans {
        if s, ok := span.(*pool.CustomSpan); ok {
            pool.P.Put(s)
        }
    }
    return nil
}
tttoad commented 3 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

How do you account for the cases where you do not manage the span? For example third-party instrumentation?

I have not used third-party instrumentation. Pool of spans can be implemented via the generator when the user can manage spans. That's just an option...

MrAlias commented 3 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

How do you account for the cases where you do not manage the span? For example third-party instrumentation?

I have not used third-party instrumentation. Pool of spans can be implemented via the generator when the user can manage spans. That's just an option...

The specific use case you are mentioning describes one where that specific user should implement their own SDK. Having such a specific and potentially dangerous design in the general purpose SDK contained here would be a misstep.

MrAlias commented 3 weeks ago

@MrAlias保证 span 释放由用户实现。用户可以保证在 span.end() 之后不使用 span,他们可以注入 span 生成器来实现 span 池。本期只是想支持注入 span 生成器并提供默认 span 生成。

用户如何确保诸如BatchSpanProcessor不再持有对跨度的引用?

在函数中span.End(),传递给的BatchSpanProcessor是跨度的快照。 https://github.com/open-telemetry/opentelemetry-go/blob/002c0a4c0352a56ebebc13f3ec20f73c23b348f6/sdk/trace/span.go#L432

除了实现方式之外,如何分配快照才能Span更加高效?

We can also use pools for snapshots as well. Recycle the snapshots in the ExportSpans function.

type CustomExport struct {
  *stdouttrace.Exporter
}

// ExportSpans implements trace.SpanExporter.
func (c *CustomExport) ExportSpans(ctx context.Context, spans []trace.ReadOnlySpan) error {
  if err := c.Exporter.ExportSpans(ctx, spans); err != nil {
      return err
  }
  for _, span := range spans {
      if s, ok := span.(*pool.CustomSpan); ok {
          pool.P.Put(s)
      }
  }
  return nil
}

Again, then how do you know you are done with the reference to the the snapshot?

How can the SDK, who owns the pool, guarantee the members of the pool are not leaked when they are passed to other entities?

tttoad commented 2 weeks ago

@MrAlias Guaranteed span release is achieved by the user. The user can guarantee that span is not used after span.end(), they can inject the span generator to achieve pool of spans. This issue just wants to support injecting a span generator and provide default span generation.

How do you account for the cases where you do not manage the span? For example third-party instrumentation?

I have not used third-party instrumentation. Pool of spans can be implemented via the generator when the user can manage spans. That's just an option...

The specific use case you are mentioning describes one where that specific user should implement their own SDK. Having such a specific and potentially dangerous design in the general purpose SDK contained here would be a misstep.

What is the potential dangerousness here? Span of user implementation? I think the WithSpanGenerator option is safe. I have done a requirement to store upstream service name in each span for service-map generation . With the span generator, I can read the service name in the parent span when generating the span.