open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0
5.28k stars 1.08k forks

OTLP log export duplicates logs across all present resources #5782

Closed vito closed 1 month ago

vito commented 2 months ago

Description

When I call Export with a []log.Record batch containing records from multiple different resources, the resulting OTLP payload includes every record under every resource, producing duplicates and incorrect data.

This seems to be a simple error in the code: this line passes the full set of records along to ScopeLogs:

https://github.com/open-telemetry/opentelemetry-go/blob/1b8f785c02521efbee5141b3dd8577b4893a1569/exporters/otlp/otlplog/otlploggrpc/internal/transform/log.go#L65

I suppose it should either send only the subset for the current resource, or check against the resource Equivalent() as it populates the per-scope batches.
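(Editor's note: a minimal, dependency-free sketch of the first option, grouping the batch by resource before building the per-resource payloads. The `record` struct and its string key are simplified stand-ins for `sdklog.Record` and a resource identity check such as `Resource.Equivalent()`; they are not the SDK's actual types.)

```go
package main

import "fmt"

// record is a simplified stand-in for sdklog.Record; the real fix would
// key on the record's resource (e.g. via Resource.Equivalent()) rather
// than a plain string.
type record struct {
	resourceKey string
	body        string
}

// groupByResource partitions a batch so that each resource's entry in
// the OTLP payload contains only its own records, instead of the full
// batch being copied into every ResourceLogs entry.
func groupByResource(batch []record) map[string][]record {
	groups := make(map[string][]record)
	for _, r := range batch {
		groups[r.resourceKey] = append(groups[r.resourceKey], r)
	}
	return groups
}

func main() {
	// The [A1, A2, B3, B4] example from the expected-behavior section.
	batch := []record{{"A", "1"}, {"A", "2"}, {"B", "3"}, {"B", "4"}}
	groups := groupByResource(batch)
	fmt.Println(len(groups["A"]), len(groups["B"])) // 2 2
}
```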

Environment

(I'm aware this is out of date but afaict the issue exists on master:)

        go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.52.0
        go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.52.0
        go.opentelemetry.io/otel v1.27.0
        go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.3.0
        go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.27.0 // indirect
        go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.27.0
        go.opentelemetry.io/otel/log v0.3.0
        go.opentelemetry.io/otel/sdk v1.27.0
        go.opentelemetry.io/otel/sdk/log v0.3.0
        go.opentelemetry.io/otel/sdk/metric v1.27.0 // indirect
        go.opentelemetry.io/otel/trace v1.27.0
        go.opentelemetry.io/proto/otlp v1.3.1
        go.opentelemetry.io/contrib/instrumentation/net/http/httptrace/otelhttptrace v0.52.0 // indirect
        go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.0.0-20240524160931-7c5e64cccc16 // indirect
        go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.27.0 // indirect
        go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.27.0 // indirect
        go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.27.0 // indirect
        go.opentelemetry.io/otel/metric v1.27.0 // indirect

Steps To Reproduce

Sorry this is only a verbal description; I can put a code example together, but I figured the linked code would make the issue obvious!

Expected behavior

I would expect the OTLP payload to divide the records across the resources. Instead, all records are sent for all resources.

So [A1, A2, B3, B4] ends up as {A: [1, 2, 3, 4], B: [1, 2, 3, 4]} when it should instead be {A: [1, 2], B: [3, 4]}.

Frapschen commented 2 months ago

@vito Please provide a set of log records that reproduces this bug, thanks!

Frapschen commented 2 months ago

We can refer to https://github.com/open-telemetry/opentelemetry-go/blob/1b8f785c02521efbee5141b3dd8577b4893a1569/exporters/otlp/otlptrace/internal/tracetransform/span.go#L19 (the analogous span transform) to fix it; that function is more readable.

pree-dew commented 2 months ago

@pellared @Frapschen @vito Can I pick this issue if it's okay with you all?

pree-dew commented 2 months ago

Able to reproduce:

otel-collector      | ResourceLog #0
otel-collector      | Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
otel-collector      | Resource attributes:
otel-collector      |      -> service.name: Str(service3)
otel-collector      | ScopeLogs #0
otel-collector      | ScopeLogs SchemaURL:
otel-collector      | InstrumentationScope
otel-collector      | LogRecord #0
otel-collector      | ObservedTimestamp: 1970-01-01 00:00:00 +0000 UTC
otel-collector      | Timestamp: 2024-09-08 19:50:00.278274 +0000 UTC
otel-collector      | SeverityText:
otel-collector      | SeverityNumber: Unspecified(0)
otel-collector      | Body: Str(service3 body=====================)
otel-collector      | Attributes:
otel-collector      |      -> apple: Str()
otel-collector      | Trace ID:
otel-collector      | Span ID:
otel-collector      | Flags: 0
otel-collector      | LogRecord #1
otel-collector      | ObservedTimestamp: 1970-01-01 00:00:00 +0000 UTC
otel-collector      | Timestamp: 2024-09-08 19:50:00.278329 +0000 UTC
otel-collector      | SeverityText:
otel-collector      | SeverityNumber: Unspecified(0)
otel-collector      | Body: Str(service4 body=====================)
otel-collector      | Attributes:
otel-collector      |      -> apple: Str()
otel-collector      | Trace ID:
otel-collector      | Span ID:
otel-collector      | Flags: 0
otel-collector      | ResourceLog #1
otel-collector      | Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
otel-collector      | Resource attributes:
otel-collector      |      -> service.name: Str(service4)
otel-collector      | ScopeLogs #0
otel-collector      | ScopeLogs SchemaURL:
otel-collector      | InstrumentationScope
otel-collector      | LogRecord #0
otel-collector      | ObservedTimestamp: 1970-01-01 00:00:00 +0000 UTC
otel-collector      | Timestamp: 2024-09-08 19:50:00.278274 +0000 UTC
otel-collector      | SeverityText:
otel-collector      | SeverityNumber: Unspecified(0)
otel-collector      | Body: Str(service3 body=====================)
otel-collector      | Attributes:
otel-collector      |      -> apple: Str()
otel-collector      | Trace ID:
otel-collector      | Span ID:
otel-collector      | Flags: 0
otel-collector      | LogRecord #1
otel-collector      | ObservedTimestamp: 1970-01-01 00:00:00 +0000 UTC
otel-collector      | Timestamp: 2024-09-08 19:50:00.278329 +0000 UTC
otel-collector      | SeverityText:
otel-collector      | SeverityNumber: Unspecified(0)
otel-collector      | Body: Str(service4 body=====================)
otel-collector      | Attributes:
otel-collector      |      -> apple: Str()

Here, service3's resource also shows the record from service4, and vice versa.

Will start work on this now.

pellared commented 1 month ago

I have a question regarding the steps to reproduce from the issue description:

Call Export with a []log.Record with a heterogeneous set of resources.

Is there a way to configure the OTel Go Logs SDK so that such a call is possible? A logger provider is associated with a single resource instance.

@pree-dew, can you share the code snippet that reproduces the issue?

pree-dew commented 1 month ago

@pellared I reproduced it by implementing a small handler whose only job is to export logs:

func exportLogsHandler(w http.ResponseWriter, r *http.Request) {
    var out []sdklog.Record

    // Create and configure the first log record
    r1 := sdklog.Record{}
    r1.SetTimestamp(time.Now())
    r1.SetAttributes(log.KeyValue{Key: "apple", Value: log.StringValue("service3")})
    r1.SetBody(log.StringValue("service3 body====================="))
    setResource(&r1, "resource", resource.NewWithAttributes(
        semconv.SchemaURL,
        attribute.KeyValue{
            Key:   "service.name",
            Value: attribute.StringValue("service3"),
        },
    ))

    out = append(out, r1)

    // Create and configure the second log record
    r2 := sdklog.Record{}
    r2.SetTimestamp(time.Now())
    r2.SetAttributes(log.KeyValue{Key: "apple", Value: log.StringValue("service4")})
    r2.SetBody(log.StringValue("service4 body====================="))
    setResource(&r2, "resource", resource.NewWithAttributes(
        semconv.SchemaURL,
        attribute.KeyValue{
            Key:   "service.name",
            Value: attribute.StringValue("service4"),
        },
    ))
    out = append(out, r2)

    // Export the logs
    if err := otelExp.Export(r.Context(), out); err != nil {
        http.Error(w, fmt.Sprintf("Failed to export logs: %v", err), http.StatusInternalServerError)
        return
    }

    w.WriteHeader(http.StatusOK)
    w.Write([]byte("Logs exported successfully"))
}

Let me know if I am missing something here.

pellared commented 1 month ago

@pree-dew, users are expected to emit logs using this API: https://pkg.go.dev/go.opentelemetry.io/otel/log (or a bridge using it).

The SDK is not designed for users to create https://pkg.go.dev/go.opentelemetry.io/otel/sdk/log#Record values themselves. The record is created by the Logger returned by https://pkg.go.dev/go.opentelemetry.io/otel/sdk/log#LoggerProvider.Logger.

EDIT: I created https://github.com/open-telemetry/opentelemetry-go/issues/5823

pree-dew commented 1 month ago

@pellared My assumptions might be incorrect in that case, but I assumed that libraries or tools that just want to export OTel-compatible logs, when the actual logs arrive in a different format from the base library, might be using it this way. If that is not the case, let me know how I can help document this behaviour, as it will help others.

pree-dew commented 1 month ago

@pellared, @vito added a reference in the issue where they use Record directly (https://github.com/dagger/dagger/pull/8390); it needs correcting there as well if this usage is not preferred.

pellared commented 1 month ago

I assumed that there might be libraries or tools that just want to export otel compatible logs when the actual logs are received in different format from the base library

Side note: your snippet is missing the setResource implementation. The resource can only be set while constructing the LoggerProvider. However, there is a "helper" that you can use for testing: https://pkg.go.dev/go.opentelemetry.io/otel/sdk/log/logtest

If that is not the case, then let me know how can I help in documenting this behaviour as it will help others.

Sure. You can help by fixing https://github.com/open-telemetry/opentelemetry-go/issues/5823.

pree-dew commented 1 month ago

@pellared Are the issue and PR going to be closed now?

XSAM commented 1 month ago

@pellared I know this is a rare case, but it seems valid to have two providers with different resources and a shared processor.

processor := log.NewBatchProcessor(logExporter)

loggerProvider1 := log.NewLoggerProvider(
    log.WithResource(res1),
    log.WithProcessor(processor),
)

loggerProvider2 := log.NewLoggerProvider(
    log.WithResource(res2),
    log.WithProcessor(processor),
)

With this, users can push records with different resources to the same exporter.

If so, I think it is still a bug.

pellared commented 1 month ago

I agree. Reopening