open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0
1.95k stars 857 forks source link

Document log4j appender layout and MDC usage #5162

Closed andrewshawcare closed 2 years ago

andrewshawcare commented 2 years ago

Is your feature request related to a problem? Please describe. Logging using the log4j appender works, but doesn't seem to accept a PatternLayout which means it's not clear how to inject the trace ID or span ID.

Describe the solution you'd like A document describing how to ensure trace ID and span ID are added to the log records before being sent to loki (or another logging destination).

Describe alternatives you've considered Using promtail and loki (using a file appender).

Additional context My log4j2.xml file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout>
                <pattern>TraceID="%mdc{trace_id}" SpanID="%mdc{span_id}" %message%n</pattern>
            </PatternLayout>
        </Console>
        <OpenTelemetry name="OpenTelemetry">
            <PatternLayout>
                <pattern>TraceID="%mdc{trace_id}" SpanID="%mdc{span_id}" %message%n</pattern>
            </PatternLayout>
        </OpenTelemetry>
    </Appenders>
    <Loggers>
        <Root level="All">
            <AppenderRef ref="Console"/>
            <AppenderRef ref="OpenTelemetry"/>
        </Root>
    </Loggers>
</Configuration>
trask commented 2 years ago

hi @andrewshawcare! just confirming, you have configured the OTEL_LOGS_EXPORTER and are sending log records successfully to your backend, but those log records are missing traceid/spanid?

andrewshawcare commented 2 years ago

That’s correct. It transmits the log message but does not seem to use the enclosed layout configuration that adds the trace ID and span ID.

trask commented 2 years ago

does not seem to use the enclosed layout configuration

this is correct, but the appender should still be attaching the traceid/spanid in the standard location on the LogRecord that is transmitted over OTLP

andrewshawcare commented 2 years ago

Hmm, what do you mean by, "standard location?" I was under the impression that you need to indicate somehow with log4j that you want the MDC mapped to the log records.

For what it's worth, I also found this option... is this relevant to adding the MDC?

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/log4j/log4j-appender-2.16/library/src/main/java/io/opentelemetry/instrumentation/log4j/appender/v2_16/internal/LogEventMapper.java#L57

trask commented 2 years ago

what do you mean by, "standard location?"

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md#trace-context-fields

it's this line that captures the traceid/spanid:

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/4f29770f73c6225b6698be88bbcb9ad5c832fa01/instrumentation/log4j/log4j-appender-2.16/library/src/main/java/io/opentelemetry/instrumentation/log4j/appender/v2_16/internal/LogEventMapper.java#L111

if you use the stdout logging exporter:

OTEL_LOGS_EXPORTER=logging

you should be able to see the traceid/spanid as part of the captured log record

andrewshawcare commented 2 years ago

Hmm, that has an odd effect. The log records I get in the console with (what I had):

OTEL_LOGS_EXPORTER=otlp

are:

TraceID="499245b02a34ac8c3df0509c62cabef5" SpanID="5fe17d25164df6f9" eventType=depthMeasurement depth=5106 distance=1736 time=1642626183043

with OpenTelemetry collector debug output of:

LogRecord #450
Timestamp: 2022-01-19 21:07:07.631145279 +0000 UTC
Severity: INFO
ShortName: 
Body: eventType=depthMeasurement depth=5875 distance=1986 time=1642626413188
Trace ID: 944df3bbd95d8d40204ecb9426262034
Span ID: d1709fbe205c6de6
Flags: 1

When I change that to:

OTEL_LOGS_EXPORTER=logging

I get this output in the console (note this is over two lines):

2022-01-19T21:11:01.204Z INFO 'eventType=depthMeasurement depth=5878 distance=1998 time=1642626658680' : d98b685badb586ab2208c17edb38d256 3a29377eeb4a8877 [libraryInfo: com.andrewshawcare.advent_of_code._2021._1.DepthMeasurement:] {}
TraceID="d98b685badb586ab2208c17edb38d256" SpanID="3a29377eeb4a8877" eventType=depthMeasurement depth=5878 distance=1998 time=1642626658680

with no logging from (and presumably no export to) the OpenTelemetry collector.

andrewshawcare commented 2 years ago

From the debug output, it does look like the Trace ID and Span ID are in the log record, but I'm not sure how to actually query those (as labels or fields) since it looks like grafana can't find the Trace ID or Span ID to query on.

Any thoughts?

andrewshawcare commented 2 years ago

This is what grafana gives me (with promtail I had the Trace ID and Span ID as fields I could correlate with tempo and query on from tempo):

Screen Shot 2022-01-19 at 4 15 04 PM
andrewshawcare commented 2 years ago

Do I need to set this property?

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/log4j/log4j-appender-2.16/library/src/main/java/io/opentelemetry/instrumentation/log4j/appender/v2_16/internal/LogEventMapper.java#L57

It looks like I need to set that to * or ["TraceID", "SpanID"] somehow so that they are included in the context (and ultimately used in the line of source code you referenced)?

andrewshawcare commented 2 years ago

Nevermind, that doesn't have any result on what ends up in grafana.

andrewshawcare commented 2 years ago

So, I guess my question is... how do I get the trace ID and span ID from the "standard location" to a place where grafana can query them (e.g. with the file appender, the pattern layout applied them to the log record)?

trask commented 2 years ago

I think this depends on the backend ingestion service(s) you are using

andrewshawcare commented 2 years ago

Java application => OpenTelemetry Collector => Loki => Grafana

andrewshawcare commented 2 years ago

It appears the collector is aware of the trace ID and span ID, just unsure how to map that to the export so Loki is aware of them, too.

trask commented 2 years ago

it doesn't sound like this is an issue with the Java appender, since you can see the traceid/spanid going out in the LogRecord?

andrewshawcare commented 2 years ago

I suppose not, and can close this issue. Just odd that there seems to be a static layout to the logs and not sure how to get them into any downstream systems.

Is there going to be support for layouts in the future so applications can own the format and content of the log records?

trask commented 2 years ago

I believe there are two expected routes for logs:

andrewshawcare commented 2 years ago

Right, doing both of those (how the trace/span IDs are in the Console and how the OTLP exporter sends them to the collector via the appender).

The problem I have is: How do I get them into a grafana query? I have no real reference to them anymore, as they are no longer:

Is this not a popular use case? That was why my original suggestion was an example.

It feels like, most of the time, you'd want to get the logs into a place like Grafana with Trace ID intact. Not sure how getting it into the context that the collector can see helps unless there's some way to map that to the downstream log record.

Does that make sense? Should I open a ticket or look for some docs in the collector repo?

trask commented 2 years ago

ya, it sounds like a question for the component you are using to get logs from the collector into grafana

andrewshawcare commented 2 years ago

Thanks for your help!

andrewshawcare commented 2 years ago

@trask FYI Just updated the collector docker image to v0.42.0 and it's working now!

andrewshawcare commented 2 years ago
Screen Shot 2022-01-21 at 5 48 35 PM