open-telemetry / opentelemetry-python-contrib

OpenTelemetry instrumentation for Python modules
https://opentelemetry.io
Apache License 2.0

HTTPX instrumentation - document how to log request payload and response on response_hook #2556

Open staticdev opened 4 months ago

staticdev commented 4 months ago

Is your feature request related to a problem? After reading the httpx instrumentation docs, it is clear that I need a response_hook and how to register one. But the docs have no example of how to actually use the hook's span, request, and response arguments to log the request and response payloads.

After digging into __init__() I see that both ResponseInfo and RequestInfo have a stream property that could potentially be the solution, but the two have different types:

class RequestInfo(typing.NamedTuple):
    ...
    stream: typing.Optional[
        typing.Union[httpx.SyncByteStream, httpx.AsyncByteStream]
    ]

class ResponseInfo(typing.NamedTuple):
    ...
    stream: typing.Iterable[bytes]

In both cases, there is no obvious, easy way to log the payload.

Describe the solution you'd like: a simple response_hook example in the docs that logs the request and response payloads.

rok commented 2 months ago

We'd like to claim this.

rok commented 2 months ago

@staticdev it seems the issue here is that the response stream returned by httpx is an iterator: if the response hook reads it, it will be exhausted before your application gets to read it (see the comment in response_hook below). However, as suggested here and here, you could create a LogResponse class that prints each chunk as your application consumes it, so you get at least some logging.

import httpx
from opentelemetry.instrumentation.httpx import SyncOpenTelemetryTransport, AsyncOpenTelemetryTransport

class LogResponse(httpx.Response):
    def iter_bytes(self, *args, **kwargs):
        for chunk in super().iter_bytes(*args, **kwargs):
            print(chunk)
            yield chunk

class LogTransport(httpx.BaseTransport):
    def __init__(self, transport: httpx.BaseTransport):
        self.transport = transport

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        response = self.transport.handle_request(request)

        return LogResponse(
            status_code=response.status_code,
            headers=response.headers,
            stream=response.stream,
            extensions=response.extensions,
        )

def response_hook(span, request, response):
    status_code, headers, stream, extensions = response
    span.set_attribute("http.status_code", status_code)

    # Reading the stream here would attach the body to the span, but the
    # application could then no longer read the body, because the iterator
    # would already be exhausted.
    # r = b"".join(stream)
    # print(r)
    # span.set_attribute("http.response.body", r)

# Chain the transports: HTTPTransport -> LogTransport -> OpenTelemetry.
transport = LogTransport(httpx.HTTPTransport())
telemetry_transport = SyncOpenTelemetryTransport(transport, response_hook=response_hook)

url = "https://google.com/"

with httpx.Client(transport=telemetry_transport) as client:
    response = client.get(url)
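
To illustrate why the commented-out lines in response_hook are a problem, here is a minimal sketch that does not use httpx at all. A plain generator stands in for the response stream, and `fake_stream` and `log_chunks` are hypothetical names used only for illustration. A real stream may behave differently on a second read (for example by raising instead of yielding nothing), but either way the body is gone.

```python
from typing import Iterable, Iterator, List


def fake_stream() -> Iterator[bytes]:
    """Stand-in for a one-shot response stream (hypothetical, not httpx)."""
    yield b"hello "
    yield b"world"


def log_chunks(stream: Iterable[bytes], seen: List[bytes]) -> Iterator[bytes]:
    """Record each chunk as the consumer pulls it, then pass it through."""
    for chunk in stream:
        seen.append(chunk)
        yield chunk


# Reading the stream in a hook drains it, so a later reader gets nothing.
stream = fake_stream()
body_in_hook = b"".join(stream)
body_in_app = b"".join(stream)

# Wrapping instead of reading leaves the body intact for the application.
seen: List[bytes] = []
wrapped_body = b"".join(log_chunks(fake_stream(), seen))
```

This is the same idea as LogResponse.iter_bytes above: observe the chunks while the application consumes them, rather than consuming them first.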

Perhaps there is a way to also pass the span to LogResponse and collect the stream chunks there, but I don't know enough about OpenTelemetry internals at this point to say whether that is feasible and/or advisable :).
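
As a rough sketch of what collecting chunks on a span might look like: the wrapper below passes chunks through unchanged and writes the collected body to a span-like object once the stream is fully drained. `SpanCollectingStream`, `FakeSpan`, and `max_bytes` are hypothetical names, and the `http.response.body` attribute name is simply taken from the commented-out line above. None of this is part of the instrumentation's API. The only span method assumed is `set_attribute`. One caveat that would need checking: the body is usually consumed after the transport has returned, so the real span may already have ended by then, in which case the attribute would likely be dropped.

```python
from typing import Iterable, Iterator, List


class SpanCollectingStream:
    """Hypothetical wrapper: pass chunks through and copy them to a span."""

    def __init__(self, stream: Iterable[bytes], span, max_bytes: int = 1024):
        self._stream = stream
        self._span = span  # assumption: any object with set_attribute(key, value)
        self._max_bytes = max_bytes  # cap so large bodies do not bloat the span

    def __iter__(self) -> Iterator[bytes]:
        collected: List[bytes] = []
        size = 0
        for chunk in self._stream:
            if size < self._max_bytes:
                # Keep only as much of the chunk as still fits under the cap.
                kept = chunk[: self._max_bytes - size]
                collected.append(kept)
                size += len(kept)
            yield chunk
        # Reached only when the consumer drains the stream completely.
        body = b"".join(collected).decode("utf-8", errors="replace")
        self._span.set_attribute("http.response.body", body)


class FakeSpan:
    """Test double standing in for an OpenTelemetry span."""

    def __init__(self) -> None:
        self.attributes = {}

    def set_attribute(self, key, value) -> None:
        self.attributes[key] = value


span = FakeSpan()
body = b"".join(SpanCollectingStream(iter([b"hello ", b"world"]), span))

# With a small cap, only the first max_bytes bytes are recorded on the span.
capped_span = FakeSpan()
capped_body = b"".join(SpanCollectingStream([b"abcdef", b"gh"], capped_span, max_bytes=4))
```

The cap is a design choice rather than a requirement: span attributes are not meant to carry arbitrarily large payloads, so truncating seems safer than recording everything.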