open-telemetry / opentelemetry-python

OpenTelemetry Python API and SDK
https://opentelemetry.io
Apache License 2.0

Split serialization and export of spans into separate functions for http trace exporter #3826

Closed kchoudhu closed 3 months ago

kchoudhu commented 3 months ago

Description

Small refactor of OTLPSpanExporter that splits serialization of spans and their export into separate functions.

There is no functional change to the SpanExporter. I'm submitting this PR because it would make the export method of a subprocess-spawning OTLPExporter that I'm working on slightly less duplicative.
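
As a rough illustration of the shape of this split (not the actual diff), the sketch below uses a toy exporter. `SketchExporter`, `FakeResponse`, and the `send` callable are hypothetical stand-ins, and spans are plain strings rather than real span objects; only the helper names `_serialize_spans` and `_export_serialized_spans` mirror the ones discussed in this PR.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response."""
    status_code: int


class SketchExporter:
    """Toy exporter for illustration; not the real OTLPSpanExporter."""

    def __init__(self, send: Callable[[bytes], FakeResponse]):
        # `send` is a hypothetical transport hook that delivers bytes somewhere.
        self._send = send

    def _serialize_spans(self, spans: Sequence[str]) -> bytes:
        # Stand-in for protobuf encoding: join span names into one payload.
        return "\n".join(spans).encode("utf-8")

    def _export_serialized_spans(self, serialized_data: bytes) -> bool:
        # Transmission only; it does not care how the bytes were produced.
        resp = self._send(serialized_data)
        return resp.status_code in (200, 202)

    def export(self, spans: Sequence[str]) -> bool:
        # The public method is just the composition of the two steps.
        return self._export_serialized_spans(self._serialize_spans(spans))


# Usage: record what would have been sent instead of doing network I/O.
sent = []


def record(payload: bytes) -> FakeResponse:
    sent.append(payload)
    return FakeResponse(200)


exporter = SketchExporter(record)
ok = exporter.export(["span-a", "span-b"])
```

Because each step is its own method, a subclass can call either one in isolation, for example serializing in one process and transmitting in another.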

Type of change

This is a refactor that doesn't add or remove any functionality.

How Has This Been Tested?

Verified in a production instance that OTLPSpanExporter continues to work as expected, and ran the test suite.

linux-foundation-easycla[bot] commented 3 months ago

CLA Signed

The committers listed above are authorized under a signed CLA.

xrmx commented 3 months ago

@kchoudhu what are you going to reuse / replace?

kchoudhu commented 3 months ago

Hi: I'm writing a span processor that isolates the sending of spans in a child subprocess, while serialization happens in the parent process. Roughly speaking, the new export logic currently looks something like this:

def export(self, spans) -> SpanExportResult:
        """
        We are overriding the base OTLPSpanExporter in such a way
        that bifurcates the serialization of the data from its
        transmission to the OTEL collector.
        """

        # After the call to Shutdown subsequent calls to Export are
        # not allowed and should return a Failure result.
        if self._shutdown:
            return SpanExportResult.FAILURE

        if self.export_to_sidecar is True:
            # Open a socket and serialize the data into bytes
            client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                client.connect(self.socket)
            except FileNotFoundError:
                # The socket file does not exist yet: wait, then retry once.
                print("client: sleeping 2 seconds until telemetry exporter available")
                time.sleep(2)
                client.connect(self.socket)

            # Serialize and send spans
            serialized_data = encode_spans(spans).SerializeToString()

            header = struct.pack("!Q", len(serialized_data))
            message = header + serialized_data
            client.sendall(message)

            # Close the connection
            client.close()

            # If we've made it so far, great!
            return SpanExportResult.SUCCESS
        else:
            # We are in a subprocess, and the spans are already serialized.
            # Assumption: `spans` holds the serialized bytes in this branch.
            serialized_data = spans
            for delay in _create_exp_backoff_generator(
                max_value=self._MAX_RETRY_TIMEOUT
            ):

                if delay == self._MAX_RETRY_TIMEOUT:
                    return SpanExportResult.FAILURE

                resp = self._export(serialized_data)
                # pylint: disable=no-else-return
                if resp.status_code in (200, 202):
                    return SpanExportResult.SUCCESS
                elif self._retryable(resp):
                    # Back off before the next attempt.
                    time.sleep(delay)
                    continue
                else:
                    return SpanExportResult.FAILURE

            return super().export(spans)

With this tiny refactor, this is reduced to:

def export(self, spans) -> SpanExportResult:
        """
        We are overriding the base OTLPSpanExporter in such a way
        that bifurcates the serialization of the data from its
        transmission to the OTEL collector.
        """

        # After the call to Shutdown subsequent calls to Export are
        # not allowed and should return a Failure result.
        if self._shutdown:
            return SpanExportResult.FAILURE

        if self.export_to_sidecar is True:
            # Open a socket and serialize the data into bytes
            client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                client.connect(self.socket)
            except FileNotFoundError:
                # The socket file does not exist yet: wait, then retry once.
                print("client: sleeping 2 seconds until telemetry exporter available")
                time.sleep(2)
                client.connect(self.socket)

            # Serialize and send spans
            serialized_data = self._serialize_spans(spans)

            header = struct.pack("!Q", len(serialized_data))
            message = header + serialized_data
            client.sendall(message)

            # Close the connection
            client.close()

            # If we've made it so far, great!
            return SpanExportResult.SUCCESS
        else:
            # We are in a subprocess. Assumption: `spans` already holds
            # the serialized bytes in this branch.
            return self._export_serialized_spans(spans)

All of this is wildly rough right now, but if there is interest I am happy to contribute the subprocess-based sidecar exporter back to the commons.
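
For reference, the receiving side of the length-prefixed framing used above (`struct.pack("!Q", ...)` followed by the payload) could look roughly like the sketch below. This is a minimal sketch, not the actual sidecar: `HEADER`, `recv_exact`, and `read_frame` are hypothetical names, and a `socket.socketpair()` stands in for the real Unix-socket connection between parent and sidecar.

```python
import socket
import struct

# 8-byte big-endian unsigned length, matching the sender's "!Q" header.
HEADER = struct.Struct("!Q")


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, or raise if the peer closes early."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = conn.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full frame arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(conn: socket.socket) -> bytes:
    """Read one length-prefixed message: the header first, then the payload."""
    (length,) = HEADER.unpack(recv_exact(conn, HEADER.size))
    return recv_exact(conn, length)


# Usage: a connected socket pair stands in for the parent/sidecar link.
parent, sidecar = socket.socketpair()
payload = b"serialized-spans"
parent.sendall(HEADER.pack(len(payload)) + payload)
received = read_frame(sidecar)
parent.close()
sidecar.close()
```

Looping in `recv_exact` matters because a stream socket may return fewer bytes than requested, so a single `recv` call is not guaranteed to deliver a whole frame.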

srikanthccv commented 3 months ago

We don't provide any backwards compatibility guarantees for the internal methods. I hope you are aware of that.

kchoudhu commented 3 months ago

Yes, of course.

srikanthccv commented 3 months ago

We have a contrib repo where community-contributed components live: https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/exporter. The original contributor of a component remains the main person maintaining it. You could open an issue there with more detail and see how it goes.

kchoudhu commented 3 months ago

Thanks -- I'll go ahead and load test the current implementation for a bit in production before starting a conversation over there.