open-telemetry / opentelemetry-python

OpenTelemetry Python API and SDK
https://opentelemetry.io
Apache License 2.0

Service name not changing automatic instrumentation django #3409

Open amirform opened 1 year ago

amirform commented 1 year ago

Describe your environment: Hey everyone, I'm working with Python 3.11, Django 4.2.1 and instrumentation 0.40b0. I'm running my Django app locally, and the OTel collector (through Helm), Data Prepper (YAML files) and Amazon OpenSearch (through Helm) on k8s. I want to export traces from the Django app to the OTel collector (and onward) using automatic instrumentation, so that when a different service is used (like MySQL) the output shows the Django app starting the first span and then the other service (MySQL) being called. I have a simple Django app that uses MySQL to fetch data. The traceparent id gets caught by the instrumentation and 3 spans show up in the trace log in OpenSearch, but the service name is the same on all of them.

Steps to reproduce: Run the Django app locally and port-forward the k8s OpenTelemetry collector service (or just run everything locally with Docker).

django-app:


# manage.py

#!/usr/bin/env python
import os
import sys
import mysql.connector
from opentelemetry.instrumentation.django import DjangoInstrumentor
from opentelemetry.instrumentation.mysql import MySQLInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

if __name__ == '__main__':
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

    DjangoInstrumentor().instrument(is_sql_commentor_enabled=True,)
    MySQLInstrumentor().instrument()
    RequestsInstrumentor().instrument()

    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        raise ImportError(
            "Couldn't import Django. Are you sure it's installed and "
            "available on your PYTHONPATH environment variable? Did you "
            "forget to activate a virtual environment?"
        ) from exc
    execute_from_command_line(sys.argv)
# views.py

from django.http import HttpResponse, HttpResponseRedirect
from opentelemetry import trace
from opentelemetry.instrumentation.dbapi import trace_integration

import MySQLdb
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    BatchSpanProcessor,
    ConsoleSpanExporter,
)
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource

from .models import <SOME_MODEL>

resource_django_app = Resource(attributes={
    "service.name": "django-app"
})

trace.set_tracer_provider(TracerProvider(resource=resource_django_app))
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint='http://localhost:4317'))
)
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(ConsoleSpanExporter())
)

trace_integration(MySQLdb, "connect", "mysql")

def index(request):
    return HttpResponseRedirect("/redirected")

def redirected(request):
    models = "\n".join([f'{model.name}' for model in <SOME_MODEL>.objects.all()])
    return HttpResponse(models)
# settings.py

# database section
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "<YOUR_DB_NAME>",
        "HOST": "127.0.0.1",
        "PORT": "3306",
        "USER": "<YOUR_USERNAME>",
        "PASSWORD": "<YOUR_PASSWORD>",
        "OPTIONS": {"charset": "utf8mb4"},
    }
}

opentelemetry collector config:

config:
  exporters:
    otlp/data-prepper:
      endpoint: <data-prepper-endpoint>:21890
      tls:
        insecure: true
  receivers:
    otlp:
      protocols:
        http:
          endpoint: ${env:MY_POD_IP}:4318
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
  service:
    pipelines:
      traces:
        receivers:
          - otlp
        exporters:
          - otlp/data-prepper
          - logging

What is the expected behavior? This image is from the OpenSearch documentation. My example should have had a mysql service show up in the chart, like the purple one here: [image]

What is the actual behavior? The db actions (SELECT, SET, etc.) are separated into different spans, but the service is the same: [trace-log1 screenshot]

These are the traces sent out of the django app. Clearly the service.name resource attribute that I hard-coded is propagating. Is there any way to make it change according to the service that runs the command?

{
    "name": "SELECT",
    "context": {
      "trace_id": "0x5b8aa5a2d2c872e8321cf37308d69df2",
      "span_id": "0x5fb397be34d26b51",
      "trace_state": "[]"
    },
    "kind": "SpanKing.CLIENT",
    "parent_id": "0x051581bf3cb55c13",
    "start_time": "2022-04-29T18:52:58.114304Z",
    "end_time": "2022-04-29T22:52:58.114561Z",
    "status": {
      "status_code": "UNSET"
    },
    "attributes": {
        "db.system": "mysql",
        "db.name": "",
        "db.statement": "SELECT `some-table`.`some-attribute`...",
        "db.user": "<YOUR USER>",
        "net.peer.name": "127.0.0.1",
        "net.peer.port": 3306,
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "django-app",
        },
        "schema_url": ""
    }
}
{
    "name": "GET redirected",
    "context": {
      "trace_id": "0x5b8aa5a2d2c872e8321cf37308d69df2",
      "span_id": "0x051581bf3cb55c13",
      "trace_state": "[]"
    },
    "kind": "SpanKing.CLIENT",
    "parent_id": null,
    "start_time": "2022-04-29T18:52:58.114304Z",
    "end_time": "2022-04-29T22:52:58.114561Z",
    "status": {
      "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.server_name": "kubernetes.docker.internal",
        "http.scheme": "http",
        "http.host.port": 8000,
        "http.host": "127.0.0.1:8000",
        "http.url": "http://127.0.0.1:8000/redirected",
        "net.peer.ip": "127.0.0.1",
        "http.user_agent": "chrome info",
        "http.flavor": "1.1",
        "http.route": "redirected",
        "http.status_code": 200,
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "django-app",
        },
        "schema_url": ""
    }
}

Thank you so much! Hope this is clear. If any more information is needed let me know.

lzchen commented 1 year ago

Resource, and by extension service.name, was designed so that you would use the same service name to track a single trace in your application. So a single TracerProvider with a single Resource is the intended usage. Is it possible for you to use a different indicator for the call instead of service.name?

If not, you can use a custom span processor to modify the service.name of the resource of the span data that is exported for the services that you want to differentiate from the main service.
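
For example, here is a minimal sketch of marking the downstream call with a span attribute instead of changing service.name. The peer.service attribute comes from the OTel semantic conventions for identifying the remote service being called; the view body and the attribute value are just placeholders for whatever fits your app:

from django.http import HttpResponse
from opentelemetry import trace

def redirected(request):
    # Tag the current (server) span with the service being called, so the
    # backend can group or filter by this attribute instead of service.name.
    trace.get_current_span().set_attribute("peer.service", "mysql")
    return HttpResponse("ok")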

lzchen commented 1 year ago

@amirform gentle ping on this

amirform commented 1 year ago

Thank you for replying @lzchen! I believe that the service.name indicator is the one that opensearch looks at to indicate different services, so I don't think I can change it. I want the calls to the db from my django app to register as a different service altogether. How was the example in the opensearch documentation configured so the sql is registered as a different service? Is the mysql server also instrumented by an opentelemetry package?

Also, I tried to test the connection between two django apps that communicate with each other and are each instrumented with the automatic django instrumentation (with a different service.name). Without a specific manual setting of the trace provider on each function, the service name stayed the same even though the django application was different.

The problem I'm seeing is that every request to a service that is outside the main django app is still considered the same service because it originated from the main django app.

What would you change in the span processor that could help with that?

lzchen commented 1 year ago

How was the example in the opensearch documentation configured so the sql is registered as a different service?

I am not sure if the examples in OpenSearch were populated by OpenTelemetry libraries.

Without a specific manual setting of the trace provider on each function

service.name is tied to a specific TracerProvider instance (1:1). If you use the same TracerProvider for different applications they will have the same service.name.
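
As a rough sketch (the service names and exporter endpoint here are just placeholders), each application process would create its own provider with its own Resource:

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# App A: its own provider and resource, so its spans carry service.name=app-a.
provider_a = TracerProvider(resource=Resource.create({"service.name": "app-a"}))
provider_a.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider_a)

# App B, running in its own process, would do the same with service.name=app-b.
# Within a single process there is only one global provider, which is why
# everything instrumented in that process reports the same service.name.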

What would you change in the span processor that could help with that?

from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.resources import Resource

class ServiceNameProcessor(BatchSpanProcessor):

    def on_start(self, span, parent_context):
        # Check which instrumentation produced the span and, for the calls you
        # want to show up as a separate service, swap in a resource with a
        # different service.name before the span gets exported.
        prev_resource = span._resource
        if span._instrumentation_scope.name == "opentelemetry.instrumentation.requests":
            new_resource = Resource.create({"service.name": "requests_service"})
            span._resource = prev_resource.merge(new_resource)

Something like this, where you conditionally check for the instrumentation that produced the span and then modify the service name on the span to be exported.
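
To pick it up, you would register it on your provider in place of the plain BatchSpanProcessor (the endpoint here is just whatever you already use in views.py):

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# ServiceNameProcessor subclasses BatchSpanProcessor, so it takes the exporter
# the same way and batches/exports spans as usual.
trace.get_tracer_provider().add_span_processor(
    ServiceNameProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)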

Keep in mind that this is extremely hacky and is not the intended usage of service.name. If you only want to enable this use case, the above code would work, but it may limit other potential distributed tracing features or functionality in the future.