open-telemetry / opentelemetry-operator

Kubernetes Operator for OpenTelemetry Collector
Apache License 2.0
1.13k stars 397 forks source link

Auto instrumentation python service - trace data doesn't arrive in Endpoint #924

Open tybalex opened 2 years ago

tybalex commented 2 years ago

I am trying out the auto-instrumentation feature of Open Telemetry with an example demo application. I tried to applied auto-instrumentation to the 2 python services recommendation and email, as well as 2 nodejs services payment and currency. However I was only able to find trace data from nodejs services in my endpoint (Opensearch), but not python services'.

My environment k3s-1.23 example demo application(https://github.com/GoogleCloudPlatform/microservices-demo) python 3.7 for the python services

manifest of the python service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: recommendationservice
spec:
  selector:
    matchLabels:
      app: recommendationservice
  template:
    metadata:
      labels:
        app: recommendationservice
      annotations:
        sidecar.opentelemetry.io/inject: "true"
        instrumentation.opentelemetry.io/inject-python: "true"
    spec:
      serviceAccountName: default
      terminationGracePeriodSeconds: 5
      containers:
      - name: server
        image: tybalex/recommendationservice:dev
        ports:
        - containerPort: 8080
        readinessProbe:
          periodSeconds: 5
          exec:
            command: ["/bin/grpc_health_probe", "-addr=:8080"]
        livenessProbe:
          periodSeconds: 5
          exec:
            command: ["/bin/grpc_health_probe", "-addr=:8080"]
        env:
        - name: PORT
          value: "8080"
        - name: PRODUCT_CATALOG_SERVICE_ADDR
          value: "productcatalogservice:3550"
        resources:
          requests:
            cpu: 100m
            memory: 220Mi
          limits:
            cpu: 200m
            memory: 450Mi
---
apiVersion: v1
kind: Service
metadata:
  name: recommendationservice
spec:
  type: ClusterIP
  selector:
    app: recommendationservice
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080

Example logs in the recommendation service pod:

{"timestamp": 1654892491.8406446, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['0PUK6V6EV0', '2ZYFJ3GM2N', '6E92ZMYYFZ', 'L9ECAV7KIM', '66VCHSJNUP']", "otelSpanID": "7fd7aae81b7615e0", "otelTraceID": "8dc9a11590672e91fda208974c4bc6e7", "otelServiceName": "recommendationservice"}
{"timestamp": 1654892491.8581243, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['6E92ZMYYFZ', '9SIQT8TOJO', '0PUK6V6EV0', 'L9ECAV7KIM', 'OLJCESPC7Z']", "otelSpanID": "87451d7577e6cfee", "otelTraceID": "33610c8624ecaab6f54b72d027f0013b", "otelServiceName": "recommendationservice"}
{"timestamp": 1654892492.5934675, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['L9ECAV7KIM', '1YMWWN1N4O', '2ZYFJ3GM2N', '6E92ZMYYFZ', '0PUK6V6EV0']", "otelSpanID": "e986f7291edb5335", "otelTraceID": "6ae0a6f318cb5e186d0de7984fca5c19", "otelServiceName": "recommendationservice"}
{"timestamp": 1654892492.6086466, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', '0PUK6V6EV0', '2ZYFJ3GM2N', 'OLJCESPC7Z', '66VCHSJNUP']", "otelSpanID": "875c0e8c9d238aa4", "otelTraceID": "165553d3bbcfd3017777d92ced997ad3", "otelServiceName": "recommendationservice"}

they do have Otel traceID and ServiceName assigned.

What is the expected behavior? trace data available in the Opensearch index otel-v1-apm-span*, as I do see the trace data from nodejs services.

What is the actual behavior? trace data from python services are missing.

additional information here's the instrumentation :

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://opentelemetry-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
pavolloffay commented 2 years ago

Could you please provide full logs from the python service - recommendation?

Could you please set sampler.argument to 1 it will ensure that all requests are sampled.

tybalex commented 2 years ago
Defaulted container "server" out of: server, opentelemetry-auto-instrumentation (init)
{"timestamp": 1655156183.17932, "severity": "INFO", "name": "recommendationservice-server", "message": "initializing recommendationservice", "otelSpanID": "0", "otelTraceID": "0", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156183.18063, "severity": "INFO", "name": "recommendationservice-server", "message": "product catalog address: productcatalogservice:3550", "otelSpanID": "0", "otelTraceID": "0", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156183.2722766, "severity": "INFO", "name": "recommendationservice-server", "message": "listening on port: 8080", "otelSpanID": "0", "otelTraceID": "0", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156206.1735613, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['OLJCESPC7Z', 'LS4PSXUNUM', '2ZYFJ3GM2N', '0PUK6V6EV0', '66VCHSJNUP']", "otelSpanID": "e8aeede92c87a906", "otelTraceID": "722f18d4050eb877a2c5cd4ca7a08c75", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156206.3777, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['0PUK6V6EV0', 'LS4PSXUNUM', 'OLJCESPC7Z', '2ZYFJ3GM2N', 'L9ECAV7KIM']", "otelSpanID": "ddfdde322896a4a8", "otelTraceID": "02bd2671ba1bc271b71620c14e2eecc0", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156206.73943, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', '66VCHSJNUP', '2ZYFJ3GM2N', 'LS4PSXUNUM', 'OLJCESPC7Z']", "otelSpanID": "e236d69515b74c0a", "otelTraceID": "8c894b1b232df41808bf7f31bb78899f", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156207.878156, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['66VCHSJNUP', 'L9ECAV7KIM', '0PUK6V6EV0', '1YMWWN1N4O', '6E92ZMYYFZ']", "otelSpanID": "ec13f5586e2290e4", "otelTraceID": "713de4294388f568b52a05acda180edf", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156208.01793, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['L9ECAV7KIM', '1YMWWN1N4O', '2ZYFJ3GM2N', '66VCHSJNUP', 'LS4PSXUNUM']", "otelSpanID": "6ddbc3d050dc2ae6", "otelTraceID": "6621692d9a14e9dd95365d58ad58bf38", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156208.069012, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['2ZYFJ3GM2N', '0PUK6V6EV0', '66VCHSJNUP', '9SIQT8TOJO', 'LS4PSXUNUM']", "otelSpanID": "5565814340d685b4", "otelTraceID": "6bfc88cccbac4079b29f8c9e39642938", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156208.0824213, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['L9ECAV7KIM', 'LS4PSXUNUM', 'OLJCESPC7Z', '6E92ZMYYFZ', '1YMWWN1N4O']", "otelSpanID": "5471451ad42b85ea", "otelTraceID": "93191608976dea6fbe9fa710dd0fedd8", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156208.9408882, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['LS4PSXUNUM', 'L9ECAV7KIM', '2ZYFJ3GM2N', '66VCHSJNUP', '6E92ZMYYFZ']", "otelSpanID": "3e8b5d923298e111", "otelTraceID": "13379705cacc8b1322e345698710a48a", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156209.3045213, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['OLJCESPC7Z', 'L9ECAV7KIM', '0PUK6V6EV0', 'LS4PSXUNUM', '1YMWWN1N4O']", "otelSpanID": "46b7538576413fe7", "otelTraceID": "f0312d506a761fc57d130019b86eb44a", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156209.4060457, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['OLJCESPC7Z', '66VCHSJNUP', '0PUK6V6EV0', '1YMWWN1N4O', 'L9ECAV7KIM']", "otelSpanID": "4bd56bdcdacf3769", "otelTraceID": "bfb7bd67c451da9a8a64dd18a567940c", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156210.417171, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['9SIQT8TOJO', 'LS4PSXUNUM', '6E92ZMYYFZ', '0PUK6V6EV0', 'OLJCESPC7Z']", "otelSpanID": "2bd01127c7f294b0", "otelTraceID": "1b81e317aaeae2a6f010b78d3c50c3d9", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156210.8531733, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', '0PUK6V6EV0', '2ZYFJ3GM2N', '66VCHSJNUP', 'L9ECAV7KIM']", "otelSpanID": "a793944b5adcf661", "otelTraceID": "aa84a8c0b01a9cf1f913ac5d90362bdc", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156210.9573407, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['9SIQT8TOJO', '6E92ZMYYFZ', 'OLJCESPC7Z', '0PUK6V6EV0', '1YMWWN1N4O']", "otelSpanID": "8b61ef86d8a22809", "otelTraceID": "47110768656edcd60cd702c596d4f354", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156211.98219, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', '66VCHSJNUP', '9SIQT8TOJO', 'LS4PSXUNUM', '6E92ZMYYFZ']", "otelSpanID": "192fff392f847ee0", "otelTraceID": "25e3bb5e7f4dd622fd851d63c1e45a01", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156213.0456135, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['66VCHSJNUP', '9SIQT8TOJO', '6E92ZMYYFZ', 'L9ECAV7KIM', '0PUK6V6EV0']", "otelSpanID": "11e291183c7c0f88", "otelTraceID": "15258e5b0ccaa28fe874028fa8bc44eb", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156213.3066819, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['0PUK6V6EV0', 'L9ECAV7KIM', '6E92ZMYYFZ', 'LS4PSXUNUM', '1YMWWN1N4O']", "otelSpanID": "9ff11e4ab30f4a99", "otelTraceID": "faff7cb0de133eca7f20e78175ccc1f2", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156213.5261924, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['OLJCESPC7Z', '66VCHSJNUP', '2ZYFJ3GM2N', 'LS4PSXUNUM', '0PUK6V6EV0']", "otelSpanID": "50a22142d32b6343", "otelTraceID": "3dc944cfe08a121fe8a9d9e92f896a88", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156213.6419687, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['66VCHSJNUP', '2ZYFJ3GM2N', 'OLJCESPC7Z', 'LS4PSXUNUM', '6E92ZMYYFZ']", "otelSpanID": "b8624a24dc0d2a18", "otelTraceID": "65a0d1b4e006e5ddb835e1abb76c49c0", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156213.6419687, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['66VCHSJNUP', '2ZYFJ3GM2N', 'OLJCESPC7Z', 'LS4PSXUNUM', '6E92ZMYYFZ']", "otelSpanID": "b8624a24dc0d2a18", "otelTraceID": "65a0d1b4e006e5ddb835e1abb76c49c0", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156214.5730157, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', '6E92ZMYYFZ', 'L9ECAV7KIM', '9SIQT8TOJO', 'OLJCESPC7Z']", "otelSpanID": "fb90b9d3ae41547e", "otelTraceID": "012f4c7349ae8bb2f02f4e7b05725280", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156216.9405031, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['66VCHSJNUP', '9SIQT8TOJO', 'L9ECAV7KIM', 'LS4PSXUNUM', 'OLJCESPC7Z']", "otelSpanID": "49125048767c311b", "otelTraceID": "fa987eb238436c01dc00b914ffa23b76", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156217.0166621, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['LS4PSXUNUM', 'L9ECAV7KIM', '9SIQT8TOJO', '1YMWWN1N4O', 'OLJCESPC7Z']", "otelSpanID": "0283e0cbde05fec4", "otelTraceID": "04c66e248cac374805ddc71ecc1df7b1", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156217.4466836, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', 'LS4PSXUNUM', '66VCHSJNUP', 'OLJCESPC7Z', '0PUK6V6EV0']", "otelSpanID": "fb2d4bfd162a483a", "otelTraceID": "63440c9ba757284bf165c5471a8ec93d", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156218.088779, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['6E92ZMYYFZ', 'OLJCESPC7Z', '66VCHSJNUP', 'LS4PSXUNUM', '9SIQT8TOJO']", "otelSpanID": "fbf255032482bbb7", "otelTraceID": "20df35561a8215e7a4222c44fc250ae4", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156218.9358883, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['L9ECAV7KIM', 'LS4PSXUNUM', '66VCHSJNUP', '2ZYFJ3GM2N', 'OLJCESPC7Z']", "otelSpanID": "c20760663f0a75ea", "otelTraceID": "b8f49690a4e034c4ae0568d5f443f013", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156219.0534446, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', '66VCHSJNUP', 'LS4PSXUNUM', 'L9ECAV7KIM', '9SIQT8TOJO']", "otelSpanID": "9a00a0ed7fa813ee", "otelTraceID": "e0c1e7b2126ff3538de20a0608833eaf", "otelServiceName": "recommendationservice"}
{"timestamp": 1655156219.1580741, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['OLJCESPC7Z', 'LS4PSXUNUM', '6E92ZMYYFZ', 'L9ECAV7KIM', '2ZYFJ3GM2N']", "otelSpanID": "1c2f5d8896ce136d", "otelTraceID": "b1e88c71a221c446c384fd370b930d92", "otelServiceName": "recommendationservice"}

well there's only 1 format of log in this pod except this fine line. and there's no log for the init container opentelemetry-auto-instrumentation

pavolloffay commented 2 years ago

@tybalex did you try to set the sampling to 1 and then create some requests against the app?

After the setting is applied the pod needs to be restarted

tybalex commented 2 years ago

@tybalex did you try to set the sampling to 1 and then create some requests against the app?

After the setting is applied the pod needs to be restarted

Yes I did set the sampling to 1 in the instrumentation like this:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://opentelemetry-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "1"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest

and then i've re-deployed my services.

mat-rumian commented 2 years ago

@tybalex please set for Python instrumentation different exporter endpoint as Python instrumentation is using OTLP HTTP exporter by default. Please use eg. http://collector-hostname:4318. You can set it only for Python modifying your Instrumentation CR e.g.

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://opentelemetry-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "1"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    env:
      - name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
        value: http://opentelemetry-collector:4318/v1/traces
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
tybalex commented 2 years ago

This worked! thank you @mat-rumian . Just curious why python is treated differently from the other 2...

pavolloffay commented 2 years ago

we should take some action item here and fix the problem for all users.

This should be either documented or the operator can set env var OTEL_EXPORTER_OTLP_TRACES_ENDPOINT for python.

mat-rumian commented 2 years ago

@pavolloffay based on that https://github.com/open-telemetry/opentelemetry-specification/issues/1885#issuecomment-934435972 I think we should set our instrumentations exporters to OTLP HTTP protocol.

pavolloffay commented 2 years ago

From the linked issues it seems like there is no strong opinion what is the default exporter and SDKs might choose one or the other depending on circumstances.

To me it seems like that we should follow what the language SDKs do and choose the right default based on the language.

aabmass commented 1 year ago

To me it seems like that we should follow what the language SDKs do and choose the right default based on the language.

+1, @srikanthccv any thoughts on this? I'm happy to send a PR for this if we agree on the approach.

srikanthccv commented 1 year ago

We have been using OTLP/gRPC as a default exporter in the auto instrumentation package https://github.com/open-telemetry/opentelemetry-python-contrib/blob/51ba801bfda31c3d57902d9f9df938cee1236eb8/opentelemetry-distro/src/opentelemetry/distro/__init__.py#L37-L38. This was the choice of default partly because we didn't have HTTP exporters for OTLP. Even outside this operator deployment, I have seen people run into weird issues because gRPC sometimes doesn't work well with their env. Personally, I am in favor of using OTLP/HTTP + Protobuf. With that change people who already run gRPC will face issues because the env like endpoint needs to be updated.

aabmass commented 1 year ago

For this issue, I wonder if instead of overriding OTEL_EXPORTER_OTLP_TRACES_ENDPOINT to port 4318 in the operator, should the Python OTLP exporters set the default port based on whether the protocol is gRPC or HTTP?

See here: image

srikanthccv commented 1 year ago

Don't we do that already today? when the OTEL_TRACE_EXPORTER is set to gRPC it uses the default http://localhost:4317 for the endpoint and http://localhost:4318 for HTTP.

And the priority order OTEL_EXPORTER_OTLP_TRACES_ENDPOINT > OTEL_EXPORTER_OTLP_ENDPOINT

aabmass commented 1 year ago

You're right it needs to be fixed in the operator 👍 sorry for the confusion.

@pavolloffay @mat-rumian would you accept a PR setting the port to 4318 for Python?

TylerHelmuth commented 1 year ago

@aabmass what did you have in mind for setting the port to 4318? If the user set spec.exporter.endpoint to any value with 4317 in it would we replace it with 4318?