kumahq / kuma

🐻 The multi-zone service mesh for containers, Kubernetes and VMs. Built with Envoy. CNCF Sandbox Project.
https://kuma.io/install
Apache License 2.0

MeshTrace not working on deployment without kube service #12045

Open slavogiez opened 1 week ago

slavogiez commented 1 week ago

What happened?

We configured a MeshTrace policy with a Datadog backend. This generally works and we can see traces in Datadog, but there is one case where it does not.

We run a curl command from the source service container to another mesh service. We can see the traces from the destination sidecar, but not from the source sidecar.

It looks like tracing doesn't work when the source workload has no Kubernetes Service (no server listening).

We configured the following MeshTrace policy:

apiVersion: kuma.io/v1alpha1
kind: MeshTrace
metadata:
  labels:
    k8s.kuma.io/namespace: kong-mesh
    kuma.io/mesh: mesh01
    kuma.io/policy-role: system
  name: trace-default
  namespace: kong-mesh
spec:
  default:
    backends:
      - datadog:
          splitService: true
          url: http://trace-svc.datadog-agent.svc.cluster.local:8126
        type: Datadog
    sampling:
      client: 100
      overall: 100
      random: 100
    tags:
      - literal: xxx
        name: org_name
      - literal: yyy
        name: mesh_name
  targetRef:
    kind: Mesh

And we have the following Deployment:

apiVersion: "apps/v1"
kind: "Deployment"
metadata:
  name: curl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: curl
  template:
    metadata:
      annotations:
        kuma.io/sidecar-env-vars: "DD_ENV=dev;DD_SERVICE=curl-mesh-sidecar"
      labels:
        app: curl
        kuma.io/sidecar-injection: "enabled"
        kuma.io/mesh: mesh01
    spec:
      terminationGracePeriodSeconds: 5
      containers:
      - name: curl
        image: alpine/curl
        command: [ "/bin/sh", "-c", "--" ]
        args: [ "while true; do sleep 30; done;" ]
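
One way to test the hypothesis that the missing Service is the cause would be to add a minimal Service selecting these pods and check whether the source sidecar then emits spans. This is only a sketch for comparison, not a confirmed fix: the Service name and port 80 are assumptions, and nothing in the curl container actually listens on that port.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: curl            # hypothetical name, not part of the original setup
spec:
  selector:
    app: curl           # matches the pod label of the Deployment above
  ports:
    - name: http
      port: 80          # arbitrary port; no server listens on it in this pod
      targetPort: 80
```

If spans from the source sidecar show up only once this Service exists, that would support the idea that the tracing configuration is tied to the presence of a Service for the workload.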

In the xDS config of the source service's sidecar, we can see the Datadog extension and its cluster, but not the tracing configuration. In the manager UI, the MeshTrace policy is shown as applied to the data plane proxy.

slavogiez commented 2 days ago

Not sure what changed, but we can now see the expected spans from the source service. The only remaining issue is that they don't appear as part of the whole trace, so we're not able to find the source service in the trace.