jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform
https://www.jaegertracing.io/
Apache License 2.0
Sampling configuration neglected #2197

Open rdepke opened 4 years ago

rdepke commented 4 years ago

Requirement - what kind of business use case are you trying to solve?

I want to configure sampling as described here: https://www.jaegertracing.io/docs/1.17/sampling/

Problem - what in Jaeger blocks you from solving the requirement?

The configuration is ignored, and I get no hint about what I'm doing wrong.

Proposal - what do you suggest to solve the problem or improve the existing situation?

Jaeger should output the currently used configuration on startup, e.g.

Info: using sampling configuration /path/user/provided/file.json
Warn: sampling configuration /path/user/provided/file.json malformed

Something along those lines so I can see what's going on.
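
To make the proposal concrete, here is a minimal sketch of the behaviour I have in mind. It is written in Python purely for illustration and is not Jaeger code; the helper name `load_sampling_config` and the logger name are made up, and the log messages are the ones suggested above.

```python
import json
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("sampling-config")


def load_sampling_config(path):
    """Load a sampling strategies file and log which outcome occurred.

    Returns the parsed configuration, or None if the file could not be
    read or parsed.
    """
    try:
        with open(path, encoding="utf-8") as f:
            config = json.load(f)
    except OSError as err:
        # File missing, unreadable, or not mounted where expected.
        logger.warning("sampling configuration %s not readable: %s", path, err)
        return None
    except ValueError as err:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueErrors.
        logger.warning("sampling configuration %s malformed: %s", path, err)
        return None
    logger.info("using sampling configuration %s", path)
    return config
```

With something like this, pointing the loader at `/sampling/sampling-strategies.json` would always leave one line in the log saying whether the file was picked up, missing, or malformed.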

Any open questions to address

I use Jaeger on OpenShift to trace Java Spring Boot applications, which works great. I'll put our deployment config here so you can check whether I'm doing something obviously wrong.

---
parameters:
- description: The Jaeger image version to use
  displayName: Image version
  name: IMAGE_VERSION
  required: true
  value: "1.17.1"

kind: Template
apiVersion: v1
labels:
  app: jaeger
metadata:
  name: jaeger
  annotations:
    description: Jaeger Distributed Tracing Server
objects:
- kind: DeploymentConfig
  apiVersion: v1
  metadata:
    name: jaeger-collector
  spec:
    strategy:
      type: Rolling
    template:
      metadata:
        labels:
          name: jaeger-collector
      spec:
        containers:
        - name: jaeger-collector
          image: jaegertracing/jaeger-collector:${IMAGE_VERSION}
          args:
            - '--config-file=/conf/collector.yaml'
            - '--sampling.strategies-file=/sampling/sampling-strategies.json'
          ports:
          - containerPort: 14250
            protocol: TCP
          - containerPort: 14267
            protocol: TCP
          - containerPort: 14268
            protocol: TCP
          - containerPort: 14269
            protocol: TCP
          - containerPort: 9411
            protocol: TCP
          resources:
            requests:
              cpu: "100m"
              memory: "128M"
            limits:
              cpu: "1000m"
              memory: "2G"
          livenessProbe:
            tcpSocket:
              port: 14269
          readinessProbe:
            httpGet:
              path: "/"
              port: 14269
          volumeMounts:
          - name: jaeger-config-volume
            mountPath: /conf
          - name: jaeger-sampling-volume
            mountPath: /sampling
          env:
          - name: SPAN_STORAGE_TYPE
            value: elasticsearch
        volumes:
          - name: jaeger-config-volume
            secret:
              secretName: jaeger-secret
              items:
                - key: collector
                  path: collector.yaml
          - name: jaeger-sampling-volume
            configMap:
              name: jaeger-sampling-config
              items:
                - key: sampling-strategies.json
                  path: sampling-strategies.json
    replicas: 1
    triggers:
    - type: "ConfigChange"
- kind: DeploymentConfig
  apiVersion: v1
  metadata:
    name: jaeger-query
  spec:
    strategy:
      type: Rolling
    template:
      metadata:
        labels:
          name: jaeger-query
      spec:
        containers:
        - name: jaeger-query
          image: jaegertracing/jaeger-query:${IMAGE_VERSION}
          args: ["--config-file=/conf/query.yaml"]
          ports:
          - containerPort: 16686
            protocol: TCP
          - containerPort: 16687
            protocol: TCP
          resources:
            requests:
              cpu: "100m"
              memory: "128M"
            limits:
              cpu: "1000m"
              memory: "2G"
          readinessProbe:
            httpGet:
              path: "/"
              port: 16687
          volumeMounts:
          - name: jaeger-config-volume
            mountPath: /conf
          env:
          - name: SPAN_STORAGE_TYPE
            value: elasticsearch
        - name: jaeger-agent
          image: jaegertracing/jaeger-agent:${IMAGE_VERSION}
          args: ["--reporter.tchannel.host-port=jaeger-collector:14267"]
          ports:
          - containerPort: 6831
            protocol: UDP
          resources:
            requests:
              cpu: "100m"
              memory: "128M"
            limits:
              cpu: "1000m"
              memory: "2G"
        volumes:
          - name: jaeger-config-volume
            secret:
              secretName: jaeger-secret
              items:
                - key: query
                  path: query.yaml
    replicas: 1
    triggers:
    - type: "ConfigChange"

- kind: Service
  apiVersion: v1
  metadata:
    name: jaeger-collector
  spec:
    ports:
    - name: jaeger-collector-grpc
      port: 14250
      protocol: TCP
      targetPort: 14250
    - name: jaeger-collector-tchannel
      port: 14267
      protocol: TCP
      targetPort: 14267
    - name: jaeger-collector-http
      port: 14268
      protocol: TCP
      targetPort: 14268
    - name: jaeger-collector-metrics
      port: 14269
      protocol: TCP
      targetPort: 14269
    - name: jaeger-collector-zipkin
      port: 9411
      protocol: TCP
      targetPort: 9411
    selector:
      name: jaeger-collector

- apiVersion: v1
  kind: Service
  metadata:
    name: jaeger-query
  spec:
    ports:
    - name: jaeger-query-http
      port: 80
      protocol: TCP
      targetPort: 16686
    - name: jaeger-query-metrics
      port: 16687
      protocol: TCP
      targetPort: 16687
    selector:
      name: jaeger-query

- apiVersion: v1
  kind: Route
  metadata:
    name: jaeger-query
  spec:
    host: my-jaeger.host.org
    to:
      kind: Service
      name: jaeger-query
    port:
      targetPort: jaeger-query-http
    tls:
      insecureEdgeTerminationPolicy: Redirect
      termination: edge

- apiVersion: v1
  kind: Route
  metadata:
    name: jaeger-collector
  spec:
    to:
      kind: Service
      name: jaeger-collector
    port:
      targetPort: jaeger-collector-http
    tls:
      insecureEdgeTerminationPolicy: Redirect
      termination: edge

and this is the sampling configuration

{
    "service_strategies": [
    {
      "service": "testapp",
      "type": "probabilistic",
      "param": 0.0,
      "operation_strategies": [
        {
          "operation": "ok",
          "type": "probabilistic",
          "param": 0.0
        },
        {
          "operation": "query",
          "type": "probabilistic",
          "param": 0.0
        }
      ]
    }],
  "default_strategy": {
    "type": "ratelimiting",
    "param": 150,
    "operation_strategies": [
      {
        "operation": "/health",
        "type": "probabilistic",
        "param": 0.0
      },
      {
        "operation": "/metrics",
        "type": "probabilistic",
        "param": 0.0
      }
    ]
  }
}

pavolloffay commented 4 years ago

@rdepke could you please paste here the sampling configuration file?

We could consider disallowing unknown fields when unmarshalling the strategies file: https://github.com/jaegertracing/jaeger/blob/b99114e62edc9be5dcc9d86e7a38f6030e02f98a/plugin/sampling/strategystore/static/strategy_store.go#L70
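
As a rough illustration of what a strict unknown-field check could catch, here is a small Python sketch. It is not the Jaeger implementation (in Go this would presumably mean using `json.Decoder.DisallowUnknownFields`). The allowed field names are an assumption taken from the sampling file posted in this issue, and `find_unknown_fields` is a hypothetical helper.

```python
# Assumed schema: field names copied from the sampling file in this issue.
# The real schema may accept more fields.
TOP_LEVEL_KEYS = {"service_strategies", "default_strategy"}
SERVICE_KEYS = {"service", "type", "param", "operation_strategies"}
DEFAULT_KEYS = {"type", "param", "operation_strategies"}
OPERATION_KEYS = {"operation", "type", "param"}


def find_unknown_fields(config):
    """Return the paths of all fields that are not in the assumed schema."""
    unknown = []

    def check(obj, allowed, path):
        # Record every key of obj that is not explicitly allowed.
        for key in obj:
            if key not in allowed:
                unknown.append(f"{path}.{key}" if path else key)

    def check_strategy(strategy, allowed, path):
        check(strategy, allowed, path)
        for i, op in enumerate(strategy.get("operation_strategies", [])):
            check(op, OPERATION_KEYS, f"{path}.operation_strategies[{i}]")

    check(config, TOP_LEVEL_KEYS, "")
    for i, strategy in enumerate(config.get("service_strategies", [])):
        check_strategy(strategy, SERVICE_KEYS, f"service_strategies[{i}]")
    if "default_strategy" in config:
        check_strategy(config["default_strategy"], DEFAULT_KEYS, "default_strategy")
    return unknown
```

A misspelled key such as `service_strategy` would then be reported by name instead of being dropped silently, which is the kind of hint the issue asks for.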

rdepke commented 4 years ago

Sure @pavolloffay, this is my sampling configuration, which is very close to the example in the documentation.

{
    "service_strategies": [
    {
      "service": "testapp",
      "type": "probabilistic",
      "param": 0.0,
      "operation_strategies": [
        {
          "operation": "ok",
          "type": "probabilistic",
          "param": 0.0
        },
        {
          "operation": "query",
          "type": "probabilistic",
          "param": 0.0
        }
      ]
    }],
  "default_strategy": {
    "type": "ratelimiting",
    "param": 150,
    "operation_strategies": [
      {
        "operation": "/health",
        "type": "probabilistic",
        "param": 0.0
      },
      {
        "operation": "/metrics",
        "type": "probabilistic",
        "param": 0.0
      }
    ]
  }
}

I've added this to the original post as well.

davidbartos commented 3 years ago

Hi guys,

Is there any progress or a solution for this issue? I am experiencing the same problem as @rdepke: it looks like the strategies in the sampling config are not used.

Thanks