jaegertracing / jaeger-operator

Jaeger Operator for Kubernetes simplifies deploying and running Jaeger on Kubernetes.
https://www.jaegertracing.io/docs/latest/operator/
Apache License 2.0

Agent can't transfer data to controller!!! #1423

Open · PuppetA17 opened this issue 3 years ago

PuppetA17 commented 3 years ago

**Describe the bug**
The jaeger-agent reports the following error:

{"level":"info","ts":1617467900.6410925,"caller":"grpc@v1.29.1/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"error","ts":1617467917.929262,"caller":"grpc/reporter.go:74","msg":"Could not send spans over gRPC","error":"rpc error: code = Unavailable desc = last resolver error: produced zero addresses","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/reporter/grpc.(*Reporter).send\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/reporter/grpc/reporter.go:74\ngithub.com/jaegertracing/jaeger/cmd/agent/app/reporter/grpc.(*Reporter).EmitBatch\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/reporter/grpc/reporter.go:53\ngithub.com/jaegertracing/jaeger/cmd/agent/app/reporter.(*MetricsReporter).EmitBatch\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/reporter/metrics.go:85\ngithub.com/jaegertracing/jaeger/cmd/agent/app/reporter.(*ClientMetricsReporter).EmitBatch\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/reporter/client_metrics.go:121\ngithub.com/jaegertracing/jaeger/thrift-gen/agent.(*agentProcessorEmitBatch).Process\n\tgithub.com/jaegertracing/jaeger/thrift-gen/agent/agent.go:157\ngithub.com/jaegertracing/jaeger/thrift-gen/agent.(*AgentProcessor).Process\n\tgithub.com/jaegertracing/jaeger/thrift-gen/agent/agent.go:112\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:122\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}

**To Reproduce**
Steps to reproduce the behavior:

  1. Deploy Jaeger using the Jaeger Operator
  2. Create a pod running the application and inject the Jaeger agent sidecar
  3. Report the data through a Python script (a note on the client's default agent address follows this list):

    import logging
    import time
    from jaeger_client import Config

    if __name__ == "__main__":
        log_level = logging.DEBUG
        logging.getLogger('').handlers = []
        logging.basicConfig(format='%(asctime)s %(message)s', level=log_level)

        config = Config(
            config={
                'sampler': {
                    'type': 'const',
                    'param': 1,
                },
                'logging': True,
            },
            service_name='myapp',
            validate=True,
        )
        tracer = config.initialize_tracer()

        with tracer.start_span('TestSpan') as span:
            span.log_kv({'event': 'test message', 'life': 42})

            with tracer.start_span('ChildSpan', child_of=span) as child_span:
                child_span.log_kv({'event': 'down below'})

        time.sleep(2)  # give the reporter's IOLoop a chance to flush spans
        tracer.close()

  4. Check the logs of the jaeger-agent sidecar container inside the pod
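
Note that the script above never says where spans go: jaeger_client defaults to emitting them over UDP to an agent on localhost:6831, which is the jg-compact-trft port the injected agent container exposes. To spell that out explicitly (a minimal sketch; the local_agent values shown are just the library defaults made visible):

    config = Config(
        config={
            'sampler': {'type': 'const', 'param': 1},
            'local_agent': {
                'reporting_host': 'localhost',  # agent address (library default)
                'reporting_port': 6831,         # jaeger.thrift compact over UDP (library default)
            },
            'logging': True,
        },
        service_name='myapp',
        validate=True,
    )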

**Expected behavior**
After running the Python script, the client reports its spans to the jaeger-agent, which should then successfully forward them to the collector.

**Screenshots**
![image](https://user-images.githubusercontent.com/56722758/113485304-63540280-94df-11eb-8199-9c864d606903.png)

**Version (please complete the following information):**
 - OS: CentOS 7.4
 - Jaeger version: 1.22
 - Deployment: Kubernetes 1.16

PuppetA17 commented 3 years ago

The following is the jaeger-agent container's manifest YAML:

  - args:
    - --jaeger.tags=cluster=undefined,deployment.name=myapp,pod.namespace=observability,pod.name=${POD_NAME:},host.ip=${HOST_IP:},container.name=myapp
    - --reporter.grpc.host-port=dns:///jaeger-collector-headless.observability.svc:14250
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: HOST_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.hostIP
    image: jaegertracing/jaeger-agent:1.20.0
    imagePullPolicy: IfNotPresent
    name: jaeger-agent
    ports:
    - containerPort: 5775
      hostPort: 5775
      name: zk-compact-trft
      protocol: UDP
    - containerPort: 5778
      hostPort: 5778
      name: config-rest
      protocol: TCP
    - containerPort: 6831
      hostPort: 6831
      name: jg-compact-trft
      protocol: UDP
    - containerPort: 6832
      hostPort: 6832
      name: jg-binary-trft
      protocol: UDP
    - containerPort: 14271
      hostPort: 14271
      name: admin-http
      protocol: TCP
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-4rwmk
      readOnly: true
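
A note on the failing flag: the dns:/// prefix in --reporter.grpc.host-port makes the gRPC client resolve jaeger-collector-headless.observability.svc through DNS, and the "produced zero addresses" error in the agent log means that lookup came back empty. A quick sanity check, assuming the collector really lives in the observability namespace, is to confirm the headless service has endpoints at all:

    kubectl get endpoints jaeger-collector-headless -n observability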
jpkrohling commented 3 years ago

Could you give us the output of kubectl get services -n observability? Are you able to consistently reproduce it using recent versions of minikube?

hwanghoward commented 3 years ago

My k8s version is 1.20.4 and my Jaeger version is v1.22.0. When I used the Jaeger Operator to create the agent with the strategy set to "DaemonSet" and "hostNetwork: true", I hit the same problem. I checked /etc/resolv.conf and found that the agent pod was using the node's resolv.conf. So I changed the dnsPolicy of the agent DaemonSet from "ClusterFirst" to "ClusterFirstWithHostNet", and the problem was resolved. A sketch of the relevant DaemonSet fragment follows.
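
For anyone else hitting this, here is the shape of the fix (the name, labels, and image tag below are illustrative; the fix itself is the dnsPolicy line):

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: jaeger-agent          # hypothetical name
      namespace: observability
    spec:
      selector:
        matchLabels:
          app: jaeger-agent
      template:
        metadata:
          labels:
            app: jaeger-agent
        spec:
          hostNetwork: true
          # With hostNetwork: true, the default ClusterFirst policy leaves the pod
          # with the node's /etc/resolv.conf, so *.svc names don't resolve and the
          # gRPC DNS resolver produces zero addresses.
          # ClusterFirstWithHostNet restores in-cluster DNS resolution.
          dnsPolicy: ClusterFirstWithHostNet
          containers:
          - name: jaeger-agent
            image: jaegertracing/jaeger-agent:1.22.0
            args:
            - --reporter.grpc.host-port=dns:///jaeger-collector-headless.observability.svc:14250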