cloudcafetech closed this issue 3 years ago
Hi! If I'm not mistaken, your ingress is only allowing traffic through port 3100. The Grafana Agent exports traces via gRPC using OTLP, whose default port is 55680. Can you try opening that port on the ingress too?
I am not much of an expert in ingress. Not sure I can set up multiple ports in a single ingress.
PLEASE HELP.🙏
You mean to say that if I use 55680 instead of 3100 in the ingress, it will work?
Not sure I can set up multiple ports in a single ingress
The ingress spec does not allow multiple backends on the same path. I don't think that's possible. You would need to set up different paths or use a load balancer, I guess (see the sketch below).
You mean to say that if I use 55680 instead of 3100 in the ingress, it will work?
I think that should work, yea. Although you won't be able to access the distributors on 3100.
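Regarding the "different paths" idea, a minimal sketch of what a two-path ingress could look like (the /otlp path is hypothetical; the host, service name, and ports follow the tempo-distributor example in this thread):

spec:
  rules:
  - host: grafanaclient.172.31.25.50.nip.io
    http:
      paths:
      - path: /otlp              # hypothetical path for OTLP gRPC traffic
        pathType: Prefix
        backend:
          service:
            name: tempo-distributor
            port:
              number: 55680
      - path: /                  # everything else keeps going to the HTTP API on 3100
        pathType: Prefix
        backend:
          service:
            name: tempo-distributor
            port:
              number: 3100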
:( not working .. same error ...
ts=2021-06-15T10:24:31Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: address grafanaclient.172.31.25.50.nip.io: missing port in address\""
ts=2021-06-15T10:24:41Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: address grafanaclient.172.31.25.50.nip.io: missing port in address\""
ts=2021-06-15T10:24:51Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: address grafanaclient.172.31.25.50.nip.io: missing port in address\""
oc get ing grafanaclient -n tracing -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
  creationTimestamp: "2021-06-15T10:07:24Z"
  generation: 1
  name: grafanaclient
  namespace: tracing
  resourceVersion: "1866"
  uid: 57b33094-b19c-4c73-a203-3b98bfd3e4c9
spec:
  rules:
  - host: grafanaclient.172.31.25.50.nip.io
    http:
      paths:
      - backend:
          service:
            name: tempo-distributor
            port:
              number: 55680
        path: /
        pathType: Prefix
status:
  loadBalancer:
    ingress:
    - hostname: localhost
Sorry, it was my bad. I did not see the entire error message.
Error while dialing dial tcp: address grafanaclient.172.31.25.50.nip.io: missing port in address
You need to add :55680 to the remote_write endpoint:
remote_write:
  - endpoint: tempo.172.31.14.138.nip.io:55680
Still, I think the ingress change was necessary. Let me know if it works now.
Tried both grafanaclient.172.31.25.50.nip.io:80 and grafanaclient.172.31.25.50.nip.io:55680, but neither works.
New error:
ts=2021-06-15T10:52:02Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.25.50:55680: i/o timeout\""
ts=2021-06-15T10:52:12Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.25.50:55680: i/o timeout\""
ts=2021-06-15T10:52:22Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.25.50:55680: i/o timeout\""
ts=2021-06-15T10:52:37Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:52:47Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:52:57Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:53:07Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:53:17Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
With the config below:
remote_write:
  - endpoint: grafanaclient.172.31.25.50.nip.io:80
    insecure: true
    #basic_auth:
    #password: ${TEMPO_PASSWORD}
    #username: ${TEMPO_USERNAME}
    retry_on_failure:
      enabled: false
I get the error below:
ts=2021-06-15T10:57:40Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-15T10:57:50Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-15T10:58:00Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-15T10:58:10Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-15T10:58:20Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
It should be pointing to grafanaclient.172.31.25.50.nip.io:55680. Can you make sure that grafanaclient.172.31.25.50.nip.io:55680 is reachable from one cluster to the other?
I don't think the issue is related to reachability, because it goes through the ingress, and the other ingresses (Cortex & Loki) work perfectly.
Secondly, if I specify port 55680 I get a timeout error; to me that port doesn't seem valid after the endpoint, because it's already handled by the service behind the ingress. But using port 80 I get a deadline-exceeded error.
Please check below:
ts=2021-06-15T10:52:02Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.25.50:55680: i/o timeout\""
ts=2021-06-15T10:52:12Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.25.50:55680: i/o timeout\""
ts=2021-06-15T10:52:22Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.25.50:55680: i/o timeout\""
ts=2021-06-15T10:52:37Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:52:47Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:52:57Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:53:07Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
ts=2021-06-15T10:53:17Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
I set up the ingress with gRPC enabled on port 443 (the ingress does not support gRPC on 80), then tested the following scenarios, but NO luck.
ingress endpoint with 55680 & insecure=true
ts=2021-06-16T01:06:29Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.27.7:55680: connect: connection refused\""
ts=2021-06-16T01:06:39Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.27.7:55680: connect: connection refused\""
ts=2021-06-16T01:06:49Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.27.7:55680: connect: connection refused\""
ts=2021-06-16T01:06:59Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.31.27.7:55680: connect: connection refused\""
ingress endpoint with 443 & insecure=false
ts=2021-06-16T01:09:24Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: x509: certificate is valid for ingress.local, not grafanaclient.172.31.27.7.nip.io\""
ingress endpoint with 443 & insecure=true
ts=2021-06-16T01:12:29Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-16T01:12:39Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-16T01:12:49Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-16T01:12:59Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
To me the issue is in the OTLP exporter code. Does it support any other port?
Have you tested my scenario (using an ingress)? Can you please try to reproduce it at your end?
It sounds like there have been a lot of changes to your config since the issue was opened.
Can you share the latest for all of the following:
Also note that the NGINX ingress controller does not enable HTTP/2 traffic by default. If you're using an OTLP receiver in Tempo, you'll need an extra annotation on your ingress to enable gRPC.
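For reference, a minimal sketch of that extra annotation for ingress-nginx (your ingress already uses kubernetes.io/ingress.class: nginx; other controllers use different annotations):

metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    # Proxy this backend over gRPC (HTTP/2) instead of HTTP/1.1
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"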
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tempo
---
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/instance: tempo
    app.kubernetes.io/name: tempo
  name: tempo
data:
  overrides.yaml: |
    overrides:
  tempo.yaml: |
    auth_enabled: false
    compactor:
      compaction:
        compacted_block_retention: 48h
    distributor:
      receivers:
        jaeger:
          protocols:
            thrift_compact:
              endpoint: 0.0.0.0:6831
            thrift_binary:
              endpoint: 0.0.0.0:6832
            thrift_http:
              endpoint: 0.0.0.0:14268
            grpc:
              endpoint: 0.0.0.0:14250
        zipkin:
          endpoint: 0.0.0.0:9411
        otlp:
          protocols:
            http:
              endpoint: 0.0.0.0:55681
            grpc:
              endpoint: 0.0.0.0:4317
        opencensus:
          endpoint: 0.0.0.0:55678
    ingester:
      {}
    server:
      http_listen_port: 3100
    storage:
      trace:
        backend: s3
        local:
          path: /var/tempo/traces
        s3:
          access_key: admin
          bucket: tracing
          endpoint: 172.31.44.216:9000
          insecure: true
          secret_key: admin2675
        wal:
          path: /var/tempo/wal
---
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/instance: tempo
    app.kubernetes.io/name: tempo
  name: tempo-query
data:
  tempo-query.yaml: |
    backend: tempo:3100
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/instance: tempo
    app.kubernetes.io/name: tempo
  name: tempo
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: tempo
      app.kubernetes.io/name: tempo
  serviceName: tempo-headless
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: tempo
        app.kubernetes.io/name: tempo
    spec:
      containers:
      - args:
        - -config.file=/conf/tempo.yaml
        - -mem-ballast-size-mbs=1024
        image: grafana/tempo:1.0.0
        imagePullPolicy: IfNotPresent
        name: tempo
        ports:
        - containerPort: 3100
          name: prom-metrics
          protocol: TCP
        - containerPort: 6831
          name: jaeger-thrift-c
          protocol: UDP
        - containerPort: 6832
          name: jaeger-thrift-b
          protocol: UDP
        - containerPort: 14268
          name: jaeger-thrift-h
          protocol: TCP
        - containerPort: 14250
          name: jaeger-grpc
          protocol: TCP
        - containerPort: 9411
          name: zipkin
          protocol: TCP
        - containerPort: 55680
          name: otlp-legacy
          protocol: TCP
        - containerPort: 4317
          name: otlp-grpc
          protocol: TCP
        - containerPort: 55681
          name: otlp-http
          protocol: TCP
        - containerPort: 55678
          name: opencensus
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /conf
          name: tempo-conf
      - args:
        - --query.base-path=/
        - --grpc-storage-plugin.configuration-file=/conf/tempo-query.yaml
        image: grafana/tempo-query:1.0.0
        imagePullPolicy: IfNotPresent
        name: tempo-query
        ports:
        - containerPort: 16686
          name: jaeger-ui
          protocol: TCP
        - containerPort: 16687
          name: jaeger-metrics
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /conf
          name: tempo-query-conf
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: tempo
      serviceAccountName: tempo
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: tempo-query
        name: tempo-query-conf
      - configMap:
          defaultMode: 420
          name: tempo
        name: tempo-conf
  updateStrategy:
    type: RollingUpdate
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/instance: tempo
    app.kubernetes.io/name: tempo
  name: tempo
spec:
  ports:
  - name: tempo-prom-metrics
    port: 3100
    protocol: TCP
    targetPort: 3100
  - name: tempo-query-jaeger-ui
    port: 16686
    protocol: TCP
    targetPort: 16686
  - name: tempo-jaeger-thrift-compact
    port: 6831
    protocol: UDP
    targetPort: 6831
  - name: tempo-jaeger-thrift-binary
    port: 6832
    protocol: UDP
    targetPort: 6832
  - name: tempo-jaeger-thrift-http
    port: 14268
    protocol: TCP
    targetPort: 14268
  - name: tempo-jaeger-grpc
    port: 14250
    protocol: TCP
    targetPort: 14250
  - name: tempo-zipkin
    port: 9411
    protocol: TCP
    targetPort: 9411
  - name: tempo-otlp-legacy
    port: 55680
    protocol: TCP
    targetPort: 55680
  - name: tempo-otlp-http
    port: 55681
    protocol: TCP
    targetPort: 55681
  - name: tempo-otlp-grpc
    port: 4317
    protocol: TCP
    targetPort: 4317
  - name: tempo-opencensus
    port: 55678
    protocol: TCP
    targetPort: 55678
  selector:
    app.kubernetes.io/instance: tempo
    app.kubernetes.io/name: tempo
  sessionAffinity: None
  type: ClusterIP
[root@ip-172-31-44-216 ~]# oc get po,svc,ep,ing -n tracing
NAME READY STATUS RESTARTS AGE
pod/tempo-0 2/2 Running 0 7m10s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/tempo ClusterIP 10.96.119.238 <none> 3100/TCP,16686/TCP,6831/UDP,6832/UDP,14268/TCP,14250/TCP,9411/TCP,55680/TCP,55681/TCP,4317/TCP,55678/TCP 7m10s
NAME ENDPOINTS AGE
endpoints/tempo 10.244.1.8:14268,10.244.1.8:14250,10.244.1.8:55680 + 8 more... 7m10s
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress.networking.k8s.io/grafanaclient <none> grafanaclient.172.31.44.216.nip.io localhost 80, 443 3m3s
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grafana-agent-traces
  namespace: monitoring
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-agent-traces
  namespace: monitoring
data:
  agent.yaml: |
    server:
      http_listen_port: 8080
      log_level: info
    tempo:
      configs:
      - batch:
          send_batch_size: 1000
          timeout: 5s
        name: default
        receivers:
          jaeger:
            protocols:
              grpc: null
              thrift_binary: null
              thrift_compact: null
              thrift_http: null
            remote_sampling:
              insecure: true
              strategy_file: /etc/agent/strategies.json
          opencensus: null
          otlp:
            protocols:
              grpc: null
              http: null
          zipkin: null
        attributes:
          actions:
          - action: upsert
            key: cluster
            value: kube-one
        remote_write:
        - endpoint: grafanaclient.172.31.44.216.nip.io:443
          insecure: true
          #basic_auth:
          #password: ${TEMPO_PASSWORD}
          #username: ${TEMPO_USERNAME}
          retry_on_failure:
            enabled: false
        scrape_configs:
        - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          job_name: kubernetes-pods
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_container_name
            target_label: container
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: false
  strategies.json: '{"default_strategy": {"param": 0.001, "type": "probabilistic"}}'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-agent-traces
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-agent-traces
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: grafana-agent-traces
subjects:
- kind: ServiceAccount
  name: grafana-agent-traces
  namespace: monitoring
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: grafana-agent-traces
  name: grafana-agent-traces
  namespace: monitoring
spec:
  ports:
  - name: agent-http-metrics
    port: 8080
    targetPort: 8080
  - name: agent-thrift-compact
    port: 6831
    protocol: UDP
    targetPort: 6831
  - name: agent-thrift-binary
    port: 6832
    protocol: UDP
    targetPort: 6832
  - name: agent-thrift-http
    port: 14268
    protocol: TCP
    targetPort: 14268
  - name: agent-thrift-grpc
    port: 14250
    protocol: TCP
    targetPort: 14250
  - name: agent-zipkin
    port: 9411
    protocol: TCP
    targetPort: 9411
  - name: agent-otlp
    port: 55680
    protocol: TCP
    targetPort: 55680
  - name: agent-opencensus
    port: 55678
    protocol: TCP
    targetPort: 55678
  selector:
    name: grafana-agent-traces
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: grafana-agent-traces
  namespace: monitoring
spec:
  minReadySeconds: 10
  selector:
    matchLabels:
      name: grafana-agent-traces
  template:
    metadata:
      labels:
        name: grafana-agent-traces
    spec:
      containers:
      - args:
        - -config.file=/etc/agent/agent.yaml
        command:
        - /bin/agent
        env:
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        image: grafana/agent:v0.15.0
        imagePullPolicy: IfNotPresent
        name: agent
        ports:
        - containerPort: 8080
          name: http-metrics
        - containerPort: 6831
          name: thrift-compact
          protocol: UDP
        - containerPort: 6832
          name: thrift-binary
          protocol: UDP
        - containerPort: 14268
          name: thrift-http
          protocol: TCP
        - containerPort: 14250
          name: thrift-grpc
          protocol: TCP
        - containerPort: 9411
          name: zipkin
          protocol: TCP
        - containerPort: 55680
          name: otlp
          protocol: TCP
        - containerPort: 55678
          name: opencensus
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/agent
          name: grafana-agent-traces
      serviceAccount: grafana-agent-traces
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - configMap:
          name: grafana-agent-traces
        name: grafana-agent-traces
  updateStrategy:
    type: RollingUpdate
[root@ip-172-31-44-216 demo]# oc get po,svc,ep -n monitoring
NAME READY STATUS RESTARTS AGE
pod/grafana-agent-traces-6rx5s 1/1 Running 0 2m56s
pod/grafana-agent-traces-bhsp5 1/1 Running 0 2m56s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/grafana-agent-traces ClusterIP 10.96.133.212 <none> 8080/TCP,6831/UDP,6832/UDP,14268/TCP,14250/TCP,9411/TCP,55680/TCP,55678/TCP 2m56s
NAME ENDPOINTS AGE
endpoints/grafana-agent-traces 10.244.0.2:14250,10.244.1.9:14250,10.244.0.2:6832 + 13 more... 2m56s
[root@ip-172-31-44-216 demo]# oc logs -f grafana-agent-traces-6rx5s -n monitoring
ts=2021-06-16T10:02:19.17947343Z level=info agent=prometheus component=cluster msg="applying config"
ts=2021-06-16T10:02:19.179737433Z level=info agent=prometheus component=cluster msg="not watching the KV, none set"
ts=2021-06-16T10:02:19Z level=info msg="Tempo Logger Initialized" component=tempo
ts=2021-06-16T10:02:19Z level=info msg="shutting down receiver" component=tempo tempo_config=default
ts=2021-06-16T10:02:19Z level=info msg="shutting down processors" component=tempo tempo_config=default
ts=2021-06-16T10:02:19Z level=info msg="shutting down exporters" component=tempo tempo_config=default
ts=2021-06-16T10:02:19Z level=info msg="Exporter is enabled." component=tempo tempo_config=default component_kind=exporter exporter=otlp/0
ts=2021-06-16T10:02:19Z level=info msg="Exporter is starting..." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0
ts=2021-06-16T10:02:19Z level=info msg="Exporter started." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0
ts=2021-06-16T10:02:19.184725605Z level=info component="tempo service disco" discovery=kubernetes msg="Using pod service account via in-cluster config"
ts=2021-06-16T10:02:19Z level=info msg="Pipeline is enabled." component=tempo tempo_config=default pipeline_name=traces pipeline_datatype=traces
ts=2021-06-16T10:02:19Z level=info msg="Pipeline is starting..." component=tempo tempo_config=default pipeline_name=traces pipeline_datatype=traces
ts=2021-06-16T10:02:19Z level=info msg="Pipeline is started." component=tempo tempo_config=default pipeline_name=traces pipeline_datatype=traces
ts=2021-06-16T10:02:19Z level=info msg="Receiver is enabled." component=tempo tempo_config=default component_kind=receiver component_type=jaeger component_name=jaeger datatype=traces
ts=2021-06-16T10:02:19Z level=info msg="Receiver is enabled." component=tempo tempo_config=default component_kind=receiver component_type=opencensus component_name=opencensus datatype=traces
ts=2021-06-16T10:02:19Z level=info msg="Receiver is enabled." component=tempo tempo_config=default component_kind=receiver component_type=otlp component_name=otlp datatype=traces
ts=2021-06-16T10:02:19Z level=info msg="Receiver is enabled." component=tempo tempo_config=default component_kind=receiver component_type=zipkin component_name=zipkin datatype=traces
ts=2021-06-16T10:02:19Z level=info msg="Receiver is starting..." component=tempo tempo_config=default component_kind=receiver component_type=zipkin component_name=zipkin
ts=2021-06-16T10:02:19Z level=info msg="Receiver started." component=tempo tempo_config=default component_kind=receiver component_type=zipkin component_name=zipkin
ts=2021-06-16T10:02:19Z level=info msg="Receiver is starting..." component=tempo tempo_config=default component_kind=receiver component_type=jaeger component_name=jaeger
ts=2021-06-16T10:02:19Z level=info msg="Receiver started." component=tempo tempo_config=default component_kind=receiver component_type=jaeger component_name=jaeger
ts=2021-06-16T10:02:19Z level=info msg="Receiver is starting..." component=tempo tempo_config=default component_kind=receiver component_type=opencensus component_name=opencensus
ts=2021-06-16T10:02:19Z level=info msg="Receiver started." component=tempo tempo_config=default component_kind=receiver component_type=opencensus component_name=opencensus
ts=2021-06-16T10:02:19Z level=info msg="Receiver is starting..." component=tempo tempo_config=default component_kind=receiver component_type=otlp component_name=otlp
ts=2021-06-16T10:02:19Z level=info msg="Starting GRPC server on endpoint 0.0.0.0:4317" component=tempo tempo_config=default component_kind=receiver component_type=otlp component_name=otlp
ts=2021-06-16T10:02:19Z level=info msg="Setting up a second GRPC listener on legacy endpoint 0.0.0.0:55680" component=tempo tempo_config=default component_kind=receiver component_type=otlp component_name=otlp
ts=2021-06-16T10:02:19Z level=info msg="Starting GRPC server on endpoint 0.0.0.0:55680" component=tempo tempo_config=default component_kind=receiver component_type=otlp component_name=otlp
ts=2021-06-16T10:02:19Z level=info msg="Starting HTTP server on endpoint 0.0.0.0:55681" component=tempo tempo_config=default component_kind=receiver component_type=otlp component_name=otlp
ts=2021-06-16T10:02:19Z level=info msg="Receiver started." component=tempo tempo_config=default component_kind=receiver component_type=otlp component_name=otlp
ts=2021-06-16T10:02:19.188933259Z level=info msg="server configuration changed, restarting server"
ts=2021-06-16T10:02:19.189135013Z level=info caller=server.go:245 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
ts=2021-06-16T10:07:39Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-16T10:09:24Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
ts=2021-06-16T10:09:34Z level=error msg="Exporting failed. Try enabling retry_on_failure config option." component=tempo tempo_config=default component_kind=exporter component_type=otlp component_name=otlp/0 error="failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed"
The first thing that jumps out to me here is that you have insecure set in your tracing remote_write on the Agent. That disables TLS, but your ingress uses TLS; that's going to cause a handshake error and the connection to be refused.
I'd also recommend against using 55680 for your service/ingress. It works for now, but 55680 will likely eventually be removed. You should consider using the explicit 4317 port you configured instead.
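A minimal sketch of that ingress backend switched to the standard OTLP gRPC port (assuming the single-binary tempo Service from the manifests above, which exposes 4317 as tempo-otlp-grpc):

- backend:
    service:
      name: tempo
      port:
        number: 4317   # tempo-otlp-grpc instead of the legacy 55680
  path: /
  pathType: Prefix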
Configure the ingress with port 4317? It's the same gRPC protocol.
PLEASE help me to make it work.
I want to give you the benefit of the doubt, but the way you keep asking for help comes off as entitled. Maybe it's unintentional; that's fine, everyone makes mistakes and communication is hard. But intentional or not: please stop, because it's not helping me be interested in helping you. This repository is a free support channel, and we are already taking time out of our days to help you. If you don't stop behaving this way, I am going to close this issue.
With that out of the way, please carefully re-read what I said. 55680 will work for now, but is deprecated and will eventually be removed. This has nothing to do with your current problem, but I was giving you a heads up that it may give you problems down the road.
The insecure line being set to true is more likely to be one of the causes of your problems, as I mentioned.
Then, based on your comment below:
The first thing that jumps out to me here is that you have insecure set in your tracing remote_write on the Agent. That disables TLS, but your ingress uses TLS; that's going to cause a handshake error and the connection to be refused.
The ingress supports gRPC over 443, not 80, so I have to use the ingress with TLS to make it work; there is no alternative.
Now, how can I set up the tracing agent securely? Any pointer will help.
Anyway, thanks for your time.
You can just remove the insecure: true line from your remote_write config. If you're not using a valid TLS certificate for the ingress, you'll need to set insecure_skip_verify: true on the remote_write config as well.
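For example, a minimal sketch of that remote_write (the endpoint follows the agent config earlier in this thread; insecure_skip_verify is only needed while the ingress serves a certificate that is not valid for this host):

remote_write:
  - endpoint: grafanaclient.172.31.44.216.nip.io:443
    # no "insecure: true" here, so the exporter negotiates TLS with the ingress
    insecure_skip_verify: true
    retry_on_failure:
      enabled: false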
I'll open an issue to add support for custom CAs in Tempo's remote_write, since it looks like we don't currently support that. (Edit: that issue is #662)
Thanks, let me try that. Just wanted to tell you that I can't use 55680, as the endpoint is NOT available with the distributed deployment (https://github.com/grafana/tempo/issues/768). Expecting it to work with 4317 through the ingress.
Did not see any error; it seems data is arriving VERY VERY late in S3 (MinIO). Some error messages (updated):
ts=2021-06-17T02:28:47.899282321Z level=info caller=server.go:245 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
ts=2021-06-17T02:58:49.124181601Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=. err="unable to open WAL: open wal: no such file or directory"
ts=2021-06-17T02:58:49.124251231Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=bin err="unable to open WAL: open bin/wal: no such file or directory"
ts=2021-06-17T02:58:49.124273546Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=boot err="unable to open WAL: open boot/wal: no such file or directory"
ts=2021-06-17T02:58:49.124290839Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=dev err="unable to open WAL: open dev/wal: no such file or directory"
ts=2021-06-17T02:58:49.124318071Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=etc err="unable to open WAL: open etc/wal: no such file or directory"
ts=2021-06-17T02:58:49.12434801Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=home err="unable to open WAL: open home/wal: no such file or directory"
ts=2021-06-17T02:58:49.124442774Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=lib err="unable to open WAL: open lib/wal: no such file or directory"
ts=2021-06-17T02:58:49.124471578Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=lib64 err="unable to open WAL: open lib64/wal: no such file or directory"
ts=2021-06-17T02:58:49.124493207Z level=warn agent=prometheus component=cleaner msg="unable to find segment mtime of WAL" name=media err="unable to open WAL: open media/wal: no such file or directory"
As per my understanding, tracing gets trace data from the application, stores it inside the pod (tracing agent), and then remote-writes it to Tempo; right? If yes, may I know the local path in the pod where the data is stored?
And where is the data located in Tempo (the ingester)? Is it in /var/tempo/wal?
@robx
If you get time, please reply to my query; hence, closing ...
Thank you very much for your valuable support :)
My OSS Tempo is running in a different cluster (kube-central), and I created a Tempo ingress (tempo.172.31.25.28.nip.io) pointing to the Tempo distributor service (tempo-distributor) on port 3100.
Now, on a different host (kube-one), the tracing agent is running and getting the error below. Because of that, no data is found in the S3 (MinIO) bucket.