Closed by antoineco 7 years ago
Extra info: the query used on InfluxDB 0.8 (Kubernetes 1.1)
SELECT distinct(container_name) from "uptime_ms_cumulative"
WHERE pod_name =~ /api-12345/
AND "pod_namespace" =~ /my_namespace/
AND time > now() - 5m
Result: -> {rails,logrotate,logforwarder}
cc @thucatebay
Found it.
Heapster is failing to write data for this one group of pods. My Heapster logs are full of:
driver.go:207] failed to write stats to influxDB - {"error":"partial write:\nunable to parse 'uptime_ms_cumulative,container_base_image=example.com/image:tag,container_name=rails,...\"containerID\":\"docker://08b6bbb923aef450d50fee92038132601f34425a1b475311b3a4d47e40a82252\"}}\\,\"ready\":true\\,\"restartCount\":2\\,\"image\":\"example.com/image:tag\"\\,\"imageID\":\"docker://47299f016d3729dcc5d8033c3db7d9ddf130f22cc9e7a3008cdcd00320ac094b\"\\,\"containerID\":\"docker://3edcf5685d7961336d4f180dfdd3d976c60f987a4247c00fbef74f6176a26016\"}]}}': missing fields"}
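To see which containers are affected, the rejected points can be tallied from the Heapster log. The following is only a rough sketch: it assumes every failure line has the same shape as the single sample above (the text `unable to parse '` followed by the measurement name and a comma-separated tag list), and `failing_containers` is a hypothetical helper name, not part of Heapster.

```python
import re
from collections import Counter

# Assumed shape of the error line, inferred from the one sample above.
PARSE_ERROR = re.compile(r"unable to parse '(?P<measurement>[^,\s]+),(?P<tags>.*)")
# Match the container_name tag only at a tag boundary, so that keys such as
# container_base_image are not picked up by accident.
CONTAINER_TAG = re.compile(r"(?:^|,)container_name=(?P<name>[^,]+)")


def failing_containers(log_lines):
    """Count rejected points per (measurement, container_name) pair."""
    counts = Counter()
    for line in log_lines:
        match = PARSE_ERROR.search(line)
        if match is None:
            continue  # not a parse failure, ignore
        tag = CONTAINER_TAG.search(match.group("tags"))
        name = tag.group("name") if tag else "<unknown>"
        counts[(match.group("measurement"), name)] += 1
    return counts
```

Feeding it the lines of the Heapster log (for example, the output of `kubectl logs` for the Heapster pod, split by line) gives a quick view of which containers never make it into InfluxDB.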
Current pod definition:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/created-by: |
      {"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"bodyweight-api","name":"api","uid":"7dec1239-a3ff-11e5-9f88-0a59d1e77755","apiVersion":"v1","resourceVersion":"40275725"}}
  creationTimestamp: 2015-12-18T22:40:00Z
  generateName: api-
  labels:
    app: fl-backend-rails
    component: api
    deployment: "33"
  name: api-3p9gu
  namespace: bodyweight-api
  resourceVersion: "40275903"
  selfLink: /api/v1/namespaces/bodyweight-api/pods/api-3p9gu
  uid: 45b03770-a5d8-11e5-9f88-0a59d1e77755
spec:
  containers:
  - env:
    - name: ROLE
      value: api
    - name: RAILS_ENV
      value: production
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: example.com/image:tag
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - /usr/sbin/nginx
          - -s
          - quit
    name: rails
    ports:
    - containerPort: 9080
      name: http
      protocol: TCP
    resources:
      limits:
        cpu: 400m
        memory: 1800Mi
      requests:
        cpu: 400m
        memory: 1800Mi
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /run/secrets/example.com/rails
      name: rails-secrets
      readOnly: true
    - mountPath: /app/log
      name: rails-logs
    - mountPath: /var/log/nginx
      name: nginx-logs
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-68jsr
      readOnly: true
  - image: apopelo/logstash-forwarder
    imagePullPolicy: IfNotPresent
    name: logstash-forwarder
    resources:
      limits:
        cpu: 5m
        memory: 15Mi
      requests:
        cpu: 5m
        memory: 15Mi
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /var/log/containers/rails
      name: rails-logs
      readOnly: true
    - mountPath: /var/log/containers/nginx
      name: nginx-logs
      readOnly: true
    - mountPath: /etc/logstash-forwarder
      name: logstash-conf
      readOnly: true
    - mountPath: /etc/ssl/logstash-forwarder
      name: logstash-ssl
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-68jsr
      readOnly: true
  - image: example.com/logrotate
    imagePullPolicy: IfNotPresent
    name: logrotate
    resources:
      limits:
        cpu: 5m
        memory: 60Mi
      requests:
        cpu: 5m
        memory: 60Mi
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /var/log/containers/rails
      name: rails-logs
    - mountPath: /var/log/containers/nginx
      name: nginx-logs
    - mountPath: /etc/logrotate.d
      name: logrotate-d
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-68jsr
      readOnly: true
  dnsPolicy: ClusterFirst
  imagePullSecrets:
  - name: docker-registry
  nodeName: ip-10-0-0-1.eu-west-1.compute.internal
  restartPolicy: Always
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  volumes:
  - name: rails-secrets
    secret:
      secretName: rails
  - emptyDir: {}
    name: rails-logs
  - emptyDir: {}
    name: nginx-logs
  - name: logstash-conf
    secret:
      secretName: logstash-conf
  - name: logstash-ssl
    secret:
      secretName: logstash-ssl
  - name: logrotate-d
    secret:
      secretName: logrotate-d
  - name: default-token-68jsr
    secret:
      secretName: default-token-68jsr
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: null
    status: "True"
    type: Ready
  containerStatuses:
  - containerID: docker://5e35e8d3d447e21a2f48473cc92008704ae6cac2412b1e74c1b93a736a028766
    image: example.com/logrotate
    imageID: docker://6c9c7a7a9c779eafa7123550b44967a18f1d43b5125acbcc317903b82b5800cf
    lastState: {}
    name: logrotate
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2015-12-18T22:40:04Z
  - containerID: docker://0181d898943aef21d9da96f413586c1f63bb401f1629a027e08d7f886fba6f5d
    image: apopelo/logstash-forwarder
    imageID: docker://32be67e30853d07971c5df6e7cc55607c946aa3c4f1d4b408a5aca18ff760fd5
    lastState: {}
    name: logstash-forwarder
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2015-12-18T22:40:04Z
  - containerID: docker://159ebe374104eb05ef3ee8ca469788240a6508e3938f2b6be9c78600f409610d
    image: example.com/image:tag
    imageID: docker://f365817002eb3ccb8f91ead008dfee26e5b073dfa301b8b10589bd7632ad86d8
    lastState: {}
    name: rails
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2015-12-18T22:40:04Z
  hostIP: 10.0.0.1
  phase: Running
  podIP: 172.17.9.8
  startTime: 2015-12-18T22:40:00Z
Looks very related to #775
I suspect the lifecycle hook is the culprit: the following block doesn't exist in a similar pod, whose metrics are exported properly.
lifecycle:
  preStop:
    exec:
      command: ["/usr/sbin/nginx","-s","quit"]
Edit: @vishh my assumption was right. I removed that block from my RC and the metrics are now forwarded.
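For anyone checking whether other pods might be hit by the same problem, a small script can list the containers that declare a lifecycle block. This is a minimal sketch, assuming the input is the parsed JSON of `kubectl get pods --all-namespaces -o json` (a list object with an `items` array); the function names are made up for illustration, and a match only means the pod has a hook, not that its metrics are necessarily missing.

```python
def containers_with_lifecycle_hooks(pod):
    """Return the names of containers in a pod dict that declare a lifecycle block."""
    containers = pod.get("spec", {}).get("containers", [])
    return [c.get("name", "<unnamed>") for c in containers if c.get("lifecycle")]


def pods_with_lifecycle_hooks(pod_list):
    """Map 'namespace/name' to the containers that declare lifecycle hooks."""
    found = {}
    for pod in pod_list.get("items", []):
        names = containers_with_lifecycle_hooks(pod)
        if names:
            meta = pod.get("metadata", {})
            key = "{}/{}".format(
                meta.get("namespace", "default"), meta.get("name", "<unknown>")
            )
            found[key] = names
    return found
```

For the pod above, loading the kubectl output with `json.load(sys.stdin)` and passing it to `pods_with_lifecycle_hooks` should report `rails` under `bodyweight-api/api-3p9gu`, since that is the only container with a `preStop` hook.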
Is this a bug? I'm having the same issue: once the lifecycle block was removed, stats were published correctly. But I need the lifecycle hook to be present in the configuration.
@piosz closing this as well, see #775
I switched my Heapster + InfluxDB setup from what was deployed by Kubernetes 1.1 to what I found here @ master.
The Containers dashboard, which I also loaded from the current master, is missing some containers when looking at the details of a couple of my pods.
Example: pod api-12345 runs 3 containers: rails, logrotate, logforwarder
The query which is used for the $container variable in Grafana (Templating > Variables) yields incomplete results:
Result: -> {logrotate,logforwarder}
Expected: -> {rails,logrotate,logforwarder}
I'm not sure yet if this comes from the change in the Grafana queries or if InfluxDB is fed incomplete information by Heapster. I will look into it, but any feedback from other users would be appreciated.