mesosphere / kubernetes-mesos

A Kubernetes Framework for Apache Mesos

cancelled executor timeouts spam the executor log file #507

Open s-urbaniak opened 8 years ago

s-urbaniak commented 8 years ago

When the launch grace period is reached, the cancelled pod task moves to the LOST state. However, the executor then keeps producing the following log entries indefinitely:

W0924 12:42:49.637230     179 kubelet.go:1509] Orphaned volume "206d4abc-62b9-11e5-ba8d-0242ac11000e/etcd-storage.deleting~082257403.deleting~230255557.deleting~140678590" found, tearing down volume
E0924 12:42:49.645638     179 kubelet.go:1515] Could not tear down volume "206d4abc-62b9-11e5-ba8d-0242ac11000e/etcd-storage.deleting~082257403.deleting~230255557.deleting~140678590": rename /var/tmp/mesos/1/slaves/20150924-123811-201331116-5050-1-S0/frameworks/20150924-123811-201331116-5050-1-0000/executors/d8d9a6669223e699_k8sm-executor/runs/ec939547-bacd-4c9e-92bd-5ace27a7af35/pods/206d4abc-62b9-11e5-ba8d-0242ac11000e/volumes/kubernetes.io~empty-dir/etcd-storage.deleting~082257403.deleting~230255557.deleting~140678590 /var/tmp/mesos/1/slaves/20150924-123811-201331116-5050-1-S0/frameworks/20150924-123811-201331116-5050-1-0000/executors/d8d9a6669223e699_k8sm-executor/runs/ec939547-bacd-4c9e-92bd-5ace27a7af35/pods/206d4abc-62b9-11e5-ba8d-0242ac11000e/volumes/kubernetes.io~empty-dir/etcd-storage.deleting~082257403.deleting~230255557.deleting~140678590.deleting~350073002: file exists
... (the two lines above repeat indefinitely)

s-urbaniak commented 8 years ago

@jdef any idea what could go wrong here?

s-urbaniak commented 8 years ago

xref https://github.com/kubernetes/kubernetes/pull/14432, https://github.com/mesosphere/kubernetes-mesos/issues/492

jdef commented 8 years ago

I'm not very familiar with the kubelet volume plugins. This looks like a possible (race-related?) bug in the kubelet's volume manager. The file name is suspicious: note all the .deleting~xxx suffixes.
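
For triage, the stacked suffixes in the volume name from the log above can be pulled apart with a few lines of Go. This is only a sketch: `splitDeletingSuffixes` is a hypothetical helper written for this thread, not part of the kubelet, and reading each suffix as one earlier rename of the directory is an assumption drawn from the log, not from the kubelet source.

```go
package main

import (
	"fmt"
	"strings"
)

// deletingMarker is the separator visible in the logged volume name.
const deletingMarker = ".deleting~"

// splitDeletingSuffixes is a hypothetical helper (not kubelet code).
// It splits a volume directory name into its base name and the list
// of IDs that follow each ".deleting~" marker.
func splitDeletingSuffixes(name string) (string, []string) {
	parts := strings.Split(name, deletingMarker)
	return parts[0], parts[1:]
}

func main() {
	// Volume name copied from the log entries in this issue.
	name := "etcd-storage.deleting~082257403.deleting~230255557.deleting~140678590"

	base, suffixes := splitDeletingSuffixes(name)
	fmt.Printf("volume %q carries %d deleting suffixes: %v\n", base, len(suffixes), suffixes)
}
```

For the name in the log this reports the base volume etcd-storage with three suffixes. The failing rename in the error line is trying to add a fourth (.deleting~350073002), which fits the picture of a teardown that is retried on every sync without ever completing.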