jdef opened this issue 9 years ago
Reproduced on the latest upstream_k8sm build. In this case the slave recognizes that it was asked to kill the task, but the executor never logged that it received the kill message (see the sketch after the executor::stderr excerpt below).
apiserver.log
I0612 03:03:52.249961 25445 handlers.go:135] PUT /api/v1/namespaces/default/events/frontend-controller-ljukf.13e6de8db35da993: (1.507113ms) 200 [[km/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47106]
I0612 03:03:53.910435 25445 handlers.go:135] PUT /api/v1/namespaces/default/events/frontend-controller-ljukf.13e6de8db35da993: (1.545246ms) 200 [[km/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47106]
I0612 03:03:55.869278 25445 handlers.go:135] PUT /api/v1/namespaces/default/events/frontend-controller-ljukf.13e6de8db35da993: (1.650344ms) 200 [[km/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47106]
I0612 03:04:01.076454 25445 handlers.go:135] PUT /api/v1/namespaces/default/events/frontend-controller-ljukf.13e6de8db35da993: (1.536159ms) 200 [[km/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47106]
I0612 03:04:07.593754 25445 handlers.go:135] GET /api/v1/namespaces/default/pods/frontend-controller-ljukf: (765.441µs) 200 [[executor/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47654]
I0612 03:04:07.596565 25445 handlers.go:135] PUT /api/v1/namespaces/default/pods/frontend-controller-ljukf/status: (2.296222ms) 200 [[executor/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47654]
I0612 03:04:13.184469 25445 handlers.go:135] GET /api/v1/namespaces/default/pods/frontend-controller-ljukf: (1.001662ms) 200 [[executor/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47654]
I0612 03:04:13.190908 25445 handlers.go:135] PUT /api/v1/namespaces/default/pods/frontend-controller-ljukf/status: (3.695994ms) 200 [[executor/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:47654]
I0612 03:04:15.585384 25445 handlers.go:135] DELETE /api/v1/namespaces/default/pods/frontend-controller-ljukf: (29.327726ms) 200 [[km/v0.19.0 (linux/amd64) kubernetes/unknown] 10.2.0.5:48164]
scheduler.log
I0612 03:04:06.899612 30978 plugin.go:170] launching task: "pod.a737ea53-10af-11e5-b979-525400309a8f" on target "10.2.0.5" slave "20150503-133627-83886602-5050-1268-S0" for pod "default/frontend-controller-ljukf"
I0612 03:04:06.917906 30978 scheduler.go:403] task status update "TASK_STARTING" from "none" for task "pod.a737ea53-10af-11e5-b979-525400309a8f" on slave "20150503-133627-83886602-5050-1268-S0" executor "" for reason "none"
I0612 03:04:07.945855 30978 scheduler.go:403] task status update "TASK_RUNNING" from "none" for task "pod.a737ea53-10af-11e5-b979-525400309a8f" on slave "20150503-133627-83886602-5050-1268-S0" executor "" for reason "none"
I0612 03:04:07.945872 30978 registry.go:231] Received running status for pending task: pod.a737ea53-10af-11e5-b979-525400309a8f
I0612 03:04:07.945923 30978 registry.go:263] received pod status for task pod.a737ea53-10af-11e5-b979-525400309a8f: {Phase:Running Conditions:[] Message: HostIP:10.2.0.5 PodIP:172.17.2.181 StartTime:<nil> ContainerStatuses:[{Name:php-redis State:{Waiting:<nil> Running:0xc2087777e0 Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:false RestartCount:0 Image:jdef/php-redis ImageID:docker://0547b2a90473d3e5cc8a62c0f211ad7c691d61d91f4acaec7a91531c3126a316 ContainerID:docker://3bd77a91c6b3ce1c3dc459f3143d1549b1d5a7f213ecc6171056e8ff398d8210}]}
...
I0612 03:04:15.596979 30978 plugin.go:581] pod deleted: /pods/default/frontend-controller-ljukf
I0612 03:06:09.432669 30978 scheduler.go:403] task status update "TASK_RUNNING" from "SOURCE_MASTER" for task "pod.a737ea53-10af-11e5-b979-525400309a8f" on slave "20150503-133627-83886602-5050-1268-S0" executor "" for reason "REASON_RECONCILIATION"
I0612 03:06:09.432682 30978 registry.go:236] Ignore status TASK_RUNNING because the task pod.a737ea53-10af-11e5-b979-525400309a8f is already running
mesos-master.INFO
I0612 03:04:07.945461 28912 master.cpp:3446] Forwarding status update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:07.945518 28912 master.cpp:3418] Status update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 from slave 20150503-133627-83886602-5050-1268-S0 at slave(1)@10.2.0.5:5051 (10.2.0.5)
I0612 03:04:07.945535 28912 master.cpp:4693] Updating the latest state of task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 to TASK_RUNNING
I0612 03:04:07.946100 28914 master.cpp:2918] Forwarding status update acknowledgement b0fdae1a-10af-11e5-bd55-525400309a8f for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 (Kubernetes) at scheduler(1)@10.2.0.5:60088 to slave 20150503-133627-83886602-5050-1268-S0 at slave(1)@10.2.0.5:5051 (10.2.0.5)
...
I0612 03:04:15.602604 28914 master.cpp:2716] Asked to kill task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:15.602625 28914 master.cpp:2814] Telling slave 20150503-133627-83886602-5050-1268-S0 at slave(1)@10.2.0.5:5051 (10.2.0.5) to kill task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 (Kubernetes) at scheduler(1)@10.2.0.5:60088
...
I0612 03:06:09.438285 28916 master.cpp:2918] Forwarding status update acknowledgement 4c74b17e-2329-4d33-8ffc-aaf94ebea06c for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 (Kubernetes) at scheduler(1)@10.2.0.5:60088 to slave 20150503-133627-83886602-5050-1268-S0 at slave(1)@10.2.0.5:5051 (10.2.0.5)
I0612 03:11:09.429816 28912 master.cpp:2918] Forwarding status update acknowledgement b844b51d-53fc-4154-bf79-982b1f49e4a8 for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 (Kubernetes) at scheduler(1)@10.2.0.5:60088 to slave 20150503-133627-83886602-5050-1268-S0 at slave(1)@10.2.0.5:5051 (10.2.0.5)
mesos-slave.INFO
I0612 03:04:06.901109 1333 slave.cpp:1083] Got assigned task pod.a737ea53-10af-11e5-b979-525400309a8f for framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:06.901258 1333 slave.cpp:1193] Launching task pod.a737ea53-10af-11e5-b979-525400309a8f for framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:06.901430 1333 slave.cpp:1339] Sending task 'pod.a737ea53-10af-11e5-b979-525400309a8f' to executor '5930267bb60c8fa4_k8sm-executor' of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:06.911550 1331 slave.cpp:2215] Handling status update TASK_STARTING (UUID: b0609477-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 from executor(1)@10.2.0.5:60193
I0612 03:04:06.911713 1331 status_update_manager.cpp:317] Received status update TASK_STARTING (UUID: b0609477-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:06.911814 1331 status_update_manager.hpp:346] Checkpointing UPDATE for status update TASK_STARTING (UUID: b0609477-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:06.916769 1331 slave.cpp:2458] Forwarding the update TASK_STARTING (UUID: b0609477-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 to master@10.2.0.5:5050
I0612 03:04:06.916946 1331 slave.cpp:2391] Sending acknowledgement for status update TASK_STARTING (UUID: b0609477-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 to executor(1)@10.2.0.5:60193
I0612 03:04:06.918567 1327 status_update_manager.cpp:389] Received status update acknowledgement (UUID: b0609477-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:06.920018 1327 status_update_manager.hpp:346] Checkpointing ACK for status update TASK_STARTING (UUID: b0609477-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:07.940975 1328 slave.cpp:2215] Handling status update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 from executor(1)@10.2.0.5:60193
I0612 03:04:07.941081 1328 status_update_manager.cpp:317] Received status update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:07.941097 1328 status_update_manager.hpp:346] Checkpointing UPDATE for status update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:07.945081 1328 slave.cpp:2458] Forwarding the update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 to master@10.2.0.5:5050
I0612 03:04:07.945163 1328 slave.cpp:2391] Sending acknowledgement for status update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003 to executor(1)@10.2.0.5:60193
I0612 03:04:07.946331 1328 status_update_manager.cpp:389] Received status update acknowledgement (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
I0612 03:04:07.946405 1328 status_update_manager.hpp:346] Checkpointing ACK for status update TASK_RUNNING (UUID: b0fdae1a-10af-11e5-bd55-525400309a8f) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
...
I0612 03:04:15.603466 1329 slave.cpp:1372] Asked to kill task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
...
I0612 03:06:09.438762 1330 status_update_manager.cpp:389] Received status update acknowledgement (UUID: 4c74b17e-2329-4d33-8ffc-aaf94ebea06c) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.438796 1330 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: 4c74b17e-2329-4d33-8ffc-aaf94ebea06c) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: 4c74b17e-2329-4d33-8ffc-aaf94ebea06c) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
executor::stderr
I0612 03:04:06.901973 31016 executor.go:311] Executor asked to run task '&TaskID{Value:*pod.a737ea53-10af-11e5-b979-525400309a8f,XXX_unrecognized:[],}'
I0612 03:04:06.901985 31016 executor.go:255] Launch task &TaskInfo{Name:*frontend-controller-ljukf.default.pods,TaskId:&TaskID{Value:*pod.a737ea53-10af-11e5-b979-525400309a8f,XXX_unrecognized:[],}...
I0612 03:04:06.905872 31016 executor.go:426] Binding 'default/frontend-controller-ljukf' to '10.2.0.5' with annotations map[kubernetes.io/created-by:...
I0612 03:04:06.910977 31016 executor.go:526] Executor sending status update framework_id:<value:"20150511-114826-83886602-5050-28892-0003" > executor_id:<value:"5930267bb60c8fa4_k8sm-executor" > slave_id:<value:"20150503-133627-83886602-5050-1268-S0" > status:<task_id:<value:"pod.a737ea53-10af-11e5-b979-525400309a8f" > state:TASK_STARTING message:"create-binding-success" data:"{\"metadata\":{\"name\":\"frontend-controller-ljukf_default\",\"selfLink\":\"/podstatusresult\",\"creationTimestamp\":null},\"status\":{}}" slave_id:<value:"20150503-133627-83886602-5050-1268-S0" > timestamp:1.434078246e+09 > timestamp:1.434078246e+09 uuid:"\260`\224w\020\257\021\345\275URT\0000\232\217"
...
I0612 03:04:06.917602 31016 executor.go:335] Receiving status update acknowledgement slave_id:<value:"20150503-133627-83886602-5050-1268-S0" > framework_id:<value:"20150511-114826-83886602-5050-28892-0003" > task_id:<value:"pod.a737ea53-10af-11e5-b979-525400309a8f" > uuid:"\260`\224w\020\257\021\345\275URT\0000\232\217"
...
I0612 03:04:07.940515 31016 executor.go:526] Executor sending status update framework_id:<value:"20150511-114826-83886602-5050-28892-0003" > executor_id:<value:"5930267bb60c8fa4_k8sm-executor" > slave_id:<value:"20150503-133627-83886602-5050-1268-S0" > status:<task_id:<value:"pod.a737ea53-10af-11e5-b979-525400309a8f" > state:TASK_RUNNING message:"pod-running:frontend-controller-ljukf_default" data:"{\"metadata\":{\"name\":\"frontend-controller-ljukf_default\",\"selfLink\":\"/podstatusresult\",\"creationTimestamp\":null},\"status\":{\"phase\":\"Running\",\"hostIP\":\"10.2.0.5\",\"podIP\":\"172.17.2.181\",\"containerStatuses\":[{\"name\":\"php-redis\",\"state\":{\"running\":{\"startedAt\":\"2015-06-12T03:04:07Z\"}},\"lastState\":{},\"ready\":false,\"restartCount\":0,\"image\":\"jdef/php-redis\",\"imageID\":\"docker://0547b2a90473d3e5cc8a62c0f211ad7c691d61d91f4acaec7a91531c3126a316\",\"containerID\":\"docker://3bd77a91c6b3ce1c3dc459f3143d1549b1d5a7f213ecc6171056e8ff398d8210\"}]}}" slave_id:<value:"20150503-133627-83886602-5050-1268-S0" > timestamp:1.434078247e+09 > timestamp:1.434078247e+09 uuid:"\260\375\256\032\020\257\021\345\275URT\0000\232\217"
...
I0612 03:04:07.946033 31016 executor.go:335] Receiving status update acknowledgement slave_id:<value:"20150503-133627-83886602-5050-1268-S0" > framework_id:<value:"20150511-114826-83886602-5050-28892-0003" > task_id:<value:"pod.a737ea53-10af-11e5-b979-525400309a8f" > uuid:"\260\375\256\032\020\257\021\345\275URT\0000\232\217"
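Note that the executor::stderr excerpt above ends without any kill-related entry at all. For reference, this is the kind of receipt logging I'd expect the executor's kill path to emit as soon as the message arrives; a minimal, self-contained sketch, where taskID, killHandler, and withKillLogging are illustrative names, not the real k8sm-executor API:

```go
// Hypothetical sketch: a logging decorator around the kill path, showing the
// log line I would expect to see in executor::stderr if the kill message
// actually reached the executor. All names here are illustrative.
package main

import "log"

// taskID stands in for the Mesos TaskID proto (illustrative only).
type taskID struct{ Value string }

// killHandler is whatever function ultimately tears the pod down.
type killHandler func(id taskID)

// withKillLogging wraps a kill handler so that receipt of the message is
// always recorded, even if the teardown itself later fails.
func withKillLogging(next killHandler) killHandler {
	return func(id taskID) {
		log.Printf("Executor asked to kill task %q", id.Value)
		next(id)
	}
}

func main() {
	kill := withKillLogging(func(id taskID) {
		// the real teardown would stop the pod's containers here
	})
	kill(taskID{Value: "pod.a737ea53-10af-11e5-b979-525400309a8f"})
}
```

With something like this wrapping the kill path, a lost message (no log line at all) is distinguishable from a teardown that started but never produced a terminal status update.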
$ dpkg -l |grep -e mesos
ii mesos 0.21.1-1.1.ubuntu1404 amd64 Cluster resource manager with efficient resource isolation
$ uname -a
Linux node-1 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Reconciliation isn't cleaning up the mess either, probably because the tasks have been flagged as Deleted, so the reconciliation mechanism may be assuming that a kill is already in progress (see the sketch after the two listings below):
$ curl http://$servicehost:10251/debug/registry/tasks
task_count=13
pod.a417db84-10f4-11e5-b979-525400309a8f default/frontend-controller-2v9yd 1 20150511-114826-83886602-5050-28892-O1620812
pod.8e426537-1082-11e5-b979-525400309a8f default/redis-master-2 1 20150511-114826-83886602-5050-28892-O1578575
pod.7dedd0f4-10f4-11e5-b979-525400309a8f default/frontend-controller-ff5gi 1 20150511-114826-83886602-5050-28892-O1620742
pod.eb285a3c-10f4-11e5-b979-525400309a8f default/frontend-controller-mm3zq 1 20150511-114826-83886602-5050-28892-O1620902
pod.438e06c8-10dc-11e5-b979-525400309a8f default/frontend-controller-et50j 1 20150511-114826-83886602-5050-28892-O1611845
pod.9ef6bffd-10e3-11e5-b979-525400309a8f default/frontend-controller-uhpfa 1 20150511-114826-83886602-5050-28892-O1614589
pod.eb48ada7-10f4-11e5-b979-525400309a8f default/frontend-controller-7txxl 1 20150511-114826-83886602-5050-28892-O1620906
pod.eb05a79a-10f4-11e5-b979-525400309a8f default/frontend-controller-iidy6 1 20150511-114826-83886602-5050-28892-O1620900
pod.ea4a79b1-10af-11e5-b979-525400309a8f default/frontend-controller-9qo7q 1 20150511-114826-83886602-5050-28892-O1595246
pod.dbf27621-1084-11e5-b979-525400309a8f default/redis-slave-controller-xrkeh 1 20150511-114826-83886602-5050-28892-O1579370
pod.dc01cf2d-1084-11e5-b979-525400309a8f default/redis-slave-controller-srlc6 1 20150511-114826-83886602-5050-28892-O1579373
pod.ab62f48f-10db-11e5-b979-525400309a8f default/frontend-controller-ol0mr 1 20150511-114826-83886602-5050-28892-O1611626
pod.a737ea53-10af-11e5-b979-525400309a8f default/frontend-controller-ljukf 1 20150511-114826-83886602-5050-28892-O1595155
$ kc get pods
NAME READY REASON RESTARTS AGE
frontend-controller-2v9yd 1/1 Running 0 1h
frontend-controller-7txxl 1/1 Running 0 1h
frontend-controller-ff5gi 1/1 Running 0 1h
frontend-controller-iidy6 1/1 Running 0 1h
frontend-controller-mm3zq 1/1 Running 0 1h
redis-master-2 1/1 Running 0 15h
redis-slave-controller-srlc6 1/1 Running 0 14h
redis-slave-controller-xrkeh 1/1 Running 0 14h
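To quantify the gap between the two listings above, a throwaway helper along these lines can diff them. It is only a sketch: the file names (registry.txt, pods.txt) and the line format it parses are assumptions based on the output shown here, not any documented interface.

```go
// Hypothetical helper for cross-checking the scheduler's task registry dump
// against `kc get pods`. It reads the two saved outputs from local files and
// prints registry tasks whose pods no longer exist.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// readLines returns the non-empty, whitespace-trimmed lines of a file.
func readLines(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var lines []string
	s := bufio.NewScanner(f)
	for s.Scan() {
		if l := strings.TrimSpace(s.Text()); l != "" {
			lines = append(lines, l)
		}
	}
	return lines, s.Err()
}

func main() {
	// registry.txt: saved output of `curl http://$servicehost:10251/debug/registry/tasks`
	// pods.txt:     saved output of `kc get pods`
	tasks, err := readLines("registry.txt")
	if err != nil {
		panic(err)
	}
	podLines, err := readLines("pods.txt")
	if err != nil {
		panic(err)
	}

	// Collect live pod names, skipping the NAME/READY header row.
	pods := map[string]bool{}
	if len(podLines) > 1 {
		for _, l := range podLines[1:] {
			pods[strings.Fields(l)[0]] = true
		}
	}

	// Each registry line of interest looks like:
	//   pod.<uuid> <namespace>/<pod-name> 1 <offer-id>
	for _, l := range tasks {
		fields := strings.Fields(l)
		if len(fields) < 2 || !strings.HasPrefix(fields[0], "pod.") {
			continue // skips the task_count=N line and anything unexpected
		}
		name := fields[1]
		if i := strings.Index(name, "/"); i >= 0 {
			name = name[i+1:]
		}
		if !pods[name] {
			fmt.Printf("orphaned task: %s (pod %s is gone)\n", fields[0], fields[1])
		}
	}
}
```

Run against the listings above, it would flag the five registry tasks (frontend-controller-ljukf among them) whose pods no longer show up in `kc get pods`.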
mesos-slave.WARN within 2m of the "task kill" event:
W0612 03:03:33.358157 1328 status_update_manager.cpp:472] Resending status update TASK_LOST (UUID: 86e13b65-1034-417a-86d2-f5c323e42c56) for task pod.a8ba5838-0471-11e5-96e5-525400309a8f of framework 20150511-114826-83886602-5050-28892-0001
W0612 03:03:33.358258 1328 status_update_manager.cpp:472] Resending status update TASK_LOST (UUID: 75493d76-d977-4c24-9f0d-76fd842b8bf7) for task pod.4ee9bb25-046f-11e5-96e5-525400309a8f of framework 20150511-114826-83886602-5050-28892-0001
E0612 03:06:09.433902 1329 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: f2dddbfc-7ee2-4cd1-9aa7-7aacb02d8703) for task pod.ea59ce21-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: f2dddbfc-7ee2-4cd1-9aa7-7aacb02d8703) for task pod.ea59ce21-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.434703 1329 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: 0227fb12-2228-4443-ac44-43b979962bd7) for task pod.ea2bcdfe-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: 0227fb12-2228-4443-ac44-43b979962bd7) for task pod.ea2bcdfe-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.435458 1333 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: d165100a-27f2-4cc1-85f4-b6db8147192b) for task pod.ea787762-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: d165100a-27f2-4cc1-85f4-b6db8147192b) for task pod.ea787762-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.435912 1333 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: 294b5c93-7775-43e8-9243-e52e57cdc65b) for task pod.e01774c5-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: 294b5c93-7775-43e8-9243-e52e57cdc65b) for task pod.e01774c5-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.438344 1330 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: 98c6e07a-abc3-4606-b3af-19c929afc1d1) for task pod.e0361b37-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: 98c6e07a-abc3-4606-b3af-19c929afc1d1) for task pod.e0361b37-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.438796 1330 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: 4c74b17e-2329-4d33-8ffc-aaf94ebea06c) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: 4c74b17e-2329-4d33-8ffc-aaf94ebea06c) for task pod.a737ea53-10af-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.439334 1330 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: 699384df-7d99-4fa3-9707-2f65c837bef6) for task pod.c7ee4b4a-10ae-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: 699384df-7d99-4fa3-9707-2f65c837bef6) for task pod.c7ee4b4a-10ae-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
E0612 03:06:09.439699 1330 slave.cpp:1793] Failed to handle status update acknowledgement (UUID: 33df5347-4541-47d0-a152-6b42962dde89) for task pod.8e426537-1082-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003: Unexpected status update acknowledgment (UUID: 33df5347-4541-47d0-a152-6b42962dde89) for task pod.8e426537-1082-11e5-b979-525400309a8f of framework 20150511-114826-83886602-5050-28892-0003
Running the resizingFrontend.sh script on DCOS for 2 days results in 48 pod tasks running in Mesos, but `kc get pods` only returns 16.

EDIT: I've also reproduced this locally, on my laptop; it has nothing to do with DCOS. Originally reproduced on DCOS (running the latest Mesos, v0.22.x), then reproduced on my laptop with v0.21.1 -- that's where these logs are from.
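One possible mitigation, assuming the kill message really is getting lost somewhere between the slave and the executor: have the scheduler keep re-issuing the kill for a deleted pod until a terminal status update arrives, instead of sending it once. The killTaskSender interface and retryKill helper below are hypothetical names for a sketch, not existing k8sm code:

```go
// Sketch of one possible mitigation (not current k8sm code): after a pod is
// deleted, keep re-sending the kill until a terminal status update arrives,
// so a single lost kill message cannot strand the task forever.
package main

import (
	"fmt"
	"time"
)

// killTaskSender abstracts whatever sends KillTask to the master
// (e.g. the scheduler driver); hypothetical for this sketch.
type killTaskSender interface {
	KillTask(taskID string) error
}

// retryKill re-issues the kill every interval until done is closed, which
// would happen once a terminal TASK_* update is observed for the task.
func retryKill(d killTaskSender, taskID string, interval time.Duration, done <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if err := d.KillTask(taskID); err != nil {
			fmt.Printf("kill %s failed: %v (will retry)\n", taskID, err)
		}
		select {
		case <-done:
			return
		case <-ticker.C:
		}
	}
}

// fakeDriver just prints what it would send; it stands in for the real driver.
type fakeDriver struct{}

func (fakeDriver) KillTask(taskID string) error {
	fmt.Println("sending KillTask for", taskID)
	return nil
}

func main() {
	done := make(chan struct{})
	go func() {
		time.Sleep(250 * time.Millisecond) // pretend a TASK_KILLED update arrived
		close(done)
	}()
	retryKill(fakeDriver{}, "pod.a737ea53-10af-11e5-b979-525400309a8f", 100*time.Millisecond, done)
}
```

As noted above, reconciliation alone doesn't recover these tasks as long as the registry treats them as having a kill in flight, so some form of kill retry (or a timeout that re-flags the task as killable) seems necessary.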