mesosphere / marathon

Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
https://mesosphere.github.io/marathon/
Apache License 2.0
4.07k stars 845 forks source link

Marathon 0.15.0-1.0.457.ubuntu1404 forces redeploy of app all the time #3139

Closed scalp42 closed 8 years ago

scalp42 commented 8 years ago

Scenario:

On Marathon 0.14.1-1.0.455.ubuntu1404, when you deploy/update an app without any changes in the JSON, no deployment happen:

  1. curl -XPUT -i -H 'Content-Type: application/json' -d @test.json marathon.example.com:8080/v2/apps/test
  2. deployment happen
  3. update app again with no changes in JSON curl -XPUT -i -H 'Content-Type: application/json' -d @test.json marathon.example.com:8080/v2/apps/test
  4. no deployment happen

But when using Marathon 0.15.0-1.0.457.ubuntu1404, deployment happen all the time:

  1. curl -XPUT -i -H 'Content-Type: application/json' -d @test.json marathon.example.com:8080/v2/apps/test
  2. deployment happen
  3. update app again with no changes in JSON curl -XPUT -i -H 'Content-Type: application/json' -d @test.json marathon.example.com:8080/v2/apps/test
  4. deployment happen

We reverted back to 0.14.1 in meantime and it's now working as intended.

scalp42 commented 8 years ago

Marathon 0.14.1 log when updating a single app called audit-green with no changes:

root@mesos-master-i-635669ba [gz-prod] ~ # tail -f /var/log/marathon.log | grep -v GET | grep -v Healthy

Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,388] INFO Upgrade group id:/prod/talkk version:2016-02-02T19:05:28.387Z with force:false (mesosphere.marathon.state.GroupManager:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,451] INFO Computed new deployment plan:
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: DeploymentPlan 2016-02-02T19:05:28.387Z NO STEPS
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]:  (mesosphere.marathon.state.GroupManager:pool-1-thread-56)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,457] INFO Deploy plan with force=false:
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: DeploymentPlan 2016-02-02T19:05:28.387Z NO STEPS
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]:   (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$3cb25dd8:pool-1-thread-68)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,733] INFO Updated groups/apps according to deployment plan 24871d91-09c5-4d8f-9160-04bd183b5ef2 (mesosphere.marathon.state.GroupManager:pool-1-thread-65)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,734] INFO 10.92.42.156 - - [02/Feb/2016:19:05:28 +0000] "PUT //marathon.example.com:8080/v2/apps/prod/talkk/audit-green HTTP/1.1" 200 92 "-" "curl/7.43.0" 36 (mesosphere.chaos.http.ChaosRequestLog$$EnhancerByGuice$$f666ac6a:qtp2052601222-29)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,734] INFO POSTing to all endpoints. (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,734] INFO Deployment acknowledged. Waiting to get processed:
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: DeploymentPlan 2016-02-02T19:05:28.387Z NO STEPS
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]:  (mesosphere.marathon.state.GroupManager:pool-1-thread-65)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,735] INFO POSTing to all endpoints. (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-12)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,758] INFO Sending POST to:http://10.92.22.129:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,758] INFO Sending POST to:http://10.92.22.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,761] INFO Sending POST to:http://10.92.32.182:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,765] INFO Sending POST to:http://10.92.22.184:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,765] INFO Sending POST to:http://10.92.42.48:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,770] INFO Sending POST to:http://10.92.42.47:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,770] INFO Sending POST to:http://10.92.32.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,779] INFO Sending POST to:http://10.92.22.129:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,779] INFO Sending POST to:http://10.92.22.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,780] INFO Sending POST to:http://10.92.32.182:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,780] INFO Sending POST to:http://10.92.22.184:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,781] INFO Sending POST to:http://10.92.42.48:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,781] INFO Sending POST to:http://10.92.42.47:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,781] INFO Sending POST to:http://10.92.32.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,829] INFO Deployment of / successful (mesosphere.marathon.MarathonSchedulerActor:marathon-akka.actor.default-dispatcher-9)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,829] INFO Removing 24871d91-09c5-4d8f-9160-04bd183b5ef2 from list of running deployments (mesosphere.marathon.upgrade.DeploymentManager:marathon-akka.actor.default-dispatcher-9)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,829] INFO POSTing to all endpoints. (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-9)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,864] INFO Sending POST to:http://10.92.22.129:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-9)
Feb  2 19:05:28 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:28,953] INFO Sending POST to:http://10.92.22.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:05:29 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:29,053] INFO Sending POST to:http://10.92.32.182:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-13)
Feb  2 19:05:29 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:29,197] INFO Sending POST to:http://10.92.22.184:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:05:29 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:29,282] INFO Sending POST to:http://10.92.42.48:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:05:29 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:29,363] INFO Sending POST to:http://10.92.42.47:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-11)
Feb  2 19:05:29 mesos-master-i-6a51a0ad marathon[6680]: [2016-02-02 19:05:29,439] INFO Sending POST to:http://10.92.32.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-12)
scalp42 commented 8 years ago

Marathon 0.15.0 log when updating that same app no changes:

root@mesos-master-i-635669ba [gz-prod] ~ # tail -f /var/log/marathon.log | grep -v GET | grep -v Healthy

Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:07,811] INFO Upgrade group id:/prod/talkk version:2016-02-02T19:10:07.810Z with force:false (mesosphere.marathon.state.GroupManager:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:07,840] INFO [/prod/talkk/audit-green]: restart detected for app (oldVersion FullVersionInfo(2016-02-02T18:50:53.002Z,2016-02-02T18:50:53.002Z,2016-01-26T21:57:45.288Z)) (mesosphere.marathon.upgrade.GroupVersioningUtil$:ForkJoinPool-2-worker-7)
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:07,893] INFO Computed new deployment plan:
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: DeploymentPlan 2016-02-02T19:10:07.810Z
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: step 1:
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]:   * Restart(App(/prod/talkk/audit-green, image="java:8", cmd="JAVA_OPTS="-javaagent:${MESOS_SANDBOX}/talkk-audit-1.0.1/lib/org.aspectj.aspectjweaver-1.8.7.jar -Xmx1g -Dcom.sun.management.jmxremote.rmi.port=8081 -Dcom.sun.management.jmxremote -Djava.rmi.server.hostname=$HOST -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8081 -Dcom.sun.management.jmxremote.authenticate=false" bash ${MESOS_SANDBOX}/talkk-audit-1.0.1/bin/talkk-audit")))
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]:  (mesosphere.marathon.state.GroupManager:ForkJoinPool-2-worker-7)
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:07,894] INFO Deploy plan with force=false:
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: DeploymentPlan 2016-02-02T19:10:07.810Z
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: step 1:
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]:   * Restart(App(/prod/talkk/audit-green, image="java:8", cmd="JAVA_OPTS="-javaagent:${MESOS_SANDBOX}/talkk-audit-1.0.1/lib/org.aspectj.aspectjweaver-1.8.7.jar -Xmx1g -Dcom.sun.management.jmxremote.rmi.port=8081 -Dcom.sun.management.jmxremote -Djava.rmi.server.hostname=$HOST -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8081 -Dcom.sun.management.jmxremote.authenticate=false" bash ${MESOS_SANDBOX}/talkk-audit-1.0.1/bin/talkk-audit")))
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]:   (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$9e356537:ForkJoinPool-2-worker-7)
Feb  2 19:10:07 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:07,900] INFO [/prod/talkk/audit-green] storing new app version 2016-02-02T19:10:07.810Z (mesosphere.marathon.state.GroupManager:ForkJoinPool-2-worker-49)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,215] INFO Updated groups/apps according to deployment plan b8f913e7-0713-4653-8cbb-fe66f9fbff3e (mesosphere.marathon.state.GroupManager:ForkJoinPool-2-worker-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,215] INFO Deployment acknowledged. Waiting to get processed:
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: DeploymentPlan 2016-02-02T19:10:07.810Z
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: step 1:
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]:   * Restart(App(/prod/talkk/audit-green, image="java:8", cmd="JAVA_OPTS="-javaagent:${MESOS_SANDBOX}/talkk-audit-1.0.1/lib/org.aspectj.aspectjweaver-1.8.7.jar -Xmx1g -Dcom.sun.management.jmxremote.rmi.port=8081 -Dcom.sun.management.jmxremote -Djava.rmi.server.hostname=$HOST -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8081 -Dcom.sun.management.jmxremote.authenticate=false" bash ${MESOS_SANDBOX}/talkk-audit-1.0.1/bin/talkk-audit")))
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]:  (mesosphere.marathon.state.GroupManager:ForkJoinPool-2-worker-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,216] INFO POSTing to all endpoints. (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,217] INFO Sending POST to:http://10.92.22.129:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,219] INFO 10.92.42.156 - - [02/Feb/2016:19:10:07 +0000] "PUT //marathon.example.com:8080/v2/apps/prod/talkk/audit-green HTTP/1.1" 200 92 "-" "curl/7.43.0" 47 (mesosphere.chaos.http.ChaosRequestLog$$EnhancerByGuice$$ef716d44:qtp1361931872-31)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,220] INFO Sending POST to:http://10.92.22.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.32.182:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.22.184:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.42.48:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.42.47:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.32.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO POSTing to all endpoints. (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.22.129:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.22.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.32.182:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,221] INFO Sending POST to:http://10.92.22.184:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,222] INFO Sending POST to:http://10.92.42.48:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,227] INFO Sending POST to:http://10.92.42.47:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,230] INFO Sending POST to:http://10.92.32.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,317] INFO POSTing to all endpoints. (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,317] INFO Adding health check for app [/prod/talkk/audit-green] and version [2016-02-02T19:10:07.810Z]: [HealthCheck(Some(/ops/ping),HTTP,Some(0),None,30 seconds,2 seconds,5 seconds,3,false,None)] (mesosphere.marathon.health.MarathonHealthCheckManager$$EnhancerByGuice$$f9adff8e:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,318] INFO Sending POST to:http://10.92.22.129:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,319] INFO Starting health check actor for app [/prod/talkk/audit-green] version [2016-02-02T19:10:07.810Z] and healthCheck [HealthCheck(Some(/ops/ping),HTTP,Some(0),None,30 seconds,2 seconds,5 seconds,3,false,None)] (mesosphere.marathon.health.HealthCheckActor:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,319] INFO POSTing to all endpoints. (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,326] INFO Sending POST to:http://10.92.22.129:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,326] INFO Sending POST to:http://10.92.22.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,326] INFO Sending POST to:http://10.92.32.182:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,326] INFO Sending POST to:http://10.92.22.184:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,326] INFO Sending POST to:http://10.92.42.48:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,327] INFO Sending POST to:http://10.92.42.47:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,327] INFO Sending POST to:http://10.92.32.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,337] INFO For minimumHealthCapacity 1.0 of /prod/talkk/audit-green leave 2 tasks running, maximum capacity 3, killing 0 tasks immediately (mesosphere.marathon.upgrade.TaskReplaceActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,337] INFO Reconciling tasks during app /prod/talkk/audit-green restart: queuing 1 new tasks (mesosphere.marathon.upgrade.TaskReplaceActor:marathon-akka.actor.default-dispatcher-7)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,341] INFO Started appTaskLaunchActor for /prod/talkk/audit-green version 2016-02-02T19:10:07.810Z with initial count 1 (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,348] INFO activating matcher ActorOfferMatcher(Actor[akka://marathon/user/launchQueue/1/0-prod_talkk_audit-green#-1653409606]). (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,350] INFO Received offers WANTED notification (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,350] INFO => revive offers NOW, canceling any scheduled revives (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,351] INFO 2 further revives still needed. Repeating reviveOffers according to --revive_offers_repetitions 3 (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,351] INFO => Schedule next revive at 2016-02-02T19:10:13.349Z in 4999 milliseconds, adhering to --min_revive_offers_interval 5000 (ms) (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,351] INFO Resetting the backoff delay before restarting the app (mesosphere.marathon.upgrade.TaskReplaceActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,405] INFO Offer [0c3e5f88-f832-4b88-9029-2da8b834265f-O1082147]. Constraints for app [/prod/talkk/audit-green] not satisfied.
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: The conflicting constraints are: [field: "az"
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: operator: GROUP_BY
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: value: "3"
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: ] (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,424] INFO Offer [0c3e5f88-f832-4b88-9029-2da8b834265f-O1082147]. Insufficient resources for [/prod/talkk/audit-green] (need cpus=0.1, mem=1536.0, disk=0.0, ports=(2 dynamic), available in offer: [id { value: "0c3e5f88-f832-4b88-9029-2da8b834265f-O1082147" } framework_id { value: "8ace1cd7-5a79-40f6-99cd-62c87ce2ef49-0001" } slave_id { value: "8ace1cd7-5a79-40f6-99cd-62c87ce2ef49-S1" } hostname: "mesos-slave-i-f4c1982d.example.com" resources { name: "cpu" type: SCALAR scalar { value: 2.0 } role: "*" } resources { name: "mem" type: SCALAR scalar { value: 7983.0 } role: "*" } resources { name: "ports" type: RANGES ranges { range { begin: 31000 end: 32000 } } role: "*" } resources { name: "cpus" type: SCALAR scalar { value: 2.0 } role: "*" } resources { name: "disk" type: SCALAR scalar { value: 498718.0 } role: "*" } attributes { name: "az" type: TEXT text { value: "us-west-2a" } } url { scheme: "http" address { hostname: "mesos-slave-i-f4c1982d.example
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: .com" ip: "10.92.22.129" port: 5051 } path: "/slave(1)" }] (mesosphere.mesos.TaskBuilder:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,439] INFO Finished processing 0c3e5f88-f832-4b88-9029-2da8b834265f-O1082147. Matched 0 tasks after 1 passes. cpu 2.0; mem 7983.0; ports 31000->32000; cpus 2.0; disk 498718.0 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-4)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,480] INFO Sending POST to:http://10.92.22.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-2)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,489] INFO No tasks left to launch. Stop receiving offers for /prod/talkk/audit-green, 2016-02-02T19:10:07.810Z (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-2)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,490] INFO removing matcher ActorOfferMatcher(Actor[akka://marathon/user/launchQueue/1/0-prod_talkk_audit-green#-1653409606]) (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-2)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,490] INFO Request to launch task with id 'prod_talkk_audit-green.933799f1-c9e0-11e5-b477-0242ba9822b9', version '2016-02-02T19:10:07.810Z'. 0 tasksToLaunch, 1 in flight, 2 confirmed. instance count delta 1. not backing off (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-2)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,490] INFO Received offers NOT WANTED notification, canceling 2 revives (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-2)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,494] INFO Finished processing 0c3e5f88-f832-4b88-9029-2da8b834265f-O1082149. Matched 0 tasks after 1 passes. cpu 2.0; mem 303.0; ports 31000->31165,31167->31200,31202->31227,31230->31509,31511->31950,31952->32000; cpus 1.5; disk 498718.0 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,495] INFO Finished processing 0c3e5f88-f832-4b88-9029-2da8b834265f-O1082150. Matched 0 tasks after 1 passes. cpu 2.0; mem 3375.0; ports 31000->31691,31693->31741,31743->31775,31777->32000; cpus 1.7000000000000002; disk 498718.0 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,496] INFO Finished processing 0c3e5f88-f832-4b88-9029-2da8b834265f-O1082151. Matched 0 tasks after 1 passes. cpu 2.0; mem 1711.0; ports 31000->31059,31061->31394,31397->31435,31437->31613,31615->31996,31998->32000; cpus 1.4999999999999996; disk 498718.0 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,496] INFO Finished processing 0c3e5f88-f832-4b88-9029-2da8b834265f-O1082152. Matched 0 tasks after 1 passes. cpu 2.0; mem 3375.0; ports 31000->31272,31274->31313,31315->31384,31386->32000; cpus 1.6999999999999997; disk 498718.0 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,497] INFO Finished processing 0c3e5f88-f832-4b88-9029-2da8b834265f-O1082148. Matched 1 tasks after 2 passes. cpu 2.0; mem 271.0; ports 31000->31003,31005->31014,31016->31456,31458->31493,31496->31841,31843->31999; cpus 1.4; disk 498718.0 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,497] INFO Save task [prod_talkk_audit-green.933799f1-c9e0-11e5-b477-0242ba9822b9] (mesosphere.marathon.core.launcher.impl.OfferProcessorImpl:ForkJoinPool-2-worker-11)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,510] INFO Task launch for 'prod_talkk_audit-green.933799f1-c9e0-11e5-b477-0242ba9822b9' was accepted. 0 tasksToLaunch, 0 in flight, 3 confirmed. instance count delta 1. not backing off (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-8)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,538] INFO Sending POST to:http://10.92.32.182:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-5)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,584] INFO Sending POST to:http://10.92.22.184:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,636] INFO Sending POST to:http://10.92.42.48:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,678] INFO Sending POST to:http://10.92.42.47:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,712] INFO Sending POST to:http://10.92.32.183:8888 (mesosphere.marathon.event.http.HttpEventActor:marathon-akka.actor.default-dispatcher-10)
Feb  2 19:10:11 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:11,709] INFO Received status update for task prod_talkk_audit-green.933799f1-c9e0-11e5-b477-0242ba9822b9: TASK_RUNNING () (mesosphere.marathon.MarathonScheduler$$EnhancerByGuice$$c6c05dfe:Thread-78)
meichstedt commented 8 years ago

Hey @scalp42 thanks for reporting this and the provided logs!

We introduced some additional fields in the AppConfig which probably lead to the unintentional non-equality.

scalp42 commented 8 years ago

This part is also weird because nothing has changed.

Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: The conflicting constraints are: [field: "az"
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: operator: GROUP_BY
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: value: "3"
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: ]
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: ] (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-6)
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: [2016-02-02 19:10:08,424] INFO Offer [0c3e5f88-f832-4b88-9029-2da8b834265f-O1082147]. Insufficient resources for [/prod/talkk/audit-green] (need cpus=0.1, mem=1536.0, disk=0.0, ports=(2 dynamic), available in offer: [id { value: "0c3e5f88-f832-4b88-9029-2da8b834265f-O1082147" } framework_id { value: "8ace1cd7-5a79-40f6-99cd-62c87ce2ef49-0001" } slave_id { value: "8ace1cd7-5a79-40f6-99cd-62c87ce2ef49-S1" } hostname: "mesos-slave-i-f4c1982d.example.com" resources { name: "cpu" type: SCALAR scalar { value: 2.0 } role: "*" } resources { name: "mem" type: SCALAR scalar { value: 7983.0 } role: "*" } resources { name: "ports" type: RANGES ranges { range { begin: 31000 end: 32000 } } role: "*" } resources { name: "cpus" type: SCALAR scalar { value: 2.0 } role: "*" } resources { name: "disk" type: SCALAR scalar { value: 498718.0 } role: "*" } attributes { name: "az" type: TEXT text { value: "us-west-2a" } } url { scheme: "http" address { hostname: "mesos-slave-i-f4c1982d.example
Feb  2 19:10:08 mesos-master-i-635669ba marathon[24385]: .com" ip: "10.92.22.129" port: 5051 } path: "/slave(1)" }] (mesosphere.mesos.TaskBuilder:marathon-akka.actor.default-dispatcher-6)
deric commented 8 years ago

Same issue here, I've upgraded Mesos to 0.27 first, then Marathon. After few restarts of Marathon leader, there were more instances than needed for one app, all others were stuck in deploy phase. Marathon wasn't able to kill the old one. I've stopped all Marathons, deleted rmr /marathon in ZooKeeper. Resubmitted all apps, no deploy happened. Reverted to 0.14.1, all apps were swiftly deployed. OS: Debian Jessie

kolloch commented 8 years ago

I found the problem in ee569cb252b8fad9cd7aa77d5275d3e7532e1a77.

The version is set to NoVersion if the AppUpdate does not contain a version. Instead it needs to remain unchanged.

  def apply(app: AppDefinition): AppDefinition = app.copy(
    id = app.id,
    cmd = cmd.orElse(app.cmd),
    args = args.orElse(app.args),
    user = user.orElse(app.user),
    env = env.getOrElse(app.env),
    instances = instances.getOrElse(app.instances),
    cpus = cpus.getOrElse(app.cpus),
    mem = mem.getOrElse(app.mem),
    disk = disk.getOrElse(app.disk),
    executor = executor.getOrElse(app.executor),
    constraints = constraints.getOrElse(app.constraints),
    fetch = fetch.getOrElse(app.fetch),
    storeUrls = storeUrls.getOrElse(app.storeUrls),
    ports = ports.getOrElse(app.ports),
    requirePorts = requirePorts.getOrElse(app.requirePorts),
    backoff = backoff.getOrElse(app.backoff),
    backoffFactor = backoffFactor.getOrElse(app.backoffFactor),
    maxLaunchDelay = maxLaunchDelay.getOrElse(app.maxLaunchDelay),
    container = container.filterNot(_ == Container.Empty).orElse(app.container),
    healthChecks = healthChecks.getOrElse(app.healthChecks),
    dependencies = dependencies.map(_.map(_.canonicalPath(app.id))).getOrElse(app.dependencies),
    upgradeStrategy = upgradeStrategy.getOrElse(app.upgradeStrategy),
    labels = labels.getOrElse(app.labels),
    acceptedResourceRoles = acceptedResourceRoles.orElse(app.acceptedResourceRoles),
    ipAddress = ipAddress.orElse(app.ipAddress),
    versionInfo = version.map(OnlyVersion).getOrElse(NoVersion)
  )

I will write tests and a fix now.

meichstedt commented 8 years ago

… cherry-picked onto releases/0.15