mesosphere / marathon

Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
https://mesosphere.github.io/marathon/
Apache License 2.0

error: The upgrade has been cancelled #3747

Closed shijinkui closed 8 years ago

shijinkui commented 8 years ago

[2016-04-12 21:13:23,713] ERROR Deployment of / failed (mesosphere.marathon.MarathonSchedulerActor:marathon-akka.actor.default-dispatcher-39)
mesosphere.marathon.DeploymentCanceledException: The upgrade has been cancelled
    at mesosphere.marathon.upgrade.DeploymentManager$$anonfun$receive$1$$anonfun$3.apply(DeploymentManager.scala:53) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at mesosphere.marathon.upgrade.DeploymentManager$$anonfun$receive$1$$anonfun$3.apply(DeploymentManager.scala:52) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.collection.immutable.List.foreach(List.scala:381) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.collection.immutable.List.map(List.scala:285) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at mesosphere.marathon.upgrade.DeploymentManager$$anonfun$receive$1.applyOrElse(DeploymentManager.scala:52) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at mesosphere.marathon.upgrade.DeploymentManager.aroundReceive(DeploymentManager.scala:23) ~[marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at akka.actor.ActorCell.invoke(ActorCell.scala:487) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at akka.dispatch.Mailbox.run(Mailbox.scala:221) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [marathon-assembly-1.1.0-RC1.jar:1.1.0-RC1]
shijinkui commented 8 years ago

I can't roll back the deployment, and when I create a new application it remains in the waiting state.
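
For readers hitting the same stuck deployment, here is a minimal sketch of how the running deployments could be listed and one of them cancelled through Marathon's `/v2/deployments` REST endpoint. The address `marathon.example:8080` and the helper names are placeholders, not taken from this thread. A `DELETE` without `force` asks Marathon to roll back, while `force=true` cancels without a rollback. The sketch only builds the requests and sends nothing.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: Marathon's HTTP API is reachable at this placeholder address.
MARATHON_URL = "http://marathon.example:8080"


def list_deployments_request(base_url=MARATHON_URL):
    """Build a GET request for the list of running deployments."""
    return Request(f"{base_url}/v2/deployments", method="GET")


def cancel_deployment_request(deployment_id, force=False, base_url=MARATHON_URL):
    """Build a DELETE request that cancels one deployment.

    With force=False Marathon is asked to roll the deployment back;
    with force=True the deployment is cancelled without a rollback.
    """
    url = f"{base_url}/v2/deployments/{quote(deployment_id, safe='')}"
    if force:
        url += "?" + urlencode({"force": "true"})
    return Request(url, method="DELETE")
```

Either request can be passed to `urllib.request.urlopen` once the placeholder address is replaced with a real Marathon endpoint.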

pierluigi commented 8 years ago

Hi @shijinkui

can you please specify the versions of Marathon and Mesos you're trying to run this on? Also, the JSON for the application definition would help.

shijinkui commented 8 years ago

{
  "id": "/etl/db.abc",
  "cmd": "$SPARK_HOME/bin/spark-submit --name db.xxx --class ETL --master mesos://xxx:5050 --deploy-mode client --executor-memory 8g --total-executor-cores 1 streaming-etl-assembly-2.11.7-1.0.jar",
  "cpus": 1,
  "mem": 3072,
  "disk": 3072,
  "instances": 1,
  "env": {
    "JAVA_HOME": "/data/program/java",
    "SPARK_HOME": "/data/program/spark",
    "HADOOP_HOME": "/data/program/hdfs",
    "HADOOP_CONF_DIR": "/data/program/hdfs/etc/hadoop"
  },
  "portDefinitions": [
    {
      "port": 10002,
      "protocol": "tcp",
      "labels": {}
    }
  ],
  "uris": [
    "http://jar.xxx.xxx.info/streaming-etl-assembly-2.11.7-1.0.jar"
  ],
  "fetch": [
    {
      "uri": "http://jar.xxx.xxx.info/streaming-etl-assembly-2.11.7-1.0.jar",
      "extract": true,
      "executable": false,
      "cache": false
    }
  ]
}
shijinkui commented 8 years ago

After I upgraded to v1.1.1, the problem still exists.


[2016-04-19 11:33:29,419] ERROR Deployment of / failed (mesosphere.marathon.MarathonSchedulerActor:marathon-akka.actor.default-dispatcher-11)
mesosphere.marathon.DeploymentCanceledException: The upgrade has been cancelled
    at mesosphere.marathon.upgrade.DeploymentManager$$anonfun$receive$1$$anonfun$3.apply(DeploymentManager.scala:53) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at mesosphere.marathon.upgrade.DeploymentManager$$anonfun$receive$1$$anonfun$3.apply(DeploymentManager.scala:52) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at scala.collection.immutable.List.foreach(List.scala:381) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at scala.collection.immutable.List.map(List.scala:285) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at mesosphere.marathon.upgrade.DeploymentManager$$anonfun$receive$1.applyOrElse(DeploymentManager.scala:52) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at mesosphere.marathon.upgrade.DeploymentManager.aroundReceive(DeploymentManager.scala:23) ~[marathon-assembly-1.1.1.jar:1.1.1]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [marathon-assembly-1.1.1.jar:1.1.1]
    at akka.actor.ActorCell.invoke(ActorCell.scala:487) [marathon-assembly-1.1.1.jar:1.1.1]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) [marathon-assembly-1.1.1.jar:1.1.1]
    at akka.dispatch.Mailbox.run(Mailbox.scala:221) [marathon-assembly-1.1.1.jar:1.1.1]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231) [marathon-assembly-1.1.1.jar:1.1.1]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [marathon-assembly-1.1.1.jar:1.1.1]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [marathon-assembly-1.1.1.jar:1.1.1]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [marathon-assembly-1.1.1.jar:1.1.1]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [marathon-assembly-1.1.1.jar:1.1.1]
meichstedt commented 8 years ago

Just a side note: you can remove the "uris" node – uris have been deprecated in favor of using the Mesos fetcher cache as you do with "fetch". I don't think that's the problem though.
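
As a rough illustration of that side note, the sketch below drops the `"uris"` key from an app definition when every listed URI is already covered by a `"fetch"` entry. The helper name `drop_redundant_uris` and the shortened example definition are hypothetical. Only the `"uris"` and `"fetch"` field names come from the JSON posted above.

```python
import copy


def drop_redundant_uris(app):
    """Return a copy of the app definition without "uris"
    when every URI is already listed under "fetch"."""
    result = copy.deepcopy(app)
    fetched = {entry["uri"] for entry in result.get("fetch", [])}
    if "uris" in result and set(result["uris"]) <= fetched:
        del result["uris"]
    return result


# Shortened, hypothetical version of the app definition from this thread.
app = {
    "id": "/etl/db.abc",
    "uris": ["http://jar.xxx.xxx.info/streaming-etl-assembly-2.11.7-1.0.jar"],
    "fetch": [
        {
            "uri": "http://jar.xxx.xxx.info/streaming-etl-assembly-2.11.7-1.0.jar",
            "extract": True,
            "executable": False,
            "cache": False,
        }
    ],
}
cleaned = drop_redundant_uris(app)
```

The original dictionary is left untouched, so the cleaned copy can be compared against it before being submitted.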

shijinkui commented 8 years ago

@meichstedt Thanks.
I prefer to use an HTTP service for the jar download.

shijinkui commented 8 years ago

@pierlo-upitup @meichstedt It always stays waiting. How can Marathon be debugged locally?


shijinkui commented 8 years ago

When I restarted the Mesos master, Marathon's deployment could proceed. It's very strange. Closing this issue.