mesos / chronos

Fault tolerant job scheduler for Mesos which handles dependencies and ISO8601 based schedules
http://mesos.github.io/chronos/
Apache License 2.0

Chronos support for dynamic reservations #843

Open srikanth-viswanathan opened 7 years ago

srikanth-viswanathan commented 7 years ago

Hi folks,

It does not appear that Chronos supports dynamic reservations correctly. We have some dynamic reservations set up on Mesos, and Chronos is unable to launch tasks using these.

The reservations are for (role=storage.nfs,principal=marathon.core). We tried running Chronos in two ways:

In both cases, Chronos receives offers, but when it launches the task, Mesos rejects it with status TASK_ERROR, like so:

mesos-master[52789]: I0701 03:40:55.778859 52851 master.cpp:5194] Sending status update TASK_ERROR for task ct:1498880452462:0:dev_shared-nj1_hello-chronos: of framework dacc0603-d48c-4705-89b7-62ef08e4f2f1-0008 'Task uses more resources cpus(storage.nfs):0.42; mem(storage.nfs):420; disk(*):5000 than available mem(storage.nfs, marathon.core):60320; cpus(storage.nfs, marathon.core):6; ports(storage.nfs, marathon.core):[32100-32740, 32742-33000]; cpus(*):7.2; mem(*):125764; disk(*):337858; disk(*)[]:675867; disk(*)[]:675867; disk(*)[]:675867; disk(*)[]:675867; disk(*)[]:675867; ports(*):[31000-31010, 31012-31144, 31146-31196, 31198-31214, 31216-31257, 31259-31311, 31313-31365, 31367-31411, 31413-31452, 31454-31522, 31524-31549, 31551-31674, 31676-31676, 31680-31842, 31844-31958, 31960-31971, 31973-32004, 32011-32056, 32058-32095, 32097-32099]'

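From the looks of it, Chronos is not passing along the reservation principal: the task asks for cpus(storage.nfs) and mem(storage.nfs), while the offer lists them as cpus(storage.nfs, marathon.core) and mem(storage.nfs, marathon.core). That mismatch appears to be what causes Mesos to reject the launch operation.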
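
To illustrate the suspected mismatch, here is a minimal Python sketch of the kind of check the master seems to be applying. It is a simplified model, not Mesos or Chronos code: the `Resource` class and the `fits` and `with_offer_reservation` helpers are hypothetical names made up for this example, and the values are taken from the log line above. The assumption is that a task resource only counts against an offered resource when name, role, and reservation principal all match.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    """Simplified stand-in for a Mesos scalar resource (hypothetical model)."""
    name: str
    value: float
    role: str = "*"
    principal: Optional[str] = None  # set only on dynamically reserved resources


def fits(task: List[Resource], offer: List[Resource]) -> bool:
    """True if every task resource is covered by offered resources that have
    the same name, role, and reservation principal."""
    for need in task:
        available = sum(
            r.value
            for r in offer
            if (r.name, r.role, r.principal) == (need.name, need.role, need.principal)
        )
        if need.value > available:
            return False
    return True


def with_offer_reservation(task: List[Resource], offer: List[Resource]) -> List[Resource]:
    """Copy the principal from a matching offered resource onto each task resource."""
    fixed = []
    for need in task:
        match = next(
            (r for r in offer
             if r.name == need.name and r.role == need.role and r.value >= need.value),
            None,
        )
        fixed.append(
            Resource(need.name, need.value, need.role, match.principal) if match else need
        )
    return fixed


# Subset of the offer from the log: reserved and unreserved resources.
offer = [
    Resource("cpus", 6, "storage.nfs", "marathon.core"),
    Resource("mem", 60320, "storage.nfs", "marathon.core"),
    Resource("cpus", 7.2),
    Resource("mem", 125764),
    Resource("disk", 337858),
]

# What the task asked for according to the log: role set, principal missing.
task = [
    Resource("cpus", 0.42, "storage.nfs"),
    Resource("mem", 420, "storage.nfs"),
    Resource("disk", 5000),
]

rejected = fits(task, offer)                                 # principal missing
accepted = fits(with_offer_reservation(task, offer), offer)  # principal copied from offer
print(rejected, accepted)
```

Under this model the task as logged does not fit, while the same task with the reservation details copied from the offered resources does. If that is indeed how the master validates launches, the fix in Chronos would presumably be to carry the reservation information from the offer through to the task's resources.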

We are running Chronos 2.4 on Mesos 1.0.1.