mesos / chronos

Fault tolerant job scheduler for Mesos which handles dependencies and ISO8601 based schedules
http://mesos.github.io/chronos/
Apache License 2.0

arguments field of chronos job is broken #858

Open setekhid opened 7 years ago

setekhid commented 7 years ago

This is still related to issue #856. I have no clue how to debug it, so here is the job JSON:

{  
   "arguments":[  
      "--oplogger",
      "aHR0cDovL29wbG9nZ2VyOjMwODUvYXBpL2xvZ2dpbmcvOThiODM5NjBmMjllN2JhN2M1MjhkMjEyMGJjYTU5YTMtMTUwNjMxMDI1NTc3MjA3OTkwMg==",
      "--offset",
      "2",
      "--",
      "L29wdC93b2xmcGFjaw==",
      "LS1wcm9qZWN0",
      "aHR0cDovL2dpdHRhcjo1NTY2L2FwaS9naXQvZmppZW9hZmVh",
      "LS12ZXJzaW9u",
      "ZmFlZmE=",
      "LS1kaXJlY3Rvcnk=",
      "Li8=",
      "LS0="
   ],
   "async":true,
   "command":"/opt/opcmd",
   "container":{  
      "forcePullImage":false,
      "image":"setekhid/dumpexec:latest",
      "network":"BRIDGE",
      "parameters":[  
         {  
            "key":"privileged",
            "value":"true"
         }
      ],
      "type":"DOCKER"
   },
   "cpus":0.1,
   "disabled":false,
   "disk":256,
   "epsilon":"PT12H",
   "mem":128,
   "name":"98b83960f29e7ba7c528d2120bca59a3-1506310255772079902",
   "retries":2,
   "schedule":"R1/2017-09-25T03:30:55Z/PT12H",
   "scheduleTimeZone":"UTC",
   "shell":false
}

I submit this to the iso8601 API URL, and the result is a job that stays queued forever. If I remove the arguments field, everything seems fine.
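The values after the `--` separator in the arguments array look like standard base64 strings. A minimal sketch for inspecting them, assuming they are UTF-8 text encoded with the standard base64 alphabet (the helper name `decode_arg` is only for illustration):

```python
import base64

# A few values copied from the "arguments" array in the job JSON above.
encoded_args = ["L29wdC93b2xmcGFjaw==", "LS1wcm9qZWN0", "Li8=", "LS0="]


def decode_arg(value: str) -> str:
    """Decode one base64 argument back to text (assumes UTF-8 content)."""
    return base64.b64decode(value).decode("utf-8")


decoded_args = [decode_arg(v) for v in encoded_args]
print(decoded_args)
```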

Some curious Chronos log output is below:

[2017-09-25 03:30:57,319] INFO Launching tasks from offer: id {
    value: "378c0771-4795-443b-9baa-67bf8ab1200a-O443"
}
framework_id {
    value: "378c0771-4795-443b-9baa-67bf8ab1200a-0000"
}
slave_id {
    value: "378c0771-4795-443b-9baa-67bf8ab1200a-S0"
}

...

        variables {
            name: "MESOS_TASK_ID"
            value:"ct:1506310255000:0:98b83960f29e7ba7c528d2120bca59a3-1506310255772079902:--oplogger aHR0cDovL29wbG9nZ2VyOjMwODUvYXBpL2xvZ2dpbmcvOThiODM5NjBmMjllN2JhN2M1MjhkMjEyMGJjYTU5YTMtMTUwNjMxMDI1NTc3MjA3OTkwMg== --offset 2 -- L29wdC93b2xmcGFjaw== LS1wcm9qZWN0 aHR0cDovL2dpdHRhcjo1NTY2L2FwaS9naXQvZmppZW9hZmVh LS12ZXJzaW9u ZmFlZmE= LS1kaXJlY3Rvcnk= Li8= LS0="
        }

...

The task ID looks a little odd. While debugging issue #856, I found that the ID is built by plain string concatenation in TaskUtils.scala on branch releasing-2.5, and is parsed back directly here. In my opinion, any meaningful string used as an ID, or as part of one, should be base64-encoded or escaped in some other way. Is there a specific reason for doing it this way?
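To illustrate the suggestion, here is a rough sketch in Python (not Chronos code). The `ct:<number>:<number>:<job name>:<arguments>` layout is inferred from the log line above; the field names `due_ms` and `attempt`, and the functions `build_task_id` and `parse_task_id`, are hypothetical. The free-form segments are encoded with URL-safe base64, whose alphabet contains neither `:` nor spaces, so splitting on the separator stays unambiguous whatever the job name or arguments contain.

```python
import base64
import json
from typing import List, Tuple

SEP = ":"  # separator seen in the logged task ID


def encode_segment(text: str) -> str:
    """URL-safe base64: output uses only A-Z, a-z, 0-9, '-', '_' and '='."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")


def decode_segment(segment: str) -> str:
    return base64.urlsafe_b64decode(segment.encode("ascii")).decode("utf-8")


def build_task_id(due_ms: int, attempt: int, job_name: str,
                  arguments: List[str]) -> str:
    """Join the fields with SEP, encoding the free-form ones first."""
    # JSON keeps the argument boundaries even if an argument contains spaces.
    args_text = json.dumps(arguments)
    return SEP.join([
        "ct",
        str(due_ms),
        str(attempt),
        encode_segment(job_name),
        encode_segment(args_text),
    ])


def parse_task_id(task_id: str) -> Tuple[int, int, str, List[str]]:
    """Inverse of build_task_id; raises ValueError on a malformed ID."""
    prefix, due_ms, attempt, name_b64, args_b64 = task_id.split(SEP)
    if prefix != "ct":
        raise ValueError("unexpected task ID prefix: " + prefix)
    return (
        int(due_ms),
        int(attempt),
        decode_segment(name_b64),
        json.loads(decode_segment(args_b64)),
    )


task_id = build_task_id(1506310255000, 0, "job:with:colons",
                        ["--url", "http://host:3085/a b"])
print(task_id)
print(parse_task_id(task_id))
```

The example job name and arguments deliberately contain colons and a space, which would confuse a parser that splits the raw concatenated string.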