mesos / chronos

Fault tolerant job scheduler for Mesos which handles dependencies and ISO8601 based schedules
http://mesos.github.io/chronos/
Apache License 2.0

Can't run docker container with bad format for volumes #250

Open cloudysunny14 opened 10 years ago

cloudysunny14 commented 10 years ago

I am trying to run the Mesos 0.20 Docker integration, but I get the following error:

Container '5e96940d-1d0b-471e-80ed-e24f4be920cd' for executor 'ct:1409450377876:0:test_task2' of framework '20140817-165256-4015564992-5050-3654-0000' failed to start: Failed to 'docker run -d -c 102 -m 104857600 -e mesos_task_id=ct:1409450377876:0:test_task2 -e CHRONOS_JOB_OWNER= -e MESOS_SANDBOX=/mnt/mesos/sandbox -v /mnt/mesos/sandbox:/var/mesos-slave/dockervolume:rw -v /tmp/mesos/slaves/20140831-094421-4015564992-5050-27663-0/frameworks/20140817-165256-4015564992-5050-3654-0000/executors/ct:1409450377876:0:test_task2/runs/5e96940d-1d0b-471e-80ed-e24f4be920cd:/mnt/mesos/sandbox --net host --name mesos-5e96940d-1d0b-471e-80ed-e24f4be920cd debian:stable /bin/sh -c sleep 30': exit status = exited with status 2 stderr = invalid value "/tmp/mesos/slaves/20140831-094421-4015564992-5050-27663-0/frameworks/20140817-165256-4015564992-5050-3654-0000/executors/ct:1409450377876:0:test_task2/runs/5e96940d-1d0b-471e-80ed-e24f4be920cd:/mnt/mesos/sandbox" for flag -v: bad format for volumes: /tmp/mesos/slaves/20140831-094421-4015564992-5050-27663-0/frameworks/20140817-165256-4015564992-5050-3654-0000/executors/ct:1409450377876:0:test_task2/runs/5e96940d-1d0b-471e-80ed-e24f4be920cd:/mnt/mesos/sandbox

I fixed TaskUtils.scala as follows:

val taskIdTemplate = "ct_%d_%d_%s"
val taskIdPattern = """ct_(\d+)_(\d+)_%s""".format(JobUtils.jobNamePattern).r

This appears to work fine. However, I don't know what else is affected by changing the TaskID format :(

Kiyonari Harigae
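A minimal sketch of how the underscore-based template and pattern from the TaskUtils.scala change round-trip. The `jobNamePattern` value here is a hypothetical stand-in for `JobUtils.jobNamePattern`, whose real definition is not shown in this thread, and `makeTaskId` / `parseTaskId` are illustrative helper names, not Chronos functions.

```scala
object TaskIdSketch {
  // Hypothetical stand-in for JobUtils.jobNamePattern (assumption, not the Chronos source).
  val jobNamePattern = """([\w.-]+)"""

  // Template and pattern as proposed in the comment above.
  val taskIdTemplate = "ct_%d_%d_%s"
  val taskIdPattern = """ct_(\d+)_(\d+)_%s""".format(jobNamePattern).r

  // Build a task ID from its three fields.
  def makeTaskId(due: Long, attempt: Int, jobName: String): String =
    taskIdTemplate.format(due, attempt, jobName)

  // Parse a task ID back into its fields; None if it does not match the new format.
  def parseTaskId(taskId: String): Option[(Long, Int, String)] = taskId match {
    case taskIdPattern(due, attempt, jobName) => Some((due.toLong, attempt.toInt, jobName))
    case _ => None
  }
}
```

Under these assumptions, `TaskIdSketch.parseTaskId("ct_1409450377876_0_test_task2")` yields the three fields even though the job name itself contains an underscore, because the first two groups accept digits only. An ID in the old colon format, such as `ct:1409450377876:0:test_task2`, returns `None`, so anything that still holds old-style IDs is one place the format change could have an effect.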

chengweiv5 commented 10 years ago

This seems to be a Docker limitation: it doesn't support a colon in a volume path, even though a colon is a valid character in a file name. So if you have a local directory like /:dir, you cannot mount it into your Docker container with "-v /:dir:/mnt/dir"; it fails because Docker uses the colon as the separator character.
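To illustrate why a colon in the host path is ambiguous, here is a simplified model of a `host:container[:mode]` volume spec. It is a sketch only and not Docker's actual parser: `parseVolume`, the default mode of `rw`, and the rule of rejecting anything with more than two colons are assumptions chosen to match the error message in the report.

```scala
object VolumeSpecSketch {
  // Simplified model: ':' separates host path, container path and optional mode.
  // This is NOT Docker's real implementation, just an illustration of the ambiguity.
  def parseVolume(spec: String): Either[String, (String, String, String)] =
    spec.split(":", -1) match {
      case Array(host, container)       => Right((host, container, "rw"))
      case Array(host, container, mode) => Right((host, container, mode))
      case _                            => Left(s"bad format for volumes: $spec")
    }
}
```

In this model, a sandbox path that embeds a task ID like `ct:1409450377876:0:test_task2` produces too many fields and is rejected, while `/:dir:/mnt/dir` splits into `/`, `dir` and `/mnt/dir`, so the host directory is misread rather than mounted.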

elingg commented 10 years ago

This is a bug in Docker. A workaround will be done in Mesos to avoid using the host directory, by means of a symlink and mapping. I have opened a JIRA issue for this: https://issues.apache.org/jira/browse/MESOS-1833

rmrf commented 10 years ago

Also hit this bug.

tyrannasaurusbanks commented 10 years ago

:+1:

andrioni commented 9 years ago

Fixed in Mesos 0.21.0. However, a similar issue can still arise with non-Docker containers if the job sets $PATH or $CLASSPATH based on its directory, because colons act as entry separators in those environment variables. (Spark does this, which is how I got to this issue.)
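A small sketch of the $PATH / $CLASSPATH problem on a Unix-like system, where `:` separates entries. The `entries` helper and the shortened sandbox path are hypothetical; the point is only that a directory containing a colon-style task ID is cut into several bogus entries.

```scala
object PathSplitSketch {
  // Unix-style search paths use ':' between entries, so a ':' inside a
  // directory name cannot be told apart from a separator.
  def entries(searchPath: String): List[String] = searchPath.split(":").toList
}

// Hypothetical, shortened sandbox directory that embeds a colon-style task ID.
val sandbox = "/tmp/executors/ct:1:0:job/runs/x"
val classpath = s"$sandbox/lib:/usr/lib/jars"

// Intended: 2 entries. Actual: 5, none of which is the sandbox lib directory.
val split = PathSplitSketch.entries(classpath)
```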

vdebergue commented 9 years ago

Got the same problem when running Spark jobs from Chronos. I'll be using a custom build of Chronos with the fix from above.

Saurabh2004in commented 9 years ago

With a custom build (the exact change above), everything looks good, but the job is not submitted to Mesos. (I don't see any errors in the log.) Any clue?