mesosphere-backup / docker-containers

Dockerfiles and assets for building Docker containers

Issue when starting mesos-slave:1.0.11.0.1-2.0.93.ubuntu1404 without mapping to host's docker #68

Open MingChenSlot opened 8 years ago

MingChenSlot commented 8 years ago

I looked at the Dockerfile, and it seems that the latest mesos-slave image already has docker-engine installed at build time. So I tried to run it without mapping in the host's Docker,

docker run -d --net=host \
    --privileged \
    -e MESOS_PORT=5051 \
    -e MESOS_MASTER=zk://[zk_host]/mesos \
    -e MESOS_SWITCH_USER=0 \
    -e MESOS_CONTAINERIZERS=docker,mesos \
    -e MESOS_LOG_DIR=/var/log/mesos \
    -e MESOS_WORK_DIR=/var/tmp/mesos \
    -v /var/log/mesos:/var/log/mesos \
    -v "$(pwd)/tmp/mesos:/var/tmp/mesos" \
    --name slave mesosphere/mesos-slave:1.0.11.0.1-2.0.93.ubuntu1404

but I got the following error:

WARNING: Logging before InitGoogleLogging() is written to STDERR
I0922 21:23:24.697176     1 main.cpp:243] Build: 2016-08-26 23:00:07 by ubuntu
I0922 21:23:24.697485     1 main.cpp:244] Version: 1.0.1
I0922 21:23:24.697528     1 main.cpp:247] Git tag: 1.0.1
I0922 21:23:24.697577     1 main.cpp:251] Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
I0922 21:23:24.698982     1 logging.cpp:194] INFO level logging started!
SELinux:  Could not open policy file <= /etc/selinux/targeted/policy/policy.30:  No such file or directory
I0922 21:23:24.806118     1 containerizer.cpp:196] Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
I0922 21:23:24.812340     1 linux_launcher.cpp:101] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@726: Client environment:zookeeper.version=zookeeper C client 3.4.8
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@730: Client environment:host.name=iZuf659w5fbowj61aj370mZ
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@737: Client environment:os.name=Linux
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@738: Client environment:os.arch=4.3.6-coreos
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@739: Client environment:os.version=#2 SMP Tue May 3 21:48:31 UTC 2016
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@747: Client environment:user.name=(null)
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@755: Client environment:user.home=/root
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@log_env@767: Client environment:user.dir=/
2016-09-22 21:23:24,814:1(0x7f310209c700):ZOO_INFO@zookeeper_init@800: Initiating client connection, host=[zk_host]:2181 sessionTimeout=10000 watcher=0x7f310b5f16d0 sessionId=0 sessionPasswd=<null> context=0x7f30d8000930 flags=0
I0922 21:23:24.815431     1 main.cpp:434] Starting Mesos agent
I0922 21:23:24.816761    12 slave.cpp:198] Agent started on 1)@[slave_host]:5051
I0922 21:23:24.816844    12 slave.cpp:199] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://139.196.217.136:2181/mesos" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="false" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" 
--work_dir="/var/tmp/mesos"
I0922 21:23:24.817601    12 slave.cpp:519] Agent resources: cpus(*):1; mem(*):498; disk(*):32200; ports(*):[31000-32000]
I0922 21:23:24.817693    12 slave.cpp:527] Agent attributes: [  ]
I0922 21:23:24.817754    12 slave.cpp:532] Agent hostname: iZuf659w5fbowj61aj370mZ
I0922 21:23:24.820996    15 state.cpp:57] Recovering state from '/var/tmp/mesos/meta'
I0922 21:23:24.826390     9 status_update_manager.cpp:200] Recovering status update manager
I0922 21:23:24.826737    14 docker.cpp:775] Recovering Docker containers
I0922 21:23:24.826841    11 containerizer.cpp:522] Recovering containerizer
I0922 21:23:24.830144     9 provisioner.cpp:253] Provisioner recovery complete
2016-09-22 21:23:24,830:1(0x7f30fe05a700):ZOO_INFO@check_events@1728: initiated connection to server [139.196.217.136:2181]
2016-09-22 21:23:24,832:1(0x7f30fe05a700):ZOO_INFO@check_events@1775: session establishment complete on server [zk_host:2181], sessionId=0x157533ffd4b000d, negotiated timeout=10000
I0922 21:23:24.833214    16 group.cpp:349] Group process (group(1)@[slave_host]:5051) connected to ZooKeeper
I0922 21:23:24.833277    16 group.cpp:837] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0922 21:23:24.833339    16 group.cpp:427] Trying to create path '/mesos' in ZooKeeper
I0922 21:23:24.835343    16 detector.cpp:152] Detected a new leader: (id='1')
I0922 21:23:24.835597    16 group.cpp:706] Trying to get '/mesos/json.info_0000000001' in ZooKeeper
I0922 21:23:24.836369    16 zookeeper.cpp:259] A new leading master (UPID=master@[master_host]:5050) is detected
Failed to perform recovery: Collect failed: Failed to run 'docker -H unix:///var/run/docker.sock ps -a': exited with status 1; stderr='Cannot connect to the Docker daemon. Is the docker daemon running on this host?
'
To remedy this do as follows:
Step 1: rm -f /var/tmp/mesos/meta/slaves/latest
        This ensures agent doesn't recover old live executors.
Step 2: Restart the agent.

docker-engine is installed, but I still cannot connect to the Docker daemon. It seems like the root user has to be added to the docker user group. Is this the problem, or am I missing something?
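One rough way to tell a permissions problem apart from a daemon that simply is not running is to look at the socket the agent uses (`/var/run/docker.sock`, per the `--docker_socket` flag in the log above). The helper below is only a sketch: `check_docker_socket` is a hypothetical name, and it only inspects the socket file, so a stale socket left behind by a dead daemon would still pass. Since root can normally open the socket regardless of group membership, a missing socket points at the daemon rather than at the docker group.

```shell
#!/bin/sh
# Hypothetical helper: report why `docker ps` might fail inside the container.
# The default path matches the agent's --docker_socket value shown in the log.
check_docker_socket() {
    sock="${1:-/var/run/docker.sock}"

    # No socket file at all: the daemon is not running here,
    # or the host socket was not mounted into the container.
    if [ ! -S "$sock" ]; then
        echo "no socket at $sock: daemon not running or socket not mounted"
        return 1
    fi

    # Socket exists but the current user cannot read/write it:
    # this is the case where group membership would matter.
    if [ ! -r "$sock" ] || [ ! -w "$sock" ]; then
        echo "socket at $sock exists but uid $(id -u) cannot access it"
        return 2
    fi

    echo "socket at $sock is present and accessible"
    return 0
}
```

Running `check_docker_socket` from a shell inside the image would print which of the three cases applies.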

sebastian-alfers commented 7 years ago

+1

greggomann commented 7 years ago

It looks like the image has Docker installed, but the service isn't started by default. I was able to get the agent running by running the base mesosphere/mesos image in an interactive shell, starting the Docker daemon, and then running the Mesos agent. I'll update the README with some instructions to accomplish this.
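Until the README is updated, the steps described above could be sketched roughly as follows, for use from a shell inside the container. This is not the official procedure: it assumes the image ships a `docker` init script that `service` can start and that the agent binary is `mesos-slave`, neither of which is confirmed in this thread. `wait_for` and `start_agent` are hypothetical helper names; the `docker ... ps -a` probe is the same command the agent runs during recovery in the log above.

```shell
#!/bin/sh
# Sketch of a startup wrapper to run inside the container.
# Assumptions: `service docker start` works in this image and the
# agent binary is called `mesos-slave`.

# Poll a command until it succeeds or the attempts run out.
wait_for() {
    attempts="$1"; shift
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Start the Docker daemon, wait until it answers, then start the agent.
start_agent() {
    service docker start || return 1
    wait_for 30 docker -H unix:///var/run/docker.sock ps -a || {
        echo "Docker daemon did not come up" >&2
        return 1
    }
    # The agent picks up its configuration from the MESOS_* environment
    # variables passed to `docker run`.
    exec mesos-slave
}
```

Calling `start_agent` in a container started with the same `--privileged`, `--net=host`, and `-e MESOS_*` options as in the original command would bring up the daemon first, so the agent's recovery step has something to connect to.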