twitter-archive / mysos

Cotton (formerly known as Mysos)
https://incubator.apache.org/projects/cotton.html

slave error: Failed to fetch URIs for container #63

Closed imansadooghi closed 9 years ago

imansadooghi commented 9 years ago

I'm trying to run Mysos on OpenStack. Since I'm not using Vagrant (I'm unable to start a VM inside an OpenStack instance), I had to change the scripts. Here is the list of modifications:
• Changed the hardcoded IP address to the host's private IP address in the config and script files.
• Changed the username from vagrant to ubuntu in the config and script files.
I was able to install and start ZooKeeper, Mesos, and mysos-scheduler, and they are all connected through ZooKeeper. Here is how I run the different services:

mesos-master:


sudo mesos-master \
--zk=zk://1.125.1.5:2181/mesos/master \
--ip=1.125.1.5 \
--work_dir=/home/ubuntu/var/local/mesos/master/db \
--quorum=1 \
--roles=mysos \
--credentials=/home/ubuntu/mysos/vagrant/etc/framework_keys.txt \
--log_dir=/home/ubuntu/log-mysos/master \
--no-authenticate_slave
mesos-slave:

sudo mesos-slave \
--master=zk://1.125.1.5.1:2181/mesos/master \
--ip=1.125.1.5 \
--hostname=1.125.1.5 \
--resources="cpus(mysos):4;mem(mysos):1024;disk(mysos):20000;ports(mysos):[31000-32000]" \
--isolation="cgroups/cpu,cgroups/mem" \
--cgroups_enable_cfs \
--log_dir=/home/ubuntu/log-mysos/slave \
--frameworks_home=/home/ubuntu/mysos/vagrant/bin
mysos-scheduler:

mysos_scheduler \
    --port=55001 \
    --framework_user=ubuntu \
    --mesos_master=zk://1.125.1.5:2181/mesos/master \
    --executor_uri=/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip \
    --executor_cmd=/home/ubuntu/mysos/vagrant/bin/mysos_executor.sh \
    --zk_url=zk://1.125.1.5:2181/mysos \
    --admin_keypath=/home/ubuntu/mysos/vagrant/etc/admin_keyfile.yml \
    --framework_failover_timeout=1m \
    --framework_role=mysos \
    --framework_authentication_file=/home/ubuntu/mysos/vagrant/etc/fw_auth_keyfile.yml \
    --scheduler_keypath=/home/ubuntu/mysos/vagrant/etc/scheduler_keyfile.txt \
    --executor_source_prefix='vagrant.devcluster' \
    --executor_environ='[{"name": "MYSOS_DEFAULTS_FILE", "value": "/etc/mysql/conf.d/my5.6.cnf"}]'
Now, when I try to create a cluster using the following command:
curl -X POST mysos_host_ip:55001/clusters/test_cluster3 --form "cluster_user=mysos"
I get this on the Mysos scheduler:

I0707 14:12:27.885504 13297 connection.py:276] Sending request(xid=119): Exists(path='/mysos/state/clusters', watcher=None)
I0707 14:12:27.886852 13297 connection.py:360] Received response(xid=119): ZnodeStat(czxid=9180, mzxid=9180, ctime=1436278298268, mtime=1436278298268, version=0, cversion=1, aversion=0, ephemeralOwner=0, dataLength=0, numChildren=1, pzxid=9181)
I0707 14:12:27.887795 13297 connection.py:276] Sending request(xid=120): Exists(path='/mysos/state/clusters/test_cluster2', watcher=None)
I0707 14:12:27.888803 13297 connection.py:360] Received response(xid=120): ZnodeStat(czxid=9181, mzxid=9214, ctime=1436278298278, mtime=1436278345754, version=29, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1400, numChildren=0, pzxid=9181)
I0707 14:12:27.889123 13297 connection.py:276] Sending request(xid=121): SetData(path='/mysos/state/clusters/test_cluster2', data="ccopy_reg\n_reconstructor\np1\n(cmysos.scheduler.state\nMySQLCluster\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'encrypted_password'\np6\ng1\n(cnacl.utils\nEncryptedMessage\np7\nc__builtin__\nstr\np8\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T\\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\ntRp9\n(dp10\nS'_ciphertext'\np11\nS'\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T\\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\np12\nsS'_nonce'\np13\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<'\np14\nsbsS'backup_id'\np15\nNsS'name'\np16\nS'test_cluster2'\np17\nsS'mem'\np18\ng1\n(ctwitter.common.quantity\nAmount\np19\ng3\nNtRp20\n(dp21\nS'_unit'\np22\ng1\n(ctwitter.common.quantity\nData\np23\ng3\nNtRp24\n(dp25\nS'_multiplier'\np26\nI1048576\nsS'_display'\np27\nS'MB'\np28\nsbsS'_amount'\np29\nI512\nsbsS'cpus'\np30\nF1\nsS'num_nodes'\np31\nI1\nsS'tasks'\np32\n(dp33\nsS'user'\np34\nS'mysos'\np35\nsS'members'\np36\n(dp37\nsS'master_id'\np38\nNsS'next_epoch'\np39\nI0\nsS'next_id'\np40\nI15\nsS'disk'\np41\ng1\n(g19\ng3\nNtRp42\n(dp43\ng22\ng1\n(g23\ng3\nNtRp44\n(dp45\ng26\nI1073741824\nsg27\nS'GB'\np46\nsbsg29\nI2\nsbsb.", version=-1)
I0707 14:12:27.909009 13297 connection.py:360] Received response(xid=121): ZnodeStat(czxid=9181, mzxid=9215, ctime=1436278298278, mtime=1436278347889, version=30, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1133, numChildren=0, pzxid=9181)
I0707 14:12:27.909272 13297 launcher.py:484] Checkpointed the status update for task mysos-test_cluster2-14 of cluster test_cluster2
I0707 14:12:28.751266 13297 launcher.py:185] Launcher test_cluster2 accepted offer 20150707-140838-83983617-5050-13042-22 on Mesos slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:28.751960 13297 launcher.py:305] Executor will use environment variable: {u'name': u'MYSOS_DEFAULTS_FILE', u'value': u'/etc/mysql/conf.d/my5.6.cnf'}
I0707 14:12:28.752923 13297 connection.py:276] Sending request(xid=122): Exists(path='/mysos/state/clusters', watcher=None)
I0707 14:12:28.754126 13297 connection.py:360] Received response(xid=122): ZnodeStat(czxid=9180, mzxid=9180, ctime=1436278298268, mtime=1436278298268, version=0, cversion=1, aversion=0, ephemeralOwner=0, dataLength=0, numChildren=1, pzxid=9181)
I0707 14:12:28.754930 13297 connection.py:276] Sending request(xid=123): Exists(path='/mysos/state/clusters/test_cluster2', watcher=None)
I0707 14:12:28.755887 13297 connection.py:360] Received response(xid=123): ZnodeStat(czxid=9181, mzxid=9215, ctime=1436278298278, mtime=1436278347889, version=30, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1133, numChildren=0, pzxid=9181)
I0707 14:12:28.756345 13297 connection.py:276] Sending request(xid=124): SetData(path='/mysos/state/clusters/test_cluster2', data="ccopy_reg\n_reconstructor\np1\n(cmysos.scheduler.state\nMySQLCluster\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'encrypted_password'\np6\ng1\n(cnacl.utils\nEncryptedMessage\np7\nc__builtin__\nstr\np8\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T\\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\ntRp9\n(dp10\nS'_ciphertext'\np11\nS'\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T\\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\np12\nsS'_nonce'\np13\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<'\np14\nsbsS'backup_id'\np15\nNsS'name'\np16\nS'test_cluster2'\np17\nsS'mem'\np18\ng1\n(ctwitter.common.quantity\nAmount\np19\ng3\nNtRp20\n(dp21\nS'_unit'\np22\ng1\n(ctwitter.common.quantity\nData\np23\ng3\nNtRp24\n(dp25\nS'_multiplier'\np26\nI1048576\nsS'_display'\np27\nS'MB'\np28\nsbsS'_amount'\np29\nI512\nsbsS'cpus'\np30\nF1\nsS'num_nodes'\np31\nI1\nsS'tasks'\np32\n(dp33\nVmysos-test_cluster2-15\np34\ng1\n(cmysos.scheduler.state\nMySQLTask\np35\ng3\nNtRp36\n(dp37\nS'hostname'\np38\nV1.125.1.5\np39\nsS'task_id'\np40\ng34\nsS'mesos_slave_id'\np41\nV20150707-140838-83983617-5050-13042-0\np42\nsS'cluster_name'\np43\ng17\nsS'state'\np44\nI6\nsS'port'\np45\nI31400\nsbssS'user'\np46\nS'mysos'\np47\nsS'members'\np48\n(dp49\nsS'master_id'\np50\nNsS'next_epoch'\np51\nI0\nsS'next_id'\np52\nI16\nsS'disk'\np53\ng1\n(g19\ng3\nNtRp54\n(dp55\ng22\ng1\n(g23\ng3\nNtRp56\n(dp57\ng26\nI1073741824\nsg27\nS'GB'\np58\nsbsg29\nI2\nsbsb.", version=-1)
I0707 14:12:28.763336 13297 connection.py:360] Received response(xid=124): ZnodeStat(czxid=9181, mzxid=9216, ctime=1436278298278, mtime=1436278348756, version=31, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1400, numChildren=0, pzxid=9181)
I0707 14:12:28.763725 13297 launcher.py:202] Launching task mysos-test_cluster2-15 on Mesos slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:29.879532 13297 launcher.py:395] Updating state of task mysos-test_cluster2-15 of cluster test_cluster2 from TASK_STAGING to TASK_LOST
E0707 14:12:29.879740 13297 launcher.py:443] Task mysos-test_cluster2-15 is now in terminal state TASK_LOST with message 'Executor terminated'
W0707 14:12:29.879869 13297 launcher.py:474] Slave mysos-test_cluster2-15 of cluster test_cluster2 failed to start running
On mesos-master:

I0707 14:12:19.743690 13045 master.cpp:3559] Sending 1 offers to framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.778714 13043 master.cpp:2169] Processing reply for offers: [ 20150707-140838-83983617-5050-13042-19 ] on slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5) for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.779111 13043 master.hpp:829] Adding task mysos-test_cluster2-12 with resources cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531] on slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:19.779166 13043 master.cpp:2318] Launching task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 with resources cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531] on slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5)
I0707 14:12:19.779368 13043 hierarchical_allocator_process.hpp:563] Recovered cpus(mysos):3; mem(mysos):512; disk(mysos):17952; ports(mysos):[31000-31530, 31532-32000](total allocatable: cpus%28mysos%29:3; mem%28mysos%29:512; disk%28mysos%29:17952; ports%28mysos%29:[31000-31530, 31532-32000]) on slave 20150707-140838-83983617-5050-13042-0 from framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.833107 13049 master.cpp:3229] Executor mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 on slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5) exited with status 1
I0707 14:12:21.833279 13049 hierarchical_allocator_process.hpp:563] Recovered cpus(mysos):0.01; mem(mysos):32; disk(mysos):1 (total allocatable: cpus(mysos):3.01; mem(mysos):544; disk(mysos):17953; ports(mysos):[31000-31530, 31532-32000]) on slave 20150707-140838-83983617-5050-13042-0 from framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.873121 13050 master.cpp:3180] Forwarding status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.873198 13050 master.cpp:3146] Status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 from slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5)
I0707 14:12:21.873272 13050 master.hpp:847] Removing task mysos-test_cluster2-12 with resources cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531] on slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:21.873401 13050 hierarchical_allocator_process.hpp:563] Recovered cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531](total allocatable: cpus%28mysos%29:4; mem%28mysos%29:1024; disk%28mysos%29:20000; ports%28mysos%29:[31000-32000]) on slave 20150707-140838-83983617-5050-13042-0 from framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.887816 13046 master.cpp:2661] Forwarding status update acknowledgement bdc5ce90-70c6-4ac4-bf7a-edde3b20c791 for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 to slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5)
On mesos-slave:

I0707 14:12:19.779917 13068 slave.cpp:1002] Got assigned task mysos-test_cluster2-12 for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.780154 13068 slave.cpp:3536] Checkpointing FrameworkInfo to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/framework.info'
I0707 14:12:19.780479 13068 slave.cpp:3543] Checkpointing framework pid 'scheduler-ad3e3804-3846-4677-8670-ea30bb166013@1.125.1.5:50160' to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/framework.pid'
I0707 14:12:19.780894 13068 gc.cpp:84] Unscheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' from gc
I0707 14:12:19.781018 13068 gc.cpp:84] Unscheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' from gc
I0707 14:12:19.781136 13068 slave.cpp:1112] Launching task mysos-test_cluster2-12 for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.782254 13068 slave.cpp:3857] Checkpointing ExecutorInfo to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/executor.info'
I0707 14:12:19.782737 13068 slave.cpp:3972] Checkpointing TaskInfo to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d/tasks/mysos-test_cluster2-12/task.info'
I0707 14:12:19.782922 13064 containerizer.cpp:394] Starting container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' for executor 'mysos-test_cluster2-12' of framework '20150707-140838-83983617-5050-13042-0000'
I0707 14:12:19.782939 13068 slave.cpp:1222] Queuing task 'mysos-test_cluster2-12' for executor mysos-test_cluster2-12 of framework '20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.785490 13064 mem.cpp:479] Started listening for OOM events for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.786022 13064 mem.cpp:293] Updated 'memory.soft_limit_in_bytes' to 512MB for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.786504 13068 cpushare.cpp:338] Updated 'cpu.shares' to 1024 (cpus 1) for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.787155 13064 mem.cpp:358] Updated 'memory.limit_in_bytes' to 512MB for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.787747 13068 cpushare.cpp:359] Updated 'cpu.cfs_period_us' to 100ms and 'cpu.cfs_quota_us' to 100ms (cpus 1) for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.789216 13068 linux_launcher.cpp:191] Cloning child process with flags = 0
I0707 14:12:19.790909 13068 containerizer.cpp:678] Checkpointing executor's forked pid 13451 to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d/pids/forked.pid'
I0707 14:12:19.793015 13068 containerizer.cpp:510] Fetching URIs for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' using command '/usr/local/libexec/mesos/mesos-fetcher'
I0707 14:12:20.824784 13070 containerizer.cpp:882] Destroying container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d'
E0707 14:12:20.825043 13067 slave.cpp:2485] Container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' for executor 'mysos-test_cluster2-12' of framework '20150707-140838-83983617-5050-13042-0000' failed to start: Failed to fetch URIs for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d': exit status 256
I0707 14:12:20.826287 13070 cgroups.cpp:2208] Freezing cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:20.827714 13063 cgroups.cpp:1375] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d after 1.239808ms
I0707 14:12:20.828982 13063 cgroups.cpp:2225] Thawing cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:20.830205 13063 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d after 1.078016ms
I0707 14:12:21.826225 13070 containerizer.cpp:997] Executor for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' has exited
I0707 14:12:21.831550 13067 slave.cpp:2596] Executor 'mysos-test_cluster2-12' of framework 20150707-140838-83983617-5050-13042-0000 exited with status 1
E0707 14:12:21.831750 13065 slave.cpp:2866] Failed to unmonitor container for executor mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000: Not monitored
I0707 14:12:21.832567 13067 slave.cpp:2088] Handling status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 from @0.0.0.0:0
W0707 14:12:21.832794 13064 containerizer.cpp:788] Ignoring update for unknown container: e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:21.833605 13064 status_update_manager.cpp:320] Received status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.833945 13064 status_update_manager.hpp:342] Checkpointing UPDATE for status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.872320 13064 status_update_manager.cpp:373] Forwarding status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 to master@1.125.1.5:5050
I0707 14:12:21.888372 13064 status_update_manager.cpp:398] Received status update acknowledgement (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.888582 13064 status_update_manager.hpp:342] Checkpointing ACK for status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927053 13064 slave.cpp:2732] Cleaning up executor 'mysos-test_cluster2-12' of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927609 13064 slave.cpp:2807] Cleaning up framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927803 13067 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d' for gc 6.99998926540444days in the future
I0707 14:12:21.927835 13068 status_update_manager.cpp:282] Closing status update streams for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927999 13067 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12' for gc 6.99998926446815days in the future
I0707 14:12:21.928270 13067 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d' for gc 6.99998926416593days in the future
I0707 14:12:21.928318 13067 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12' for gc 6.99998926392296days in the future
I0707 14:12:21.928351 13067 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' for gc 6.99998926238222days in the future
I0707 14:12:21.928390 13067 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' for gc 6.99998926207111days in the future
When I check the mesos-slave logs, I see this:

E0707 14:12:20.825043 13067 slave.cpp:2485] Container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' for executor 'mysos-test_cluster2-12' of framework '20150707-140838-83983617-5050-13042-0000' failed to start: Failed to fetch URIs for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d': exit status 256
E0707 14:12:21.831750 13065 slave.cpp:2866] Failed to unmonitor container for executor mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000: Not monitored
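The two error lines above were picked out of a much longer slave log by hand. A minimal sketch of doing the same filtering in Python, assuming the standard glog prefix seen in these logs (severity letter, MMDD, time, thread id, source file:line). The function name `filter_by_severity` is made up for illustration and is not part of Mesos or Mysos.

```python
import re

# glog prefix as it appears in the logs above, e.g.
# "E0707 14:12:20.825043 13067 slave.cpp:2485] message"
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)


def filter_by_severity(lines, severities=("W", "E", "F")):
    """Return (severity, source, message) tuples for matching glog lines.

    Lines that do not carry a glog prefix (for example raw `cp` output
    in a sandbox stderr) are skipped.
    """
    hits = []
    for line in lines:
        match = GLOG_LINE.match(line.strip())
        if match and match.group("severity") in severities:
            hits.append(
                (match.group("severity"), match.group("source"), match.group("message"))
            )
    return hits
```

Called on the lines of the slave log, this would return only the `E` and `W` entries, which is where the "Failed to fetch URIs" message shows up.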
And here is my sandbox stderr, which explains more:
I0707 15:47:32.898041 15795 fetcher.cpp:76] Fetching URI '/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip'
I0707 15:47:32.898449 15795 fetcher.cpp:179] Copying resource from '/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip' to '/tmp/mesos/slaves/20150707-154245-83983617-5050-14919-0/frameworks/20150707-154245-83983617-5050-14919-0000/executors/mysos-test_cluster2-78/runs/fe5ddc90-5cdf-49fd-8013-e8eb0e451e3f'
cp: cannot stat ‘/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip’: No such file or directory
E0707 15:47:32.909354 15795 fetcher.cpp:184] Failed to copy '/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip' : Exit status 256
Failed to fetch: /home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip
Failed to synchronize with slave (it's probably exited) 
When I check my dist dir, mysos-0.1.0_dev0-py2.7.egg is there, but there are no .zip files! What have I missed during the installation? Something must have been supposed to create the .zip at some point.

A few notes about my setup:
- All of the services run on the same node, but they use the private IP (1.125.1.5), not localhost.
- My network has a man-in-the-middle proxy.

Any idea what is wrong here?
imansadooghi commented 9 years ago

So I figured out how to make this work. Basically, mesos-fetcher looks for mysos/dist/mysos-0.1.0-dev0.zip, but there is only mysos-0.1.0_dev0-py2.7.egg in the dist/ folder. I had to manually convert the egg to a .whl, unpack the .whl file, and then zip and rename the directory.
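The last step of that workaround (zipping the unpacked directory under the name the fetcher expects) can be sketched in Python as follows. This is only an illustration using the standard `zipfile` module. The helper name `zip_directory` is hypothetical, and the layout the Mysos executor script expects inside the archive is not stated in this thread, so the sketch simply stores every file relative to the directory root.

```python
import os
import zipfile


def zip_directory(src_dir, dest_zip):
    """Write every file under src_dir into dest_zip.

    Archive member names are relative to src_dir. dest_zip should live
    outside src_dir, otherwise the archive would try to include itself.
    """
    with zipfile.ZipFile(dest_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                archive.write(full_path, os.path.relpath(full_path, src_dir))
    return dest_zip
```

For example, `zip_directory("/tmp/mysos-unpacked", "/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip")`, where `/tmp/mysos-unpacked` is a placeholder for wherever the .whl was unpacked.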

imansadooghi commented 9 years ago

OK, I figured this out. tox creates the .zip file, and the provisioning step puts it into dist. I wasn't using tox to install Mysos.
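Since the sandbox stderr shows the fetcher copying the executor archive from a plain local path, a small check run on the slave host before starting the scheduler could catch a missing file early. The following is a minimal sketch, not part of Mysos. The helper name `missing_local_paths` is made up, and it assumes the values passed to --executor_uri and --executor_cmd are local filesystem paths, as they are in the commands in this thread.

```python
import os


def missing_local_paths(paths):
    """Return the paths that do not exist as regular files on this host."""
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    # Values copied from the mysos_scheduler command in this thread.
    required = [
        "/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip",
        "/home/ubuntu/mysos/vagrant/bin/mysos_executor.sh",
    ]
    for path in missing_local_paths(required):
        print("missing: " + path)
```

The check only makes sense on the machine where mesos-slave runs, because that is where the fetcher resolves the path. In this setup all services share one node, so running it anywhere on that node is enough.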