Closed GoSteven closed 11 years ago
For simpler testing, you can run Mesos in local mode on the same host:
# mesos-local
Then you can test DPark with it:
python demo.py -m localhost:5050 -M 100
By default, DPark needs at least 1 GB of memory to run each task; you can use -M to specify another value, such as 100 MB.
Hi @davies, your tips are really helpful. I think my problem could be due to a wrong version of an underlying dependency.
I have tried using mesos-local:
[root@ip-10-28-135-104 bin]# ./mesos-local
I0520 23:54:55.486122 2988 logging.cpp:70] Logging to /root/mesos/logs
I0520 23:54:55.503355 2989 master.cpp:264] Master started at mesos://master@10.28.135.104:5050
I0520 23:54:55.503468 2989 master.cpp:279] Master ID: 201305202354-0
I0520 23:54:55.503769 2989 master.cpp:462] Elected as master!
I0520 23:54:55.503969 2989 slave.cpp:257] Slave started at slave@10.28.135.104:5050
I0520 23:54:55.504026 2989 slave.cpp:258] Slave resources: cpus=1; mem=1024
I0520 23:54:55.504513 2989 slave.cpp:320] New master detected at master@10.28.135.104:5050
I0520 23:54:55.504706 2989 master.cpp:814] Attempting to register slave 201305202354-0-0 at slave@10.28.135.104:5050
I0520 23:54:55.505198 2989 master.cpp:1057] Master now considering a slave at ip-10-28-135-104.ec2.internal:5050 as active
I0520 23:54:55.505271 2989 master.cpp:1588] Adding slave 201305202354-0-0 at ip-10-28-135-104.ec2.internal with cpus=1; mem=1024
I0520 23:54:55.505342 2989 simple_allocator.cpp:71] Added slave 201305202354-0-0 with cpus=1; mem=1024
I0520 23:54:55.505429 2989 slave.cpp:340] Registered with master; given slave ID 201305202354-0-0
libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "mesos.internal.RegisterFrameworkMessage" because it is missing required fields: framework.executor
W0520 23:56:08.176432 2989 protobuf.hpp:260] Initialization errors: framework.executor
libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "mesos.internal.RegisterFrameworkMessage" because it is missing required fields: framework.executor
W0520 23:58:35.337236 2989 protobuf.hpp:260] Initialization errors: framework.executor
I installed protobuf with pip install protobuf; the protobuf version is 2.5.0.
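To confirm which protobuf release a given Python environment actually has installed, a small check like the following can help. This is a minimal sketch assuming Python 3.8 or newer (for importlib.metadata); the helper name installed_version is hypothetical and not part of DPark or Mesos.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "protobuf" is the distribution name installed by `pip install protobuf`.
    print("protobuf:", installed_version("protobuf"))
```

On the Python 2.7 setup used in this thread, pip freeze reports the same information.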
The Mesos error log suggests that this Mesos version is not compatible with DPark; you should try Mesos 0.9. We have used 0.9 in production for more than a year, with several patches. Just wait a moment, we will push them out.
Mesos 0.9 with our patches: https://github.com/windreamer/mesos/tree/master
Thanks a lot for your help @davies, DPark works fine with Mesos 0.9 when I run mesos-local. I will try to run it on an EC2 cluster shortly.
I set up a Mesos cluster on Amazon EC2 using the Mesos EC2 scripts.
Then I ran:
python27 demo.py -m mesos://master@ec2-54-224-207-120.compute-1.amazonaws.com:5050 -p 2
The program hung there; it was stuck at submitTasks() in schedule.py. After pressing Ctrl-C:
[ec2-user@ip-10-31-194-149 examples]$ python27 demo.py -m mesos://master@ec2-54-224-207-120.compute-1.amazonaws.com:5050 -p 2
2013-05-20 12:30:43,786 [INFO] [scheduler] Got a job with 4 tasks
^CTraceback (most recent call last):
  File "demo.py", line 10, in <module>
    print nums.count()
  File "/home/ec2-user/dpark/dpark/rdd.py", line 271, in count
    return sum(self.ctx.runJob(self, lambda x: ilen(x)))
  File "/home/ec2-user/dpark/dpark/context.py", line 204, in runJob
    for it in self.scheduler.runJob(rdd, func, partitions, allowLocal):
  File "/home/ec2-user/dpark/dpark/schedule.py", line 269, in runJob
    submitStage(finalStage)
  File "/home/ec2-user/dpark/dpark/schedule.py", line 231, in submitStage
    submitMissingTasks(stage)
  File "/home/ec2-user/dpark/dpark/schedule.py", line 267, in submitMissingTasks
    self.submitTasks(tasks)
  File "/home/ec2-user/dpark/dpark/schedule.py", line 436, in _
    r = f(self, *a, **kw)
  File "/usr/lib64/python2.7/threading.py", line 154, in __exit__
    self.release()
  File "/usr/lib64/python2.7/threading.py", line 142, in release
    raise RuntimeError("cannot release un-acquired lock")
RuntimeError: cannot release un-acquired lock
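The final RuntimeError in this traceback is raised by Python's threading module when release() is called on a reentrant lock that the calling thread does not hold. Since it only shows up after Ctrl-C, it may well be a side effect of the interrupt rather than the cause of the hang; that reading is an assumption, not something confirmed in this thread. A minimal Python 3 sketch that reproduces only the message (the function name is hypothetical):

```python
import threading


def release_unheld_lock():
    """Release an RLock that was never acquired and return the error text."""
    lock = threading.RLock()
    try:
        lock.release()  # the current thread does not own the lock
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(release_unheld_lock())
```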
In the Mesos logs:
Log file created at: 2013/05/20 11:40:49
Running on machine: ip-10-31-194-149.ec2.internal
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0520 11:40:49.170337 2099 logging.cpp:70] Logging to /mnt/mesos-logs
I0520 11:40:49.172806 2099 main.cpp:95] Build: 2011-12-03 06:24:10 by root
I0520 11:40:49.172871 2099 main.cpp:96] Starting Mesos master
I0520 11:40:49.176777 2099 webui.cpp:81] Starting master web server on port 8080
I0520 11:40:49.176911 2101 master.cpp:264] Master started at mesos://master@10.31.194.149:5050
I0520 11:40:49.177106 2104 webui.cpp:47] Master web server thread started
I0520 11:40:49.177109 2101 master.cpp:279] Master ID: 201305201140-0
I0520 11:40:49.177775 2101 master.cpp:462] Elected as master!
I0520 11:40:49.191300 2104 webui.cpp:59] Loading webui/master/webui.py
I0520 11:40:54.348682 2101 master.cpp:814] Attempting to register slave 201305201140-0-0 at slave@10.28.0.219:33513
I0520 11:40:54.349149 2101 master.cpp:1057] Master now considering a slave at ip-10-28-0-219.ec2.internal:33513 as active
I0520 11:40:54.349210 2101 master.cpp:1588] Adding slave 201305201140-0-0 at ip-10-28-0-219.ec2.internal with cpus=2; mem=677
I0520 11:40:54.349393 2101 simple_allocator.cpp:71] Added slave 201305201140-0-0 with cpus=2; mem=677
W0520 12:00:52.365500 2101 protobuf.hpp:260] Initialization errors: framework.executor
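When reading logs like the one above, it can help to filter them mechanically. The sketch below parses lines using the format stated in the log header ([IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg), keeps only warnings and errors, and collects the mem value from "Adding slave" lines. The names parse_line, problems, slave_memory and REQUIRED_MB are hypothetical. Treating mem as megabytes and DPark's default of about 1 GB per task as 1024 MB are assumptions made for illustration.

```python
import re

# Prefix described in the log header:
# [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<msg>.*)$"
)
SLAVE_MEM = re.compile(r"cpus=\d+; mem=(?P<mem>\d+)")

# Assumption: DPark's default per-task memory, expressed in MB.
REQUIRED_MB = 1024


def parse_line(line):
    """Return a dict of fields for a glog-style line, or None if it does not match."""
    match = GLOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def problems(lines):
    """Yield parsed entries whose level is warning, error, or fatal."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] in "WEF":
            yield entry


def slave_memory(lines):
    """Return the mem values found on 'Adding slave' lines."""
    values = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["msg"].startswith("Adding slave"):
            found = SLAVE_MEM.search(entry["msg"])
            if found:
                values.append(int(found.group("mem")))
    return values


SAMPLE = [
    "I0520 11:40:49.177775 2101 master.cpp:462] Elected as master!",
    "I0520 11:40:54.349210 2101 master.cpp:1588] Adding slave 201305201140-0-0 "
    "at ip-10-28-0-219.ec2.internal with cpus=2; mem=677",
    "W0520 12:00:52.365500 2101 protobuf.hpp:260] Initialization errors: "
    "framework.executor",
]

if __name__ == "__main__":
    for entry in problems(SAMPLE):
        print(entry["level"], entry["source"], entry["msg"])
    for mem in slave_memory(SAMPLE):
        if mem < REQUIRED_MB:
            print("slave offers", mem, "MB, below the assumed", REQUIRED_MB, "MB per task")
```

If a slave offers less memory than a task requests, passing a smaller value with -M may be worth trying. That is a suggestion to check, not a confirmed diagnosis of the hang.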
Question:
Does DPark require a specific Mesos version? Is there any relevant documentation for setting up DPark + Mesos?