douban / dpark

Python clone of Spark, a MapReduce alike framework in Python
BSD 3-Clause "New" or "Revised" License

[INFO] [dpark.context] no web server created as No module named gevent.pywsgi #84

Closed ellenzhu closed 6 years ago

ellenzhu commented 6 years ago

What does this message mean? The same program runs fine on another machine.

ariesdevil commented 6 years ago

@ellenzhu The web UI needs gevent to be installed. We will fix this in setup.py shortly. In the meantime, the missing module does not affect task execution.
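For illustration, the pattern behind this kind of message is usually an optional import: if the module is missing, the feature is skipped and the rest of the program carries on. The following is a minimal sketch of that pattern, not dpark's actual code; `try_import` and `start` are hypothetical names, and only the log wording is taken from the message in the issue title.

```python
import importlib
import logging

logger = logging.getLogger("optional_import_demo")


def try_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Informational only: the optional feature is skipped, nothing fails.
        logger.info("no web server created as %s", exc)
        return None


def start(web_module_name="gevent.pywsgi"):
    """Hypothetical startup: report whether the optional web UI is available."""
    pywsgi = try_import(web_module_name)
    web_ui_enabled = pywsgi is not None
    # The job itself would be scheduled here whether or not the web UI exists.
    return web_ui_enabled
```

With this shape, a missing gevent only turns the web UI off; it cannot be the reason a job produces no output.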

ellenzhu commented 6 years ago

After installing it, this showed up instead: [INFO] [dpark.context] start listening on Web UI with port: 33920. What does that mean?

ariesdevil commented 6 years ago

@qingfeng Exactly what it says: the web UI has started on port 33920.

ellenzhu commented 6 years ago

@ariesdevil But why does it still not run? No result is displayed.

ariesdevil commented 6 years ago

@ellenzhu Could you paste the details?

ellenzhu commented 6 years ago

Output on the machine where it fails:
[@ffff /data]# python analyze.py
2018-07-27 16:43:18,857 [INFO] [dpark.context] start listening on Web UI with port: 39013

Output on the server where it works:
[@ffff /data/]# python analyze.py
[(u'021', 1), (u'038', 80), (u'036', 18), (u'007', 1), (1531888385.919384, 1), (u'039', 83), (u'035', 8), (u'037', 10), (u'020', 5)]

ariesdevil commented 6 years ago

@ellenzhu Are you running on Mesos? Can you see that the task has been submitted to Mesos?

ellenzhu commented 6 years ago

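@ariesdevil No, it is on a single machine. Another server, where dpark was installed earlier, runs the program fine. I wanted to install dpark on this machine as well, and this is what happens when I run it.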

ariesdevil commented 6 years ago

@ellenzhu Run it with the -v flag and take a look at the log.

ellenzhu commented 6 years ago

@ariesdevil log:
2018-07-30 10:40:55,457 [INFO] [dpark.context] start listening on Web UI with port: 35591
2018-07-30 10:40:55,483 [DEBUG] [dpark.env] start env in 16675: True {'is_local': True}
2018-07-30 10:40:55,484 [DEBUG] [dpark.tracker] TrackerServer started at tcp://forrest16-71-142:39146
2018-07-30 10:40:55,586 [DEBUG] [dpark.shuffle] shuffle dir: ['/dev/shm/forrest16-71-142-b5791055-a3f4-4a92-840f-5c5c7c4ca6dd', '/tmp/dpark/forrest16-71-142-b5791055-a3f4-4a92-840f-5c5c7c4ca6dd']
2018-07-30 10:40:55,586 [DEBUG] [dpark.shuffle] MapOutputTracker started
2018-07-30 10:40:55,588 [DEBUG] [dpark.broadcast] broadcast started: tcp://forrest16-71-142:35866
2018-07-30 10:40:55,588 [DEBUG] [dpark.env] env started
2018-07-30 10:40:55,589 [DEBUG] [dpark.schedule] new stage: <Stage(1) for <MappedRDD <MappedRDD <ParallelCollection 258>>>>
2018-07-30 10:40:55,589 [DEBUG] [dpark.schedule] new stage: <Stage(2) for <ShuffledRDD <MappedRDD <MappedRDD <ParallelCollection 258>>>>>
2018-07-30 10:40:55,589 [DEBUG] [dpark.schedule] Final stage: <Stage(2) for <ShuffledRDD <MappedRDD <MappedRDD <ParallelCollection 258>>>>>, 2
2018-07-30 10:40:55,589 [DEBUG] [dpark.schedule] Parents of final stage: [<dpark.schedule.Stage instance at 0x2a56290>]
2018-07-30 10:40:55,589 [DEBUG] [dpark.schedule] Missing parents: [<dpark.schedule.Stage instance at 0x2a56290>]
2018-07-30 10:40:55,590 [DEBUG] [dpark.schedule] submit stage <Stage(2) for <ShuffledRDD <MappedRDD <MappedRDD <ParallelCollection 258>>>>>
2018-07-30 10:40:55,590 [DEBUG] [dpark.broadcast] guide start at tcp://forrest16-71-142:35866
2018-07-30 10:40:55,590 [DEBUG] [dpark.schedule] submit stage <Stage(1) for <MappedRDD <MappedRDD <ParallelCollection 258>>>>
2018-07-30 10:40:55,590 [DEBUG] [dpark.schedule] add to pending 2 tasks
2018-07-30 10:40:55,590 [DEBUG] [dpark.schedule] submit tasks [<ShuffleTask(1, 0) of <MappedRDD <MappedRDD <ParallelCollection 258>>>>, <ShuffleTask(1, 1) of <MappedRDD <MappedRDD <ParallelCollection 258>>>>] in LocalScheduler
2018-07-30 10:40:55,592 [DEBUG] [dpark.schedule] Running task <ShuffleTask(1, 0) of <MappedRDD <MappedRDD <ParallelCollection 258>>>>
2018-07-30 10:40:55,593 [DEBUG] [dpark.task] shuffling 0 of <MappedRDD <MappedRDD <ParallelCollection 258>>>
2018-07-30 10:40:55,599 [DEBUG] [dpark.schedule] Running task <ShuffleTask(1, 1) of <MappedRDD <MappedRDD <ParallelCollection 258>>>>
2018-07-30 10:40:55,599 [DEBUG] [dpark.task] shuffling 1 of <MappedRDD <MappedRDD <ParallelCollection 258>>>
2018-07-30 10:40:55,602 [DEBUG] [dpark.schedule] remove from pending <ShuffleTask(1, 0) of <MappedRDD <MappedRDD <ParallelCollection 258>>>> from <Stage(1) for <MappedRDD <MappedRDD <ParallelCollection 258>>>>
2018-07-30 10:40:55,602 [DEBUG] [dpark.schedule] remove from pending <ShuffleTask(1, 1) of <MappedRDD <MappedRDD <ParallelCollection 258>>>> from <Stage(1) for <MappedRDD <MappedRDD <ParallelCollection 258>>>>
2018-07-30 10:40:55,602 [DEBUG] [dpark.schedule] <Stage(1) for <MappedRDD <MappedRDD <ParallelCollection 258>>>> finished; looking for newly runnable stages
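If a -v log ends up pasted as a single run-on line, it can be split back into entries by their timestamps, which makes it easier to see the last thing the scheduler reported. The sketch below is a small standalone helper, not part of dpark; `split_entries` and `parse_entry` are hypothetical names, and the timestamp and bracket layout are assumed from the log lines in this thread.

```python
import re

# Assumed entry prefix, e.g. "2018-07-30 10:40:55,588 [DEBUG] [dpark.env] ..."
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[")
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\w+)\] \[(?P<logger>[\w.]+)\] (?P<msg>.*)$"
)


def split_entries(text):
    """Split run-on log text into one string per entry."""
    flat = " ".join(text.split())  # collapse stray newlines and repeated spaces
    starts = [m.start() for m in TIMESTAMP.finditer(flat)]
    ends = starts[1:] + [len(flat)]
    return [flat[begin:end].strip() for begin, end in zip(starts, ends)]


def parse_entry(entry):
    """Return timestamp, level, logger and message as a dict, or None."""
    match = ENTRY.match(entry)
    return match.groupdict() if match else None


if __name__ == "__main__":
    pasted = (
        "2018-07-30 10:40:55,588 [DEBUG] [dpark.env] env started "
        "2018-07-30 10:40:55,590 [DEBUG] [dpark.schedule] add to pending 2 tasks"
    )
    for entry in split_entries(pasted):
        print(parse_entry(entry))
```

Applied to the log above, the final entry is the "finished; looking for newly runnable stages" message for Stage(1), so the paste stops before anything is reported for Stage(2).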

ariesdevil commented 6 years ago

@ellenzhu Would you mind pasting the code?