aliyun / aliyun-odps-python-sdk

ODPS Python SDK and data analysis framework
http://pyodps.readthedocs.io
Apache License 2.0
434 stars 97 forks

Error in the map_reduce() Word_count demo #98

Closed joyzhou28 closed 1 week ago

joyzhou28 commented 5 years ago

I copied the demo from the documentation. The compiled SQL is displayed, but execution fails with the following error:

ScriptError: ODPS-0123055: InstanceId: 20190721032139802gqztk8
ODPS-0123055:Script exception - Traceback (most recent call last):
  File "<pyodps_udf_1563679279_ad88b317_1b16_4dcb_b102_226b3fd5c207.py>", line 2873, in process
    for r in self.f(*args, **self.kwargs):
  File "<<ipython-input-25-aacf497909db>>", line 3, in mapper
TypeError: expected a character buffer object

Some line of the code has a type problem, but I don't know how to debug it.

The demo is as follows:

word_count = DataFrame(o.get_table('tlab_test.tmp_ttzhou0721_wc3'))

@output(['word', 'cnt'], ['string', 'float64'])
def mapper(row):
    for word in row[0].split(0):
        yield word.lower(), 1

@output(['word', 'cnt'], ['string', 'float64'])
def reducer(keys):
    cnt = [0]
    def h(row, done):
        cnt[0] += row.cnt
        if done:
            yield keys.word, cnt[0]
    return h

word_count.map_reduce(mapper, reducer, group='word')

qinxuye commented 5 years ago

Is this Python 3? Try it with Python 2.

joyzhou28 commented 5 years ago

I ran it on datago under Python 2.7, and the bug is still there:

2019-07-21 17:51:24 INFO Traceback (most recent call last):
  File "", line 27, in
    print(table.head(3))
  File "/usr/lib64/python2.7/site-packages/odps/df/expr/expressions.py", line 43, in call
    return self._func(args, kwargs)
  File "/usr/lib64/python2.7/site-packages/odps/df/expr/expressions.py", line 1058, in head
    return self._handle_delay_call('execute', self, head=n, kwargs)
  File "/usr/lib64/python2.7/site-packages/odps/df/expr/expressions.py", line 43, in call
    return self._func(args, kwargs)
  File "/usr/lib64/python2.7/site-packages/odps/df/expr/expressions.py", line 146, in _handle_delay_call
    result = getattr(engine, method)(*args, *kwargs)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/core.py", line 733, in execute
    return self._action(exprs_args_kwargs, kwargs)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/core.py", line 557, in _action
    timeout=timeout, progress_proportion=progress_proportion)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/core.py", line 798, in _execute_dag
    close_and_notify=close_and_notify, progress_proportion=progress_proportion)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/core.py", line 339, in execute
    results = self._run(ui, progress_proportion)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/core.py", line 213, in _run
    res = call(ui=ui, progress_proportion=progress_proportion / len(calls))
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/core.py", line 192, in call
    res = self.run(ui=ui, progress_proportion=progress_proportion)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/core.py", line 590, in run
    result = engine._do_execute(expr_dag, expr, *kw)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/odpssql/engine.py", line 348, in _do_execute
    group=group, libraries=libraries)
  File "/usr/lib64/python2.7/site-packages/odps/df/backends/odpssql/engine.py", line 159, in _run
    instance.wait_for_success()
  File "/usr/lib64/python2.7/site-packages/odps/models/instance.py", line 514, in wait_for_success
    raise exc
ScriptError: ODPS-0123055: InstanceId: 20190721095050740gcte7c
ODPS-0123055:Script exception - Traceback (most recent call last):
  File "", line 2302, in process
    for r in self.f(args, **self.kwargs):
  File "<>", line 15, in mapper
TypeError: expected a character buffer object

joyzhou28 commented 5 years ago

The program runs fine up to table = word_count.map_reduce(mapper, reducer, group='word'), but that step is lazily executed. Printing the computed result raises:

    for r in self.f(*args, **self.kwargs):
  File "<>", line 15, in mapper
TypeError: expected a character buffer object
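
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the posted mapper calls `row[0].split(0)`, and `str.split` expects a string (or `None`) as its separator, so passing the integer `0` raises `TypeError`. The snippet below reproduces this in plain Python without ODPS. The names `mapper_broken`, `mapper_fixed`, and the sample `row` are hypothetical and exist only for illustration. Because the mapper is a generator, the error surfaces only when its results are consumed, which is consistent with the lazy execution described above.

```python
def mapper_broken(row):
    # Mirrors the posted mapper: the separator is the integer 0.
    for word in row[0].split(0):
        yield word.lower(), 1


def mapper_fixed(row):
    # Assumed intent: split on whitespace by passing no separator.
    for word in row[0].split():
        yield word.lower(), 1


row = ("Hello World hello",)  # hypothetical one-column input row

gen = mapper_broken(row)  # no error yet: the generator body has not run
error = None
try:
    list(gen)  # consuming the generator triggers the TypeError
except TypeError as exc:
    error = exc

pairs = list(mapper_fixed(row))
print(type(error).__name__)
print(pairs)
```

If this is indeed the cause, changing `split(0)` to `split()` in the demo mapper should let the job run. The exact wording of the `TypeError` message differs between Python 2 and Python 3.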

wjsi commented 1 week ago

Works for me. Close as stale.