lpty / nlp_base

Natural language base models

Error converting config values to JSON when reading the config file in the XGBoost Chinese interrogative-sentence classifier #18

Open jluncc opened 5 years ago

jluncc commented 5 years ago

Hi, I downloaded your code to study it, and running /interrogative/manage.py fails with this error:

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1758, in <module>
    main()
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1752, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1147, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/manage.py", line 4, in <module>
    train()
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/api.py", line 17, in train
    model.train()
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/model.py", line 81, in train
    self.initialize_model()
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/model.py", line 40, in initialize_model
    self.max_depth = to_json(self.config.get('model', 'max_depth'))
  File "/Users/jinglun/PycharmProjects/DownloadProjects/nlp_base/interrogative/src/util.py", line 16, in to_json
    return demjson.decode(text, encoding='utf-8')
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 5699, in decode
    return_stats=(return_stats or write_stats) )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 4915, in decode
    raise errors[0]
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 2428, in set_input
    self.buf = buffered_stream( txt, encoding=encoding )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1614, in __init__
    self.set_text( txt, encoding )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1685, in set_text
    raise newerr
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1675, in set_text
    decoded = helpers.unicode_decode( txt, encoding )
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/site-packages/demjson.py", line 1256, in unicode_decode
    unitxt, numbytes = cdk.decode( txt, **cdk_kw )  # DO THE DECODE HERE!
  File "/Users/jinglun/software/miniconda2/envs/nlp_base36/lib/python3.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
demjson.JSONDecodeError: a Unicode decoding error occurred

The failing line is this one in /interrogative/src/model.py:

self.max_depth = to_json(self.config.get('model', 'max_depth'))

I have not changed the model section of my config.py, which looks like this:

'model': {
                'max_depth': [4, 5, 6],
                'eta': [0.1, 0.05, 0.02],
                'subsample': [0.5, 0.7, 1.0],
                'max_iterations': 100,
                'objective': ['binary:logistic'],
                'silent': [1],
                'num_boost_round': 2000,
                'nfold': 5,
                'stratified': 1,
                'metrics': 'auc',
                'early_stopping_rounds': 50,
                'model_path': ' src/data/{}.model'
            }

I don't understand why this error occurs, and searching online didn't turn up a solution. How can I fix it?

lxinghai007 commented 5 years ago

In to_json, change the decode call to just return text.
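
A possible alternative, if the values still need parsing: the traceback suggests that demjson is trying to UTF-8-decode input that is probably already a str under Python 3.6. The sketch below is an assumption about how to_json in src/util.py could be rewritten, not the repository's actual code. It swaps demjson for the standard-library json module, with ast.literal_eval as a fallback for Python-style literals, and only decodes when it is given bytes.

```python
import ast
import json


def to_json(text, encoding="utf-8"):
    """Parse a config value; hypothetical replacement for util.to_json."""
    # Only bytes need decoding; a str is already Unicode in Python 3.
    if isinstance(text, bytes):
        text = text.decode(encoding)
    # Values that are already Python objects (list, int, ...) pass through.
    if not isinstance(text, str):
        return text
    try:
        return json.loads(text)
    except ValueError:
        # Fallback for Python-style literals such as "['binary:logistic']",
        # which use single quotes and are not valid strict JSON.
        return ast.literal_eval(text)
```

With this version, to_json("[4, 5, 6]") gives the list [4, 5, 6], and a value that is already a list is returned unchanged. A plain unquoted string such as auc would still raise an error, so this only suits values written as JSON or as Python literals.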