Demo fails to run - Githubissues

sgccnlp / ecws

Chinese word segmentation model for the electric power domain, R3.0

http://sgccnlp.com

MIT License


Demo fails to run #3

Open jmzhoulab opened 3 years ago

jmzhoulab commented 3 years ago

from ecws.segment import Segmenter

model_path = 'ecws.model'
predict = Segmenter(model_path)
d = predict.seg(sent)

This raises the following error:

Traceback (most recent call last):
  File "/Users/zhoujm/workspace/python/kbqa4power/test/test.py", line 20, in <module>
    predict = Segmenter(model_path)
TypeError: __init__() missing 1 required positional argument: 'vocab_path'

The vocab_path argument is missing. Has the interface changed? Also, what format should the contents of vocab_path have?
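One way to check whether a constructor's interface has changed is to list its required parameters with the standard `inspect` module. The sketch below is only illustrative: `required_init_args` and `FakeSegmenter` are hypothetical names, and `FakeSegmenter` merely mirrors the signature that the traceback implies, not the actual ecws source.

```python
import inspect


def required_init_args(cls):
    """List the names of constructor parameters that have no default value."""
    params = inspect.signature(cls.__init__).parameters.values()
    positional = (inspect.Parameter.POSITIONAL_ONLY,
                  inspect.Parameter.POSITIONAL_OR_KEYWORD)
    return [p.name for p in params
            if p.name != "self"
            and p.kind in positional
            and p.default is inspect.Parameter.empty]


class FakeSegmenter:
    # Hypothetical stand-in mirroring the signature implied by the traceback.
    def __init__(self, model_path, vocab_path):
        self.model_path = model_path
        self.vocab_path = vocab_path


print(required_init_args(FakeSegmenter))  # ['model_path', 'vocab_path']
```

With ecws installed, calling `required_init_args(Segmenter)` on the real class should show which arguments the installed version expects.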

liefficient commented 2 years ago

Same here: __init__() missing 1 required positional argument: 'vocab_path'

campper commented 2 years ago

One moment, I'll take a look and reply as soon as possible.

ctrl-zzzzz commented 2 years ago

@liefficient @jmzhoulab Hello,

For now, vocab_path needs to point to the saved files of the official BertTokenizer. The steps are as follows:

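from transformers import BertTokenizer

path = 'path_to_save'  # directory that will hold the tokenizer files

# Download the official tokenizer and save its files to that directory
tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
tokenizer.save_pretrained(path)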

Then, when calling the interface, point vocab_path at path.
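Putting the steps together, a minimal sketch of the call might look like the following. It assumes that `Segmenter` accepts `vocab_path` as a keyword argument (the TypeError in the traceback suggests the parameter has that name) and that the saved directory must contain `vocab.txt`, which is the file `BertTokenizer.save_pretrained` writes. The helpers `missing_vocab_files` and `build_segmenter` are hypothetical names and are not part of ecws.

```python
from pathlib import Path


def missing_vocab_files(vocab_dir):
    """Return the expected tokenizer files that are absent from vocab_dir."""
    # BertTokenizer.save_pretrained() writes vocab.txt (one token per line).
    # Checking only for that file is an assumption about what ecws needs.
    expected = ["vocab.txt"]
    return [name for name in expected
            if not (Path(vocab_dir) / name).is_file()]


def build_segmenter(model_path, vocab_dir):
    """Create a Segmenter once the tokenizer directory looks complete."""
    missing = missing_vocab_files(vocab_dir)
    if missing:
        raise FileNotFoundError(f"{vocab_dir} is missing: {missing}")
    # Imported lazily so the check above can be used without ecws installed.
    from ecws.segment import Segmenter
    # Assumption: vocab_path can be passed by keyword.
    return Segmenter(model_path, vocab_path=str(vocab_dir))
```

With ecws installed and the tokenizer saved, `predict = build_segmenter('ecws.model', 'path_to_save')` followed by `predict.seg(sent)` would then replace the failing call from the original report.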

An updated version of the code will be released soon, with a cleaner calling structure and a model trained on a larger corpus.