BrikerMan / Kashgari

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. It includes Word2Vec, BERT, and GPT2 language embeddings.
http://kashgari.readthedocs.io/
Apache License 2.0
2.4k stars, 441 forks

KeyError: '[PAD]' #435

Closed yuyunfeng666 closed 3 years ago

yuyunfeng666 commented 3 years ago

I want to try out the results using the pretrained model parameters directly, but I get a KeyError. Why is that?

from kashgari.tasks.labeling import BiLSTM_Model
from kashgari.embeddings import BertEmbedding
from kashgari.corpus import ChineseDailyNerCorpus

train_x, train_y = ChineseDailyNerCorpus.load_data('train')
valid_x, valid_y = ChineseDailyNerCorpus.load_data('valid')
test_x, test_y = ChineseDailyNerCorpus.load_data('test')
print(train_y[0])

bert_embed = BertEmbedding('Kashgari/bert-base-chinese')
model = BiLSTM_Model(bert_embed, sequence_length=100)
model.evaluate(test_x, test_y)

BrikerMan commented 3 years ago

This fails because the model and the vocabulary have not been built yet.

...

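bert_embed = BertEmbedding('Kashgari/bert-base-chinese')
# This only initializes the model object
model = BiLSTM_Model(bert_embed, sequence_length=100)
# Add this line to build the model and the vocabulary. See the fit method: it does the same thing as its first step during training.
model.build_model(train_x, train_y)
# After building, evaluate and predict can be used
model.evaluate(test_x, test_y)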
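
For readers wondering why the missing build step surfaces as a KeyError on '[PAD]', here is a minimal sketch of the general mechanism. It assumes the vocabulary is a plain token-to-id dictionary that stays empty until a build step fills it. ToyProcessor, build, transform, and try_transform are hypothetical names for illustration and are not Kashgari's actual classes or internals.

```python
# Toy illustration only: ToyProcessor is hypothetical and is not Kashgari's class.
from typing import Dict, List, Union


class ToyProcessor:
    """Maps tokens to integer ids; the vocabulary is empty until build() runs."""

    def __init__(self) -> None:
        self.vocab2idx: Dict[str, int] = {}

    def build(self, samples: List[List[str]]) -> None:
        # Reserve the special tokens first, then add every token seen in the data.
        for token in ["[PAD]", "[UNK]"]:
            self.vocab2idx.setdefault(token, len(self.vocab2idx))
        for sample in samples:
            for token in sample:
                self.vocab2idx.setdefault(token, len(self.vocab2idx))

    def transform(self, sample: List[str], seq_len: int) -> List[int]:
        # Padding looks up "[PAD]" directly, so an unbuilt vocabulary raises KeyError.
        pad_id = self.vocab2idx["[PAD]"]
        unk_id = self.vocab2idx["[UNK]"]
        ids = [self.vocab2idx.get(token, unk_id) for token in sample[:seq_len]]
        return ids + [pad_id] * (seq_len - len(ids))


def try_transform(
    processor: ToyProcessor, sample: List[str], seq_len: int
) -> Union[List[int], str]:
    """Return the padded ids, or a short error string if the lookup fails."""
    try:
        return processor.transform(sample, seq_len)
    except KeyError as err:
        return f"KeyError: {err}"


processor = ToyProcessor()
print(try_transform(processor, ["a", "b"], 4))  # vocabulary not built yet
processor.build([["a", "b"]])
print(try_transform(processor, ["a", "b"], 4))  # works after the build step
```

In this sketch the first call fails because nothing has registered the padding token yet, and the second succeeds once build has run, which mirrors the order of build_model followed by evaluate in the answer above.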
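
yuyunfeng666 commented 3 years ago

First of all, thank you very much for your answer. This part works now. One more question: if I want to use a different set of pretrained weights, do I only need to change the BertEmbedding('Kashgari/bert-base-chinese') part?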

BrikerMan commented 3 years ago

Yes, that is the only part you need to change.

yuyunfeng666 commented 3 years ago

OK, thank you.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.