Kashgari is a production-level NLP transfer-learning framework built on top of tf.keras for text labeling and text classification; it includes Word2Vec, BERT, and GPT-2 language embeddings.
You must follow the issue template and provide as much information as possible; otherwise, this issue will be closed.
Check List
Thanks for considering opening an issue. Before you submit your issue, please confirm these boxes are checked.
You can post pictures, but if specific text or code is required to reproduce the issue, please provide the text in a plain text format for easy copy/paste.
[√] I have searched the existing issues but did not find the same one.
When fitting the model, it raises: layer_crf does not support masking, but was passed an input_mask: Tensor("non_masking_layer/Identity_1:0"
This is strange, because none of the Kashgari code has been modified.
As far as I can see, BertEmbedding contains a custom NonMaskingLayer. I commented it out and used BERT's embed_model.output directly, but model fit still fails with: layer_crf does not support masking, but was passed an input_mask: Tensor("Encoder-Output/All:0"
From `kashgari.embedding.bert_embedding.py`:

```python
def _build_model(self, **kwargs):
    if self.embed_model is None:
        seq_len = self.sequence_length
        if isinstance(seq_len, tuple):
            seq_len = seq_len[0]
        if isinstance(seq_len, str):
            logging.warning(f"Model will be built until sequence length is determined")
            return
        config_path = os.path.join(self.model_folder, 'bert_config.json')
        check_point_path = os.path.join(self.model_folder, 'bert_model.ckpt')
        bert_model = keras_bert.load_trained_model_from_checkpoint(config_path,
                                                                   check_point_path,
                                                                   seq_len=seq_len,
                                                                   output_layer_num=self.layer_nums,
                                                                   training=self.training,
                                                                   trainable=self.trainable)
```
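For context, the error means a Keras mask tensor is still reaching the CRF layer. A layer like Kashgari's NonMaskingLayer is supposed to stop mask propagation by returning `None` from `compute_mask`. This is a minimal sketch of that pattern (my own illustration, not the library's exact code):

```python
import tensorflow as tf


class NonMaskingLayer(tf.keras.layers.Layer):
    """Identity layer that consumes an incoming Keras mask.

    Downstream layers (e.g. a CRF that does not support masking)
    will receive mask=None instead of the upstream mask tensor.
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Accept a mask from upstream without raising an error.
        self.supports_masking = True

    def compute_mask(self, inputs, mask=None):
        # Drop the mask so it does not propagate further.
        return None

    def call(self, inputs, mask=None):
        # Pass the data through unchanged.
        return inputs
```

If the CRF layer still reports an input_mask after such a layer, it suggests the mask is being re-introduced (or the non-masking layer is bypassed) somewhere between the embedding output and the CRF.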
Environment
The versions of all the packages should be fine.
Issue Description
I am just running a simple Chinese sequence-labeling test: calling the BiLSTM_CRF model and using BertEmbedding to load bert_chinese_base.
Does anyone know what is wrong? Even this simple, direct use of the model will not run. T_T