BrikerMan / Kashgari

Kashgari is a production-level NLP transfer-learning framework built on top of tf.keras for text labeling and text classification, including Word2Vec, BERT, and GPT2 language embeddings.
http://kashgari.readthedocs.io/
Apache License 2.0

Error when using ALBERT as the embedding #486

Open hwq458362228 opened 2 years ago

hwq458362228 commented 2 years ago

You must follow the issue template and provide as much information as possible; otherwise, this issue will be closed.

Check List

Thanks for considering opening an issue. Before you submit your issue, please confirm these boxes are checked.

You can post pictures, but if specific text or code is required to reproduce the issue, please provide the text in a plain text format for easy copy/paste.

Environment



## Question

I ran the ALBERT-embedding example from the Kashgari 2.0 documentation. The ALBERT model is `albert_base_zh_additional_36k_steps`, downloaded from the official site; I also tried other ALBERT models, but they all raise the same error. Using BERT works fine, and I don't understand why.

2022-04-06 22:03:26,896 [DEBUG] kashgari - ------------------------------------------------
2022-04-06 22:03:26,896 [DEBUG] kashgari - Loaded transformer model's vocab
2022-04-06 22:03:26,896 [DEBUG] kashgari - config_path       : E://文本分类/albert_base_zh_additional_36k_steps\albert_config_base.json
2022-04-06 22:03:26,896 [DEBUG] kashgari - vocab_path      : E://文本分类/albert_base_zh_additional_36k_steps\vocab.txt
2022-04-06 22:03:26,896 [DEBUG] kashgari - checkpoint_path : E://文本分类/albert_base_zh_additional_36k_steps\albert_model.ckpt
2022-04-06 22:03:26,896 [DEBUG] kashgari - Top 50 words    : ['[PAD]', '[unused1]', '[unused2]', '[unused3]', '[unused4]', '[unused5]', '[unused6]', '[unused7]', '[unused8]', '[unused9]', '[unused10]', '[unused11]', '[unused12]', '[unused13]', '[unused14]', '[unused15]', '[unused16]', '[unused17]', '[unused18]', '[unused19]', '[unused20]', '[unused21]', '[unused22]', '[unused23]', '[unused24]', '[unused25]', '[unused26]', '[unused27]', '[unused28]', '[unused29]', '[unused30]', '[unused31]', '[unused32]', '[unused33]', '[unused34]', '[unused35]', '[unused36]', '[unused37]', '[unused38]', '[unused39]', '[unused40]', '[unused41]', '[unused42]', '[unused43]', '[unused44]', '[unused45]', '[unused46]', '[unused47]', '[unused48]', '[unused49]']
2022-04-06 22:03:26,896 [DEBUG] kashgari - ------------------------------------------------
Preparing text vocab dict: 100%|██████████| 2/2 [00:00<?, ?it/s]
Preparing text vocab dict: 100%|██████████| 1/1 [00:00<?, ?it/s]
2022-04-06 22:03:26,912 [DEBUG] kashgari - --- Build vocab dict finished, Total: 5 ---
2022-04-06 22:03:26,912 [DEBUG] kashgari - Top-10: ['[PAD]', '[UNK]', '[CLS]', '[SEP]', 'the']
Preparing classification label vocab dict: 100%|██████████| 2/2 [00:00<?, ?it/s]
Preparing classification label vocab dict: 100%|██████████| 1/1 [00:00<?, ?it/s]
Traceback (most recent call last):

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 70, in get_tensor
    self, compat.as_bytes(tensor_str))

RuntimeError: Key bert/encoder/embedding_hidden_mapping_in/kernel not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "E:\文本分类\个人情感\csv\albert_test.py", line 55, in <module>
    batch_size=32

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\kashgari\tasks\classification\abc_model.py", line 208, in fit
    fit_kwargs=fit_kwargs)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\kashgari\tasks\classification\abc_model.py", line 240, in fit_generator
    self.build_model_generator([g for g in [train_sample_gen, valid_sample_gen] if g])

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\kashgari\tasks\classification\abc_model.py", line 114, in build_model_generator
    self.embedding.setup_text_processor(self.text_processor)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\kashgari\embeddings\abc_embedding.py", line 63, in setup_text_processor
    self.build_embedding_model(vocab_size=processor.vocab_size)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\kashgari\embeddings\transformer_embedding.py", line 92, in build_embedding_model
    return_keras_model=True)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\bert4keras\models.py", line 2717, in build_transformer_model
    transformer.load_weights_from_checkpoint(checkpoint_path)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\bert4keras\models.py", line 310, in load_weights_from_checkpoint
    raise e

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\bert4keras\models.py", line 304, in load_weights_from_checkpoint
    values.append(self.load_variable(checkpoint, v))

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\bert4keras\models.py", line 707, in load_variable
    variable = super(BERT, self).load_variable(checkpoint, name)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\bert4keras\models.py", line 275, in load_variable
    return tf.train.load_variable(checkpoint, name)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\checkpoint_utils.py", line 85, in load_variable
    return reader.get_tensor(name)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 74, in get_tensor
    error_translator(e)

  File "C:\Users\hwq45\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 35, in error_translator
    raise errors_impl.NotFoundError(None, None, error_message)

NotFoundError: Key bert/encoder/embedding_hidden_mapping_in/kernel not found in checkpoint
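The `NotFoundError` says the loader looked up a BERT-style variable name that the ALBERT checkpoint does not contain. A likely cause (an assumption, not confirmed in this thread) is that the embedding was built with the default `model_type='bert'`, so bert4keras mapped BERT-style names onto an ALBERT checkpoint. A minimal sketch, assuming Kashgari 2.x's `TransformerEmbedding` signature; `build_albert_embedding` and `model_dir` are hypothetical names for illustration:

```python
import tensorflow as tf

def build_albert_embedding(model_dir: str):
    """Hypothetical fix sketch: tell Kashgari the checkpoint is ALBERT.

    model_type='albert' is passed through to bert4keras's
    build_transformer_model, so it maps ALBERT-style variable names
    instead of the BERT-style names that failed in the traceback above.
    """
    # Imported here so the checkpoint-inspection demo below still runs
    # on machines without Kashgari installed.
    from kashgari.embeddings import TransformerEmbedding
    return TransformerEmbedding(
        vocab_path=f'{model_dir}/vocab.txt',
        config_path=f'{model_dir}/albert_config_base.json',
        checkpoint_path=f'{model_dir}/albert_model.ckpt',
        model_type='albert',  # default is 'bert' -- the likely cause here
    )

# To confirm which variable names a checkpoint really contains, list them.
# With the real model you would pass its albert_model.ckpt prefix;
# a toy checkpoint stands in here so the snippet runs anywhere:
ckpt = tf.train.Checkpoint(kernel=tf.Variable([1.0]))
prefix = ckpt.save('/tmp/toy_ckpt/ckpt')
names = [name for name, shape in tf.train.list_variables(prefix)]
print(names)
```

Running `tf.train.list_variables` against the real `albert_model.ckpt` shows whether a key such as `bert/encoder/embedding_hidden_mapping_in/kernel` actually exists, which quickly distinguishes a wrong `model_type` from an incompatible checkpoint.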
stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.