BrikerMan / Kashgari

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. It includes Word2Vec, BERT, and GPT2 language embeddings.
http://kashgari.readthedocs.io/
Apache License 2.0
2.39k stars, 441 forks

[Question] Can a Kashgari model be converted to an ONNX model? I tried it but ran into this error #441

Closed Igoslow closed 3 years ago

Igoslow commented 3 years ago

You must follow the issue template and provide as much information as possible. Otherwise, this issue will be ignored and closed.

## Environment

kashgari                         1.0.0
kashgari-tf                      0.5.5
Keras                            2.3.1
Keras-Applications               1.0.8
keras-bert                       0.80.0
keras-embed-sim                  0.7.0
keras-gpt-2                      0.14.0
keras-layer-normalization        0.14.0
keras-multi-head                 0.22.0
keras-pos-embd                   0.11.0
keras-position-wise-feed-forward 0.6.0
Keras-Preprocessing              1.1.0
keras-self-attention             0.41.0
keras-transformer                0.31.0
keras2onnx                       1.6.0
kiwisolver                       1.1.0
Markdown                         3.1.1
MarkupSafe                       1.1.1
matplotlib                       3.1.2
mock                             3.0.5
modelarts                        1.1.3
netCDF4                          1.5.3
nltk                             3.4.5
numpy                            1.19.4
oauthlib                         3.1.0
onnx                             1.8.0
onnx-tf                          1.5.0
onnxconverter-common             1.7.0
onnxmltools                      1.7.0
onnxruntime                      1.4.0
onnxruntime-tools                1.4.0
onnxtk                           0.0.1
opencv-python                    4.2.0.32
opt-einsum                       3.1.0
packaging                        20.4
pandas                           0.25.3
Pillow                           7.0.0
pip                              20.1.1
prometheus-client                0.3.1
protobuf                         3.10.0
psutil                           5.4.6
py-cpuinfo                       7.0.0
py3nvml                          0.2.6
pyahocorasick                    1.4.0
pyasn1                           0.4.7
pyasn1-modules                   0.2.7
pycparser                        2.20
PyMySQL                          0.10.1
pyparsing                        2.4.6
pyreadline                       2.1
python-dateutil                  2.8.1
pytorch-pretrained-bert          0.6.1
pytz                             2019.3
PyYAML                           5.1.2
pyzmq                            20.0.0
regex                            2019.11.1
requests                         2.22.0
requests-oauthlib                1.2.0
rsa                              4.0
s3transfer                       0.2.1
sacremoses                       0.0.43
scikit-learn                     0.21.3
scipy                            1.4.1
sentencepiece                    0.1.91
seqeval                          0.0.10
setuptools                       41.6.0
six                              1.12.0
skl2onnx                         1.7.0
smart-open                       1.9.0
snownlp                          0.12.3
SQLAlchemy                       1.3.19
sqlparse                         0.3.1
tensorboard                      1.15.0
tensorflow                       1.15.0
tensorflow-addons                0.11.2
tensorflow-estimator             1.15.1
termcolor                        1.1.0
tf2onnx                          1.7.2
tokenizers                       0.8.1rc1
torch                            1.2.0+cpu
torchvision                      0.4.0+cpu
tqdm                             4.37.0
transformers                     2.0.0

## Question
When I use the onnx package, I get this error: `ValueError: Unknown layer: TokenEmbedding`.

Igoslow commented 3 years ago

When I change the way the model is loaded, I get an error in `onnxmltools.convert_keras` instead. The code is as follows:

    import onnxmltools
    from keras.models import load_model
    import kashgari

    # Raw string so the backslashes in the Windows path are kept literally
    input_keras_model = r'E:\KG\kashgari/train\gonghang_2020_12_10'

    # Change this path to the output name and path for the ONNX model
    output_onnx_model = 'kashgari_model.onnx'

    # Load your Keras model
    loaded_model = kashgari.utils.load_model(input_keras_model)

    onnx_model = onnxmltools.convert_keras(loaded_model.tf_model)

    # Save as protobuf
    onnxmltools.utils.save_model(onnx_model, output_onnx_model)

The error is as follows:

    Traceback (most recent call last):
      File "E:/KG/kashgari/test_kashgari_onnx.py", line 26, in <module>
        onnx_model = onnxmltools.convert_keras(loaded_model.tf_model)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\onnxmltools\convert\main.py", line 33, in convert_keras
        return convert(model, name, doc_string, target_opset, channel_first_inputs)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\keras2onnx\main.py", line 82, in convert_keras
        parse_graph(topology, tf_graph, target_opset, output_names)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\keras2onnx\parser.py", line 828, in parse_graph
        return _parse_graph_core(graph, keras_layer_ts_map, topo, top_level, output_names)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\keras2onnx\parser.py", line 746, in _parse_graph_core
        _on_parsing_tf_subgraph(graph, nodes, varset)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\keras2onnx\parser.py", line 402, in _on_parsing_tf_subgraph
        subgraph, replacement = create_subgraph(graph, node_list, sess, operator.full_name)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\keras2onnx\subgraph.py", line 170, in create_subgraph
        tf.import_graph_def(output_graph_def, name=im_scope)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
        return func(*args, **kwargs)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\tensorflow_core\python\framework\importer.py", line 405, in import_graph_def
        producer_op_list=producer_op_list)
      File "C:\Users\Learn\Envs\jiansuo1\lib\site-packages\tensorflow_core\python\framework\importer.py", line 505, in _import_graph_def_internal
        raise ValueError(str(e))
    ValueError: Input 0 of node TFNodes50/Embedding-Token/embedding_lookup was passed float from TFNodes50/Embedding-Token/embeddings:0 incompatible with expected resource.

    Process finished with exit code 1

Could you help me?

BrikerMan commented 3 years ago

I haven't tried ONNX. Maybe try converting the model on Colab first, so you can rule out an environment issue.
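
One way to rule out an environment issue is to print the versions of the conversion-related packages in both the local environment and Colab, then compare the output. The following is only a minimal sketch: the names `CONVERSION_PACKAGES` and `collect_versions` are hypothetical helpers made up for this example, and the package list is an assumption taken from the environment listing in this issue.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: on older interpreters the importlib_metadata backport is installed.
    import importlib_metadata as metadata

# Hypothetical selection of packages that take part in the ONNX conversion,
# picked from the environment listing in this issue.
CONVERSION_PACKAGES = [
    "kashgari",
    "tensorflow",
    "Keras",
    "keras-bert",
    "keras2onnx",
    "onnx",
    "onnxmltools",
    "tf2onnx",
]


def collect_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in collect_versions(CONVERSION_PACKAGES).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Running the script in both places and diffing the two outputs shows which packages differ before any conversion code is changed.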

Igoslow commented 3 years ago

Thank you very much! I get it.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.