Igoslow opened this issue 3 years ago
Please help me, thank you.
@Igoslow, could you add more information about your issue? Are you converting a TF or a PyTorch model? For model conversion, tf2onnx and the PyTorch ONNX exporter are better places.
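For reference, a TF2/Keras model can usually be converted with tf2onnx's Python API along these lines (a minimal sketch; it assumes a recent tf2onnx, and `my_keras_model` is a stand-in for your own `tf.keras.Model`):

```python
import tensorflow as tf
import tf2onnx

# Hypothetical input signature for illustration; adjust names/shapes to your model.
spec = (tf.TensorSpec((None, 128), tf.int32, name="input_ids"),)

# my_keras_model is a placeholder for your own tf.keras.Model instance.
model_proto, _ = tf2onnx.convert.from_keras(
    my_keras_model, input_signature=spec, opset=11, output_path="model.onnx")
```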
Thank you for your reply. I am converting a TF BERT model. My source code is below.
```python
enable_overwrite = False
total_runs = 100
max_sequence_length = 512

import os
cache_dir = './cache_models'
output_dir = './onnx_models'
for directory in [cache_dir, output_dir]:
    if not os.path.exists(directory):
        os.makedirs(directory)

from transformers import BertTokenizer
model_name_or_path = "bert-base-cased"
tokenizer = BertTokenizer.from_pretrained(model_name_or_path, do_lower_case=True, cache_dir=cache_dir)

from transformers.modeling_bert import BertForPreTraining  # , BertForQuestionAnswering, BertModel
from transformers.modeling_bert import BertConfig
config = BertConfig.from_json_file('./chinese_L-12_H-768_A-12/bert_config.json')
model = BertForPreTraining.from_pretrained('./chinese_L-12_H-768_A-12/bert_model.ckpt.index',
                                           config=config, from_tf=True)
model._saved_model_inputs_spec = None

import time
import keras2onnx
output_model_path = os.path.join(output_dir, 'keras_{}.onnx'.format(model_name_or_path))

if enable_overwrite or not os.path.exists(output_model_path):
    start = time.time()
    onnx_model = keras2onnx.convert_keras(model, 'Bert', target_opset=10)  # model.name
    keras2onnx.save_model(onnx_model, output_model_path)
    print("Keras2onnx run time = {} s".format(format(time.time() - start, '.2f')))
```
Hi @Igoslow,
Morgan from the Hugging Face team here 👋.
I suspect you're combining two incompatible things:
```python
config = BertConfig.from_json_file('./chinese_L-12_H-768_A-12/bert_config.json')
model = BertForPreTraining.from_pretrained('./chinese_L-12_H-768_A-12/bert_model.ckpt.index', config=config, from_tf=True)
```
=> This will give you a PyTorch module that you can use for inference/training.
```python
onnx_model = keras2onnx.convert_keras(model, 'Bert', target_opset=10)  # model.name
keras2onnx.save_model(onnx_model, output_model_path)
```
=> But keras2onnx expects a Keras/TensorFlow model, so passing it this PyTorch module won't work.
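(Aside: if you'd rather stay with the PyTorch module, PyTorch's own ONNX exporter is the usual route. A minimal sketch, where the dummy shapes and output names are illustrative rather than taken from your setup:)

```python
import torch

# Dummy tensors just to trace the graph; shapes are illustrative.
input_ids = torch.ones(1, 128, dtype=torch.long)
attention_mask = torch.ones(1, 128, dtype=torch.long)
token_type_ids = torch.zeros(1, 128, dtype=torch.long)

torch.onnx.export(
    model,  # the PyTorch BertForPreTraining loaded above
    (input_ids, attention_mask, token_type_ids),
    "bert_for_pretraining.onnx",
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["prediction_logits", "seq_relationship_logits"],  # illustrative names
    opset_version=11,
    dynamic_axes={name: {0: "batch", 1: "sequence"}
                  for name in ["input_ids", "attention_mask", "token_type_ids"]},
)
```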
Possible solution:
If you want to make it work without too many changes to your code, you should replace those lines with:
```python
from transformers.modeling_tf_bert import TFBertForPreTraining  # , TFBertForQuestionAnswering, TFBertModel
model = TFBertForPreTraining.from_pretrained('./chinese_L-12_H-768_A-12/bert_model.ckpt.index', config=config)
```
By doing this, you'll be loading the checkpoint into a TensorFlow object, which will be handled correctly by keras2onnx.
Thank you very much, I almost understand. I tried your advice, but I ran into other problems, as below.
```
Traceback (most recent call last):
  File "E:/roberta-bert-finetune/bert_tensorflow_onnx.py", line 24, in <module>
```
Yes, I think keras2onnx needs some more steps; in particular, it needs some inputs to trace the model. Can you try something like this:
```python
tokenizer = BertTokenizer.from_pretrained("<path/to/your/tokenizer>")
tokens = tokenizer("Some input data")

# Run one forward pass so the Keras model is built/traced before conversion.
model.model.predict(tokens.data)
onnx_model = convert_keras(model.model, model.model.name, target_opset=10)
```
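If you don't have real tokenizer inputs handy, here is an alternative sketch, assuming your transformers version exposes the built-in `dummy_inputs` dict on TF models:

```python
# Build/trace the Keras graph once with the library's built-in dummy inputs
# (TFPreTrainedModel.dummy_inputs; availability depends on your transformers version).
model(model.dummy_inputs)
onnx_model = convert_keras(model, model.name, target_opset=10)
```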
I followed your advice, but I get the same problem: `NotImplementedError: Weights may only be loaded based on topology into Models when loading TensorFlow-formatted weights (got by_name=True to load_weights).` My code is below.
```python
enable_overwrite = False
total_runs = 100
max_sequence_length = 512

import os
cache_dir = './cache_models'
output_dir = './onnx_models'
for directory in [cache_dir, output_dir]:
    if not os.path.exists(directory):
        os.makedirs(directory)

from transformers import (TFBertForPreTraining, BertConfig, BertTokenizer,
                          BertForPreTraining, TFBertModel, TFBertForQuestionAnswering)

tokenizer = BertTokenizer.from_pretrained('./chinese_L-12_H-768_A-12')
model_name_or_path = "orin_model/bert_model.ckpt.index"
config = BertConfig.from_json_file('./chinese_L-12_H-768_A-12/bert_config.json')
model = TFBertForPreTraining.from_pretrained(model_name_or_path, config=config)  # , from_pt=True
model._saved_model_inputs_spec = None

import time
import keras2onnx
output_model_path = os.path.join(output_dir, 'keras_{}.onnx'.format(model_name_or_path))
if enable_overwrite or not os.path.exists(output_model_path):
    start = time.time()
    onnx_model = keras2onnx.convert_keras(model, model.name, target_opset=10)
    keras2onnx.save_model(onnx_model, output_model_path)
    print("Keras2onnx run time = {} s".format(format(time.time() - start, '.2f')))
```
The main parts of my environment are as follows:

```
Keras                 2.1.6
keras2onnx            1.6.0
onnx                  1.5.0
onnxconverter-common  1.7.0
onnxruntime           1.5.1
onnxruntime-tools     1.4.0
tensorflow            2.3.0
tensorflow-estimator  2.3.0
termcolor             1.1.0
tokenizers            0.8.1rc1
torch                 1.6.0+cpu
torchvision           0.7.0+cpu
```
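One workaround that is sometimes suggested for original Google-style TF1 checkpoints (a sketch, not tested against these files): load the checkpoint into PyTorch with `from_tf=True`, save it in transformers' native format, then reload it as a TF2 model with `from_pt=True` before running keras2onnx:

```python
from transformers import BertConfig, BertForPreTraining, TFBertForPreTraining

# Step 1: load the original TF1 checkpoint into a PyTorch model.
config = BertConfig.from_json_file('./chinese_L-12_H-768_A-12/bert_config.json')
pt_model = BertForPreTraining.from_pretrained(
    './chinese_L-12_H-768_A-12/bert_model.ckpt.index', config=config, from_tf=True)

# Step 2: save in transformers' native format ('./converted_bert' is a placeholder path).
pt_model.save_pretrained('./converted_bert')

# Step 3: reload as a TF2/Keras model, which keras2onnx can consume.
tf_model = TFBertForPreTraining.from_pretrained('./converted_bert', from_pt=True)
```

The reloaded `tf_model` can then be passed to `keras2onnx.convert_keras` as in the snippets above.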
**Describe the bug**
Could a BERT checkpoint be converted to an ONNX model? I have a bug: `'BertForPreTraining' object has no attribute 'layers, output'`.