PINTO0309 / onnx2tf

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

Error when trying to run converted model; gpt2 tensorflow from onnx #591 follow up #595

Closed: flores-o closed this issue 7 months ago

flores-o commented 7 months ago

Issue Type

Others

OS

Linux

onnx2tf version number

1.19.11

onnx version number

1.15.0

onnxruntime version number

1.16.3

onnxsim (onnx_simplifier) version number

0.4.33

tensorflow version number

2.14.0

Download URL for ONNX

https://huggingface.co/openai-community/gpt2/resolve/main/onnx/decoder_model.onnx?download=true

Parameter Replacement JSON

N/A

Description

  1. Purpose: Research & product development. Thank you for the previous support in converting the model.
  2. What:

I converted the gpt2 ONNX model to TensorFlow by running !onnx2tf -i /content/model.onnx -b 1 -osd, but I am not able to run the newly converted TF model (a rough Python equivalent of the conversion is sketched below).
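For reference, this is roughly what the CLI call above does, driven from Python instead. This is a minimal sketch, assuming onnx2tf's convert() entry point exposes keyword arguments mirroring the -b and -osd flags:

```python
import onnx2tf

# Convert the ONNX decoder to a TensorFlow SavedModel.
# batch_size / output_signaturedefs are assumed to mirror the -b / -osd CLI flags.
onnx2tf.convert(
    input_onnx_file_path="/content/model.onnx",
    output_folder_path="saved_model",
    batch_size=1,               # -b 1: fix the batch dimension to 1
    output_signaturedefs=True,  # -osd: export a serving_default signature
)
```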

When running

from transformers import GPT2Tokenizer  # required import
import tensorflow as tf                 # required import

# Load the tokenizer
tokenizer_name = 'gpt2'
tokenizer = GPT2Tokenizer.from_pretrained(tokenizer_name)

# Tokenize an example prompt (placeholder; any input string)
text = "Hello, my dog is cute"
inputs = tokenizer(text, return_tensors='tf')

# Cast the input_ids and attention_mask to int64 to match the model's expected input types
input_ids = tf.cast(inputs["input_ids"], tf.int64)
attention_mask = tf.cast(inputs["attention_mask"], tf.int64)

# Load the TensorFlow model from the SavedModel directory
model_path = '/content/saved_model'
model = tf.saved_model.load(model_path)

# Access the serving function from the loaded model
infer = model.signatures['serving_default']

# Run the model
# Ensure that the names of the input parameters match those expected by the model
output = infer(input_ids=input_ids, attention_mask=attention_mask)

I get the following error

---> 26 output = infer(input_ids=input_ids, attention_mask=attention_mask)
     27 
     28 # Print the output

8 frames
/usr/local/lib/python3.10/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     51   try:
     52     ctx.ensure_initialized()
---> 53     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     54                                         inputs, attrs, num_outputs)
     55   except core._NotOkStatusException as e:

InvalidArgumentError: Graph execution error:

Detected at node model_39/tf.split/split defined at (most recent call last):
<stack traces unavailable>
Determined shape must either match input shape along split_dim exactly if fully specified, or be less than the size of the input along split_dim if not fully specified.  Got: 2304
     [[{{node model_39/tf.split/split}}]] [Op:__inference_signature_wrapper_386979]
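For context, 2304 is the width of GPT-2's fused QKV projection (3 × 768 for the base model), which the exported graph splits into query, key, and value. The same class of InvalidArgumentError can be reproduced with a toy tf.split whose fixed split sizes do not match the input's size along the split axis (an assumed analogy, not the actual converted graph):

```python
import tensorflow as tf

# The fused QKV projection in gpt2-base has width 3 * 768 = 2304; splitting it
# into three equal parts works when the incoming tensor really is 2304 wide:
x = tf.zeros([1, 8, 2304])
q, k, v = tf.split(x, num_or_size_splits=3, axis=-1)

# If fixed split sizes totalling 2304 are baked into the graph but the tensor
# reaching the split has a different width, the same error message appears:
y = tf.zeros([1, 8, 1024])
try:
    tf.split(y, num_or_size_splits=[768, 768, 768], axis=-1)
except (tf.errors.InvalidArgumentError, ValueError) as e:
    print(e)  # "...must either match input shape along split_dim... Got: 2304"
```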
  3. How: I tried padding the input to a fixed size of 1024 or 2304, but it did not help.

  4. Why: To compare the performance of in-browser inference when running transformer models with ONNX Runtime Web, TensorFlow.js, Web LLM, etc.

  5. Resources

(Screenshot attached: 2024-03-30 at 3:42 PM)
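For reference, the shapes and dtypes the exported signature was traced with can be dumped directly from the SavedModel, which should show whether a fixed sequence length was baked in. A short check, assuming the same /content/saved_model path:

```python
import tensorflow as tf

model = tf.saved_model.load("/content/saved_model")
infer = model.signatures["serving_default"]

# TensorSpecs (names, shapes, dtypes) the serving signature expects;
# a fixed sequence length here would explain why arbitrary-length prompts fail.
print(infer.structured_input_signature)
print(infer.structured_outputs)
```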
github-actions[bot] commented 7 months ago

If there is no activity within the next two days, this issue will be closed automatically.