Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.
MIT License
Error when trying to run converted model; gpt2 tensorflow from onnx #591 follow up #595
Search for "Converted TF model throws error" in the reproduction notebook (linked in the Description section below) to find the error described below.
Purpose
Research & Product Development. Thank you for the previous support in converting the model.
What
I converted the gpt2 ONNX model to TensorFlow by running !onnx2tf -i /content/model.onnx -b 1 -osd, but I am not able to run the newly converted TF model. The full conversion cell is sketched below.
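For reference, the conversion cell presumably looked roughly like this; only the onnx2tf invocation appears in the report, so the download step is an assumption (the URL is taken from the "Download URL for ONNX" field below):

# Fetch the GPT-2 decoder ONNX model (download step assumed, not shown in the report)
!wget -O /content/model.onnx "https://huggingface.co/openai-community/gpt2/resolve/main/onnx/decoder_model.onnx?download=true"
# -i: input ONNX file; -b 1: fix the batch dimension to 1;
# -osd: also export a TF SavedModel with signature defs (serving_default)
!onnx2tf -i /content/model.onnx -b 1 -osd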
When running
# Imports needed to run this snippet (not shown in the original report)
import tensorflow as tf
from transformers import GPT2Tokenizer

text = "Hello, my dog is cute"  # placeholder prompt; the report does not show the input text

# Load the tokenizer
tokenizer_name = 'gpt2'
tokenizer = GPT2Tokenizer.from_pretrained(tokenizer_name)
inputs = tokenizer(text, return_tensors='tf')
# Cast input_ids and attention_mask to int64 to match the model's expected input types
input_ids = tf.cast(inputs["input_ids"], tf.int64)
attention_mask = tf.cast(inputs["attention_mask"], tf.int64)
# Load the TensorFlow model from the SavedModel directory
model_path = '/content/saved_model'
model = tf.saved_model.load(model_path)
# Access the serving function from the loaded model
infer = model.signatures['serving_default']
# Run the model; the input parameter names must match those expected by the signature
output = infer(input_ids=input_ids, attention_mask=attention_mask)
I get the following error
---> 26 output = infer(input_ids=input_ids, attention_mask=attention_mask)
27
28 # Print the output
/usr/local/lib/python3.10/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
51 try:
52 ctx.ensure_initialized()
---> 53 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
54 inputs, attrs, num_outputs)
55 except core._NotOkStatusException as e:
InvalidArgumentError: Graph execution error:
Detected at node model_39/tf.split/split defined at (most recent call last):
<stack traces unavailable>
Determined shape must either match input shape along split_dim exactly if fully specified, or be less than the size of the input along split_dim if not fully specified. Got: 2304
[[{{node model_39/tf.split/split}}]] [Op:__inference_signature_wrapper_386979]
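One way to narrow this down (a minimal diagnostic sketch, not from the original report): print the serving signature to see the exact input names, shapes, and dtypes that onnx2tf baked into the SavedModel, since -b 1 fixes the batch dimension and may also have pinned other dimensions.

import tensorflow as tf

# Diagnostic sketch: inspect the static shapes/dtypes the SavedModel expects
model = tf.saved_model.load('/content/saved_model')  # path from the report
infer = model.signatures['serving_default']
print(infer.structured_input_signature)  # expected input names, shapes, dtypes
print(infer.structured_outputs)          # output names and shapes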
How
I tried padding the input to a fixed length of 1024 or 2304, but it did not help (see the sketch below).
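The padding attempt presumably looked something like the following sketch; the choice of 1024 is a placeholder (the report also tried 2304), and GPT-2's tokenizer ships without a pad token, so one must be assigned first:

# Sketch of the padding attempt described above
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
inputs = tokenizer(
    text,
    return_tensors='tf',
    padding='max_length',
    max_length=1024,  # also tried 2304; neither resolved the error
)

For what it's worth, 2304 matches the width of GPT-2's fused QKV projection (3 × 768), which the graph divides with tf.split, so the mismatch is likely in the split axis rather than the sequence length.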
Why
To compare the performance of in-browser inference when running transformer models with ONNX Runtime Web, TensorFlow.js, WebLLM, etc.
Issue Type
Others
OS
Linux
onnx2tf version number
1.19.11
onnx version number
1.15.0
onnxruntime version number
1.16.3
onnxsim (onnx_simplifier) version number
0.4.33
tensorflow version number
2.14.0
Download URL for ONNX
https://huggingface.co/openai-community/gpt2/resolve/main/onnx/decoder_model.onnx?download=true
Parameter Replacement JSON
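(None was provided in the report. For reference, onnx2tf consumes a replacement file of roughly the following shape; the op_name, param_target, param_name, and values below are hypothetical and not taken from this model.)

{
  "format_version": 1,
  "operations": [
    {
      "op_name": "example_split_op",
      "param_target": "inputs",
      "param_name": "split",
      "values": [768, 768, 768]
    }
  ]
}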
Description
To reproduce the error, run the following notebook: https://colab.research.google.com/drive/1pBnJpC4613cUoNl-k_rdHeGm3pnFCuds?usp=sharing
Search for "Converted TF model throws error" in the notebook to find the error described above.