Closed OriAlpha closed 2 years ago
I guess it is because of external data. Previously, the encoder ONNX and decoder ONNX might have been in different folders. After merging them into one ONNX model, ORT now looks for the external data files in the same folder. If the encoder and decoder have the same external data filename, there will be a conflict.
The solution is to make sure they use different files to store their external data, like:
mt5_encoder.onnx
mt5_encoder.onnx.data (This is the external data file for encoder)
mt5_decoder.onnx
mt5_decoder.onnx.data (This is the external data file for decoder)
To achieve that, we can use the onnx API to save the encoder/decoder ONNX models like this (and likewise for the decoder, with the decoder paths):
model = onnx.load_model("mt5_encoder.onnx", load_external_data=True)
onnx.save_model(model, "mt5_encoder.onnx", save_as_external_data=True, all_tensors_to_one_file=True, location="mt5_encoder.onnx.data", size_threshold=1024, convert_attribute=False)
After that, you can run convert_beam_search.py if needed.
I converted the model as you specified, but it still gives an issue:
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Deserialize tensor onnx::MatMul_4720 failed.tensorprotoutils.cc:145 GetExternalDataInfo TensorProto: onnx::MatMul_4720 external data size mismatch. Computed size: 4194304, external_data.length: 11534336
Additional info: decoder_init.onnx.data is around 3.5 GB.
If you want to try the model, please use https://huggingface.co/google/mt5-large/tree/main
I even tried testing on different systems, and it still gives the same issue:
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Deserialize tensor onnx::MatMul_4498 failed.tensorprotoutils.cc:145 GetExternalDataInfo TensorProto external data size mismatch. Computed size: 11534336, external_data.length: 4194304
This happens only for the decoder module; the encoder and decoder_init sessions are created without any issues.
@tianleiwu, do you suggest any way to fix these issues?
@OriAlpha, you can try getting the source of my pull request 11958 and running the following for the ONNX conversion (you need to install the nightly package or build a wheel from source for model inference):
python convert_beam_search.py -m google/mt5-large --model_type mt5 --output mt5-large-beamsearch.onnx -e
I can run the mt5-large model. There is still a parity issue (max diffs are close to 1e-2) that needs the PyTorch exporter team to take a look.
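For reference, a parity check of this kind usually reduces to comparing the maximum absolute difference between the PyTorch and ORT outputs. A sketch with synthetic stand-ins (torch_logits and ort_logits are placeholders, not real model outputs):

```python
import numpy as np

# Synthetic stand-ins for the PyTorch and onnxruntime logits.
rng = np.random.default_rng(0)
torch_logits = rng.standard_normal((2, 8)).astype(np.float32)
ort_logits = torch_logits + rng.uniform(-5e-3, 5e-3, torch_logits.shape).astype(np.float32)

# A max diff near 1e-2 is the parity gap described above; exact match would be ~1e-6.
max_diff = float(np.max(np.abs(torch_logits - ort_logits)))
print(f"max diff: {max_diff:.4f}")
```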
Thanks for looking into it, it's working.
Describe the bug
I have an mt5 model converted from PyTorch to ONNX, and when I try to load it into onnxruntime I get the following error. Maybe it is due to a limit specified somewhere in the code, but I was not able to find it. Any ideas on what to change?
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Deserialize tensor onnx::MatMul_4622 failed.tensorprotoutils.cc:637 TensorProtoToTensor External initializer: onnx::MatMul_4622 offset: 0 size to read: 11534336 given file_length: 4194304 are out of bounds or can not be read in full.
System information