microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Why do I get "Type 'seq(tensor(int64))' of operator (MemcpyFromHost) is invalid" when using onnxruntime.InferenceSession() on GPU, and how can I resolve it? Urgent, thanks! #10126

Open yuanhuachao opened 2 years ago

yuanhuachao commented 2 years ago

Describe the bug
When I run the exported ONNX model of transformers (the BARTBeamSearchGenerator model) on GPU, I get the error below. Does anyone know how to resolve it?

```
Traceback (most recent call last):
  File "run_onnx_exporter.py", line 262, in <module>
    main()
  File "run_onnx_exporter.py", line 258, in main
    export_and_validate_model(model, tokenizer, output_name, num_beams, max_length, device)
  File "run_onnx_exporter.py", line 177, in export_and_validate_model
    ort_sess = onnxruntime.InferenceSession(new_onnx_file_path, providers=['CUDAExecutionProvider'])
  File "/home/venv/lib/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/venv/lib/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 321, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'seq(tensor(int64))' of input parameter (best.1) of operator (MemcpyFromHost) in node (Memcpy_token_30) is invalid.
```
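For reference, the call that fails is essentially this (a minimal sketch; the model path below is hypothetical, in the script it comes from the export step):

```python
import onnxruntime

# Hypothetical path to the exported BARTBeamSearchGenerator model; in the
# example script this is the file produced by the export step.
new_onnx_file_path = "bart_beam_search.onnx"

# Raises INVALID_GRAPH here: the graph contains seq(tensor(int64)) values,
# and the MemcpyFromHost node inserted for the CUDA EP rejects that type.
ort_sess = onnxruntime.InferenceSession(
    new_onnx_file_path,
    providers=["CUDAExecutionProvider"],
)
```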

Urgency: this is very urgent.

System information

harshithapv commented 2 years ago

Please share the steps for reproducing the error.

yuanhuachao commented 2 years ago

Hi @harshithapv, I ran the Hugging Face transformers example (https://github.com/huggingface/transformers/tree/master/examples/onnx/pytorch/summarization) with the command `python run_onnx_exporter.py --model_name_or_path facebook/bart-base --device=cuda` in the directory transformers/examples/onnx/pytorch/summarization, and changed the onnxruntime.InferenceSession creation to use the CUDA execution provider: `ort_sess = onnxruntime.InferenceSession(new_onnx_file_path, providers=['CUDAExecutionProvider'])`.

hariharans29 commented 2 years ago

Not sure if it will fix all the issues, but support for sequence types on CUDA came after the 1.8.0 package you are using. As a first step, you will have to upgrade to 1.9, or preferably 1.10, to continue.
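A quick way to confirm what is actually installed before retrying (a sketch; the GPU build ships as the onnxruntime-gpu package):

```python
# Sketch: confirm the installed version and that the CUDA EP is available.
# Upgrade first, e.g.: pip install --upgrade onnxruntime-gpu==1.10.0
import onnxruntime

print(onnxruntime.__version__)                # expect 1.9.x or 1.10.x
print(onnxruntime.get_available_providers())  # should include 'CUDAExecutionProvider'
```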

yuanhuachao commented 2 years ago

> Not sure if it will fix all the issues, but support for sequence types on CUDA came after the 1.8.0 package you are using. As a first step, you will have to upgrade to 1.9, or preferably 1.10, to continue.

Hi @hariharans29, there's a new error (a core dump) when I use 1.10.0, like this one: https://github.com/huggingface/transformers/issues/14882. Do you know which version Hugging Face used when testing the script?
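For now I'm experimenting with also listing the CPU provider as a fallback, so nodes the CUDA provider cannot place (such as the sequence ops) run on CPU. This is just a sketch of what I'm trying, not something confirmed in this thread, and the model path is hypothetical:

```python
import onnxruntime

# Hypothetical path to the exported model, same as in the repro above.
new_onnx_file_path = "bart_beam_search.onnx"

# Providers are tried in order; unplaceable nodes fall back to the CPU EP.
ort_sess = onnxruntime.InferenceSession(
    new_onnx_file_path,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```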

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.