ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
https://els-rd.github.io/transformer-deploy/
Apache License 2.0
1.66k stars · 151 forks

Error Nodes in a graph must be topologically sorted, however input 'encoder_hidden_states' of node: name: MatMul_173_input_cast0 OpType: Cast is not output of any previous nodes #97

Open dheerajiiitv opened 2 years ago

dheerajiiitv commented 2 years ago
---------------------------------------------------------------------------
InvalidGraph                              Traceback (most recent call last)
Input In [22], in <cell line: 6>()
      4 pytorch_model = pytorch_model.eval()
      5 model_decoder = model_decoder.eval()
----> 6 dec_onnx = create_model_for_provider(dec_if_fp16_model_path, "CUDAExecutionProvider", log_severity=3)
      7 dec_onnx_binding: IOBinding = dec_onnx.io_binding()

File ~/onnxvenv/lib/python3.9/site-packages/transformer_deploy/backends/ort_utils.py:82, in create_model_for_provider(path, provider_to_use, nb_threads, nb_instances, optimization_level, enable_profiling, log_severity)
     80     if nb_instances > 1:
     81         options.inter_op_num_threads = nb_instances
---> 82 return InferenceSession(path, options, providers=provider_to_use)

File ~/dheeraj/onnx_experiments/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py:344, in InferenceSession.__init__(self, path_or_bytes, sess_options, providers, provider_options, **kwargs)
    341 disabled_optimizers = kwargs["disabled_optimizers"] if "disabled_optimizers" in kwargs else None
    343 try:
--> 344     self._create_inference_session(providers, provider_options, disabled_optimizers)
    345 except ValueError:
    346     if self._enable_fallback:

File ~/dheeraj/onnx_experiments/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py:381, in InferenceSession._create_inference_session(self, providers, provider_options, disabled_optimizers)
    379 session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    380 if self._model_path:
--> 381     sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
    382 else:
    383     sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)

InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from ./test-dec-if/model_fp16.onnx failed:This is an invalid model. In Node, ("", If, "", -1) : ("enable_cache": tensor(bool),) -> ("logits": tensor(float16),"present.0.decoder.key": tensor(float16),"present.0.decoder.value": tensor(float16),"present.0.encoder.key": tensor(float16),"present.0.encoder.value": tensor(float16),"present.1.decoder.key": tensor(float16),"present.1.decoder.value": tensor(float16),"present.1.encoder.key": tensor(float16),"present.1.encoder.value": tensor(float16),"present.2.decoder.key": tensor(float16),"present.2.decoder.value": tensor(float16),"present.2.encoder.key": tensor(float16),"present.2.encoder.value": tensor(float16),"present.3.decoder.key": tensor(float16),"present.3.decoder.value": tensor(float16),"present.3.encoder.key": tensor(float16),"present.3.encoder.value": tensor(float16),"present.4.decoder.key": tensor(float16),"present.4.decoder.value": tensor(float16),"present.4.encoder.key": tensor(float16),"present.4.encoder.value": tensor(float16),"present.5.decoder.key": tensor(float16),"present.5.decoder.value": tensor(float16),"present.5.encoder.key": tensor(float16),"present.5.encoder.value": tensor(float16),) , Error Nodes in a graph must be topologically sorted, however input 'encoder_hidden_states' of node: 
name: MatMul_173_input_cast0 OpType: Cast
 is not output of any previous nodes.

Hey team, I set up all the libraries needed to convert the T5 model to ONNX, and the conversion ran successfully. Thanks for the excellent code. Now I am trying to convert other encoder-decoder architectures using the same notebook. With some changes to the T5 code I was able to export all three MarianMT models (encoder, decoder, decoder with cache) to ONNX, and merging the cache and no-cache decoders also succeeded, but loading the merged model for inference gives me the error above. I need your help in solving this issue. Please take a look. PS: I am able to load the encoder, the decoder, and the decoder with cache separately. Let me know if you need any more information.
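For anyone debugging the same failure: the invalid graph can be confirmed offline with the ONNX checker, and re-sorting the merged graph, including the If node's subgraphs where the offending Cast node lives, may work around it. A minimal sketch, assuming onnx and onnx-graphsurgeon are installed; the input path comes from the traceback above and the sorted-output filename is a placeholder:

import onnx
import onnx_graphsurgeon as gs

path = "./test-dec-if/model_fp16.onnx"

# The offline checker should flag the same ordering problem as ONNX
# Runtime, without having to build an InferenceSession.
try:
    onnx.checker.check_model(onnx.load(path))
except onnx.checker.ValidationError as e:
    print(f"checker confirms the invalid graph: {e}")

graph = gs.import_onnx(onnx.load(path))

def toposort_recursive(g: gs.Graph) -> None:
    # fp16 conversion inserts Cast nodes; if one lands out of order
    # inside an If branch, sorting every (sub)graph may restore validity.
    g.toposort()
    for node in g.nodes:
        for attr in node.attrs.values():
            if isinstance(attr, gs.Graph):
                toposort_recursive(attr)

toposort_recursive(graph)
onnx.save(gs.export_onnx(graph), "./test-dec-if/model_fp16_sorted.onnx")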

pommedeterresautee commented 2 years ago

Do you have a specific model in mind? We have updated the project code and addressed a similar issue. It would help if you could share reproducible code.
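A reproducible snippet would roughly look like the following; the checkpoint name is a placeholder, the export/merge steps are the ones from the T5 demo notebook, and the path and loading call are taken from the traceback above:

from transformers import MarianMTModel
from transformer_deploy.backends.ort_utils import create_model_for_provider

# Placeholder checkpoint; any MarianMT model should do once exported
# and merged with the T5 notebook recipe.
pytorch_model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-de").eval()

# ... export encoder, decoder and decoder-with-cache to ONNX, merge the
# two decoders into a single If-based graph, and convert it to fp16 ...
dec_if_fp16_model_path = "./test-dec-if/model_fp16.onnx"

# This call raises INVALID_GRAPH on the merged fp16 decoder:
dec_onnx = create_model_for_provider(
    dec_if_fp16_model_path, "CUDAExecutionProvider", log_severity=3
)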