oborchers closed this issue 3 years ago
Thank you for this excellent report, @oborchers - I'll investigate and report back.
Fixed in https://github.com/huggingface/transformers/pull/9736
But found another problem: https://github.com/huggingface/transformers/issues/9737. Fixed in https://github.com/huggingface/transformers/pull/9738
So you will need both PRs for your task to work in case you want to try before they are merged.
Awesome! Thank you, @stas00! Looking forward to trying it out after the PRs have been merged. Much appreciated.
The problem you reported has been fixed in https://github.com/huggingface/transformers/pull/9736 (merged already)
But then another one popped up in https://github.com/huggingface/transformers/issues/9737
You can just use the https://github.com/huggingface/transformers/pull/9738 branch - since it contains both fixes.
Not sure how quickly it will get merged, since we might want to solve this for other models too. I made only a local fix for fsmt in that PR.
Great, thank you for the fast response and issue handling. I will provide a follow-up on #9738. While export works as intended, there is an issue I encounter when running the following code (built on the 1st example):
from pathlib import Path

import numpy as np
import onnxruntime as rt

# `nlp` (the pipeline) and `opt` (the ONNX Runtime SessionOptions) are assumed
# to be set up exactly as in the 1st example.
sess = rt.InferenceSession(str(Path("encoder/en_de_trans.onnx")), opt)

spans = [
    "My name is Bert",      # Succeeds
    "My name is Bert and",  # Fails
]
for span in spans:
    model_input = nlp.tokenizer.encode_plus(span)
    model_input = {name: np.atleast_2d(value) for name, value in model_input.items()}

    # Reference output from the PyTorch model
    out = nlp.model(**nlp.tokenizer(span, return_tensors="pt"))
    trans_1 = out[0].detach().cpu().numpy()
    trans_2 = out[1].detach().cpu().numpy()

    # Compare against the ONNX Runtime output
    onnx_1, onnx_2 = sess.run(None, model_input)
    assert np.allclose(trans_1, onnx_1, atol=1e-5)
    assert np.allclose(trans_2, onnx_2, atol=1e-5)
"My name is Bert and" will raise:
---------------------------------------------------------------------------
RuntimeException Traceback (most recent call last)
<ipython-input-3-3ef2da9bdd5e> in <module>
10 trans_1 = out[0].detach().cpu().numpy()
11 trans_2 = out[1].detach().cpu().numpy()
---> 12 onnx_1, onnx_2 = sess.run(None, model_input)
13 assert np.allclose(trans_1, onnx_1, atol=1e-5)
14 assert np.allclose(trans_2, onnx_2, atol=1e-5)
~/anaconda3/envs/dev/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py in run(self, output_names, input_feed, run_options)
122 output_names = [output.name for output in self._outputs_meta]
123 try:
--> 124 return self._sess.run(output_names, input_feed, run_options)
125 except C.EPFail as err:
126 if self._enable_fallback:
RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'Reshape_74' Status Message: /data/shared/packages/onnxruntime/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:43 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, std::vector<long int>&) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{1,6}, requested shape:{5}
Solely based on intuition, I'd assume that some dynamic shape was not inferred properly or not passed to the dynamic_axes of torch.onnx.export. But that's just a quick guess. Or did I miss something?
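One quick way to check that hypothesis is to inspect which input axes actually ended up symbolic in the exported graph. A small diagnostic sketch (the path is the one used above; the exact input names depend on what the exporter emitted):

import onnx

m = onnx.load("encoder/en_de_trans.onnx")
for inp in m.graph.input:
    # dim_param is a symbolic (dynamic) axis name, dim_value a fixed size
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)

If the sequence axis shows up as a fixed integer instead of a symbolic name, the graph was frozen to the export-time length, which would explain the {1,6} vs {5} reshape failure.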
I see that I would have to look into / re-implement the generate function, as only the tensors are passed back. I'm going to create a feature suggestion to support the ORT Custom Ops. Perhaps it would be possible to retrieve the actual translated string in the far future, instead of the tensors (or to specify the output).
As promised, follow-up feature request + suggestion under #9784.
Honestly, I don't know much about the ONNX-side of things. I asked @mfuntowicz to hopefully have a look and address this.
Also tagging @LysandreJik and @patrickvonplaten who perhaps may have some answers as well.
I wonder if this is a project-wide issue, e.g. do you have the same problem if you do this with a Bart model? I'm asking since fsmt is Bart with some tweaks.
Also, I think it's best to open a new issue, since we are now dealing with a different problem; that would make it easier to track and monitor.
Thank you for your help, @stas00! I followed your advice and created a new issue.
@oborchers It seems that it is a problem with the PyTorch export of the dynamic_axes. Using the nightly version (torch-1.9.0.dev20210212+cpu) it works.
On the other hand, I am interested in using the ONNX models for generation (translation and summarization). Could you give me some indication of how to do a custom forward pass using the ONNX model, to use in the generation_utils.generate function?
PS: from what you comment in #9784, it seems you plan to work on a user-specific re-implementation. Thanks.
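Not an official answer to the question above, but one rough way to get translations out of an exported graph without generate() is a plain greedy loop. This is only a sketch: it assumes the model was re-exported so that the graph also accepts decoder_input_ids and returns the LM logits as its first output, which the export discussed in this thread does not necessarily do.

import numpy as np
import onnxruntime as rt
from transformers import FSMTTokenizer

tok = FSMTTokenizer.from_pretrained("facebook/wmt19-en-de")
sess = rt.InferenceSession("en_de_trans.onnx")  # hypothetical re-export, see assumptions above

enc = tok.encode_plus("My name is Bert")
input_ids = np.atleast_2d(enc["input_ids"]).astype(np.int64)
attention_mask = np.atleast_2d(enc["attention_mask"]).astype(np.int64)

# fairseq-style models start decoding from the EOS token
decoder_ids = np.array([[tok.eos_token_id]], dtype=np.int64)

for _ in range(64):  # crude greedy decoding: no beam search, no caching
    logits = sess.run(None, {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "decoder_input_ids": decoder_ids,
    })[0]
    next_id = int(logits[0, -1].argmax())
    decoder_ids = np.concatenate([decoder_ids, np.array([[next_id]], dtype=np.int64)], axis=1)
    if next_id == tok.eos_token_id:
        break

print(tok.decode(decoder_ids[0].tolist(), skip_special_tokens=True))

Plugging this into generation_utils.generate properly (beam search, caching) would mean wrapping the session in a model class whose forward returns the expected output objects, which is closer to what #9784 is about.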
Environment info
transformers version: 4.2.2
Who can help
@mfuntowicz (based on initial commit of convert_graph_to_onnx) @stas00 (based on model used here) @thomwolf (based on history)
Information
Model I am using (Bert, XLNet ...): facebook/wmt19-en-de
The problem arises when using:
The tasks I am working on is:
To reproduce
Steps to reproduce the behavior:
Raises:
Subsequently, the cause of the raised error can be boiled down to the shape inference for torch.onnx.export.
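For context, a minimal sketch of how dynamic axes are typically declared when calling torch.onnx.export directly; the input/output names and the opset version are assumptions for illustration, not what convert_graph_to_onnx actually emits:

import torch
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

tok = FSMTTokenizer.from_pretrained("facebook/wmt19-en-de")
model = FSMTForConditionalGeneration.from_pretrained("facebook/wmt19-en-de").eval()
model.config.return_dict = False  # export plain tuples rather than ModelOutput objects
inputs = tok("My name is Bert and", return_tensors="pt")

# Any axis not listed in dynamic_axes is frozen to the size of the example
# input used at export time, which would produce exactly the kind of
# {1,6} vs {5} reshape mismatch reported above.
torch.onnx.export(
    model,
    (inputs["input_ids"], inputs["attention_mask"]),
    "en_de_trans.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["output_0"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "output_0": {0: "batch", 1: "sequence"},
    },
    opset_version=12,
)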
I think that may be due to an incompatibility between tokenizer() and tokenizer.encode() for this very model.
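For illustration of that difference (the ids themselves are not the point here):

from transformers import FSMTTokenizer

tok = FSMTTokenizer.from_pretrained("facebook/wmt19-en-de")

ids = tok.encode("My name is Bert")   # plain list of token ids
batch = tok("My name is Bert")        # dict-like with "input_ids" and "attention_mask"

print(type(ids), type(batch), list(batch.keys()))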
Expected behavior
Model export should work properly.