cc @lewtun @michaelbenayoun
Hi @farzanehnakhaee70 and thank you for raising this issue!
FYI we recently merged a major overhaul of the ONNX export for BART in #14700 which we've tested for various topologies / tasks, e.g. this works:
```bash
# Install from source with extra ONNX dependencies
pip install 'git+https://github.com/huggingface/transformers#egg=transformers[onnx]'

# Export model with default features (i.e. just `BartModel`)
python -m transformers.onnx --model=valhalla/distilbart-mnli-12-1 onnx/
```
Does installing from `master` solve your problem? If not, can you please provide the exact command you are using to export the model?
Thanks a lot for looking into this! I can see the major changes to the configuration, which greatly improve usability for other tasks. However, this error will not go away without changing the code, as I mentioned.
The main issue is that, although the conversion script declares `dynamic_axes`, the broadcasting inside this function causes its output to be fixed to the batch size of the dummy input. As a result, running the converted model with a batch size different from the dummy input's batch size raises this error.
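For reference, the function I am referring to looks roughly like this in `modeling_bart.py` (paraphrased; exact details may vary between versions):

```python
import torch

def shift_tokens_right(input_ids: torch.Tensor, pad_token_id: int, decoder_start_token_id: int):
    """Shift input ids one token to the right to build decoder inputs."""
    shifted_input_ids = input_ids.new_zeros(input_ids.shape)
    shifted_input_ids[:, 1:] = input_ids[:, :-1].clone()
    # This scalar assignment broadcasts over the batch dimension; it appears to be
    # the broadcasting step that gets traced with a fixed batch size during export.
    shifted_input_ids[:, 0] = decoder_start_token_id
    # Replace possible -100 values (used for ignored labels) with the pad token
    shifted_input_ids.masked_fill_(shifted_input_ids == -100, pad_token_id)
    return shifted_input_ids
```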
Thank you for the extra context about the batch size :)
However, I am not able to reproduce the problem you reported. For example, suppose we export the model using the command I used in my previous comment:
```bash
# Export model with default features (i.e. just `BartModel`)
python -m transformers.onnx --model=valhalla/distilbart-mnli-12-1 onnx/
```
We can then load this model into an ONNX Runtime `InferenceSession` as follows:
```python
import onnxruntime as ort
from transformers import AutoTokenizer

model_ckpt = "valhalla/distilbart-mnli-12-1"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)

bs = 16  # batch size
ort_session = ort.InferenceSession("onnx/model.onnx")
onnx_named_outputs = ["last_hidden_state"]

inputs = tokenizer(["Hello, my name is Lewis"] * bs, return_tensors="np")
decoder_inputs = tokenizer(["Hello"] * bs, return_tensors="np")
all_inputs = {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
    "decoder_input_ids": decoder_inputs["input_ids"],
    "decoder_attention_mask": decoder_inputs["attention_mask"],
}
onnx_outputs = ort_session.run(onnx_named_outputs, all_inputs)
```
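As a quick sanity check, the leading dimension of the returned array matches the batch size we passed in rather than the batch size of the dummy inputs used at export time:

```python
# onnx_outputs[0] is `last_hidden_state`; its first axis should equal bs (16 here)
print(onnx_outputs[0].shape)
```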
This runs without error using the source install of `transformers`. For comparison, we can find the batch size used for the dummy inputs during the conversion as follows:
```python
from transformers import TensorType
from transformers.models.bart import BartConfig, BartOnnxConfig

config = BartConfig.from_pretrained(model_ckpt)
onnx_config = BartOnnxConfig(config)
dummy_inputs = onnx_config.generate_dummy_inputs(tokenizer, framework=TensorType.NUMPY)

# Returns (batch_size, seq_len) = (2, 8)
dummy_inputs["input_ids"].shape
```
So you can see that the dummy inputs have a batch size of 2, while the inference example I created uses a batch size of 16.
Could you please share a minimal reproducible example with the problem you're facing (e.g. a Colab notebook)?
Thanks a lot for looking into this so thoroughly.
I converted a model for the sequence-classification task, and it does not have `decoder_input_ids` or `decoder_attention_mask` as inputs. The only inputs are `input_ids` and `attention_mask`, as shown by Netron.
If those inputs were available to the model, there would be no problem, because the `shift_tokens_right` function would no longer be used.
Could you please tell me how to convert my model so that these two inputs are also defined (the same as what you have done)?
Ah, now I am able to reproduce the problem - the missing step was to explicitly specify the `sequence-classification` feature.
For example, the following fails:
```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Export the model with the `sequence-classification` topology
model_ckpt = "valhalla/distilbart-mnli-12-1"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
onnx_path = "onnx/bart-large-clf/"
!python -m transformers.onnx --model={model_ckpt} --feature="sequence-classification" {onnx_path}

# Run with ONNX Runtime
ort_session = ort.InferenceSession(f"{onnx_path}model.onnx")
# Note we have `logits` for sequence classification heads
onnx_named_outputs = ["logits"]

# This works because the dummy inputs have batch_size=2
inputs = tokenizer(["I loved this movie!"] * 2, return_tensors="np")
onnx_outputs = ort_session.run(onnx_named_outputs, dict(inputs))

# This fails - stack trace below
inputs = tokenizer(["I loved this movie!"] * 3, return_tensors="np")
onnx_outputs = ort_session.run(onnx_named_outputs, dict(inputs))
```
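If it helps with debugging on your side, you can also inspect the graph inputs of the exported file to check whether the batch axis was exported as dynamic (a small sketch using the `onnx` package, pointing at the export path from above):

```python
import onnx

onnx_model = onnx.load("onnx/bart-large-clf/model.onnx")
for graph_input in onnx_model.graph.input:
    # dim_param holds a symbolic name for dynamic axes; dim_value holds a fixed size
    dims = [d.dim_param or d.dim_value for d in graph_input.type.tensor_type.shape.dim]
    print(graph_input.name, dims)
```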
And great detective work in figuring out that `shift_tokens_right()` was the source of the problem! I think your proposal makes sense, and I was able to verify that including your change fixes the problem with the export.
What do you think @michaelbenayoun? If there are no negative consequences to changing `shift_tokens_right()`, my suggestion is to ask @farzanehnakhaee70 to open a PR to fix the issue.
Great. If there is anything I can help with from my side, I would be happy to do it.
Hi @farzanehnakhaee70, @lewtun, Great catch @farzanehnakhaee70 !! I would say that if you have a working solution you can definitely open a PR!
Hi @farzanehnakhaee70, before we open a PR, can you please share your environment details by running the command `transformers-cli env` and copy-and-pasting its output here? I'd like to know which version of `transformers` this affects, the type of OS, etc.
Hi @lewtun Sorry for the delay. Here it is:
- `transformers` version: 4.15.0
- Platform: Linux-4.15.0-154-generic-x86_64-with-glibc2.29
- Python version: 3.8.7
- PyTorch version (GPU?): 1.10.1+cu102 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
Thanks for sharing your environment @farzanehnakhaee70!
I did a fresh install with `pip install transformers[onnxruntime]==4.15` and found that I am no longer able to reproduce the error (here's a Colab notebook if you want to verify). This suggests that the error I saw (and possibly in your case too) was a symptom of a problematic environment.
Would you mind doing a fresh install or providing a Colab notebook that reproduces the error? I'd like to be certain that the error is reproducible before we make any changes to the `transformers` codebase. Thank you!
Sure.
Hi, really sorry for the late response. Today I went to test the model again, but during the test this error occurred:
```
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 22, in <module>
    from .features import FeaturesManager
  File "/usr/lib/python3.8/site-packages/transformers/onnx/features.py", line 71, in <module>
    class FeaturesManager:
  File "/usr/lib/python3.8/site-packages/transformers/onnx/features.py", line 273, in FeaturesManager
    def get_model_from_feature(feature: str, model: str) -> PreTrainedModel:
NameError: name 'PreTrainedModel' is not defined
```
Do you also face this issue?
Hi @farzanehnakhaee70 I am unfortunately not able to reproduce your error - by the looks of it, it could be a problem with your environment. Did you run a fresh install in a clean virtual env with the command I shared above?
Thanks for your reply @lewtun. I installed it inside a fresh container with the command you provided. I will test it once more and let you know the outcome.
Hi @lewtun, I tested it once more with a fresh install. As you said, there isn't any problem. Thanks a lot for your help.
Thanks for double-checking @farzanehnakhaee70 ! Does this mean we can close this issue?
Hi @lewtun Thanks a lot. For sure.
After converting `distilbart-mnli-12-1` to ONNX, I get an error while testing the ONNX model. After a lot of investigation, I found that the problem lies in the `shift_tokens_right` function in `modeling_bart.py`. After editing that function, the problem was completely solved. The issue is that the ONNX converter does not handle the broadcasting in this function correctly.
Is it possible to edit the repository and merge these changes into yours?