huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

python -m transformers.onnx --model=Helsinki-NLP/opus-mt-en-zh onnx/ #19283

Closed chaodreaming closed 2 years ago

chaodreaming commented 2 years ago

System Info

transformers: 4.22.2, Python: 3.8.4, OS: Windows 10

    raise ValueError(
ValueError: Outputs values doesn't match between reference model and ONNX exported model: Got max absolute difference of: 2.8133392333984375e-05

Who can help?

No response

Information

Tasks

Reproduction

python -m transformers.onnx --model=Helsinki-NLP/opus-mt-en-zh onnx/

Expected behavior

Export to ONNX and translate through ONNX (https://www.kaggle.com/code/catchlife/translate-opt). The custom export produces incorrect translation results.

kventinel commented 2 years ago

Also reproduced on macOS.

kventinel commented 2 years ago

Quick fix: run with --atol 1e-4.
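For example, the same export command as above with the tolerance relaxed:

python -m transformers.onnx --model=Helsinki-NLP/opus-mt-en-zh --atol 1e-4 onnx/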

chaodreaming commented 2 years ago

Running with --atol 1e-4 lets the export complete, but the inference results still seem incorrect.

kventinel commented 2 years ago

I don't think that's a big difference for such a large network. Your outputs contain values much greater than 1e-4, so the problem is probably somewhere else.
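For intuition, here is a self-contained toy check (illustrative numbers only, not the real model outputs) showing why a max deviation around 3e-5 on values of order 1 is ordinary float32 export noise:

import numpy as np

# Two fake "model outputs" that differ only by float32-scale noise.
rng = np.random.default_rng(0)
ref = rng.standard_normal((1, 8, 512)).astype(np.float32)
onnx_out = ref + rng.uniform(-3e-5, 3e-5, ref.shape).astype(np.float32)

print(np.abs(ref - onnx_out).max())           # ~3e-05, like the error above
print(np.allclose(ref, onnx_out, atol=1e-4))  # True: passes with --atol 1e-4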

chaodreaming commented 2 years ago

After exporting, the output dimensions are incorrect. Could you give me some code? Thank you very much.

kventinel commented 2 years ago

Could you give me some code?

All the code is here :)

Why do you think the problem is in the dimensions? They really are checked here.

chaodreaming commented 2 years ago

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers.models.marian import MarianOnnxConfig

model_ckpt = "Helsinki-NLP/opus-mt-en-de"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
ref_model = AutoModelForSeq2SeqLM.from_pretrained(model_ckpt)

# Export model
feature = "seq2seq-lm"
onnx_path = f"onnx/{model_ckpt}-{feature}/"

# Run this from a Jupyter notebook
!python -m transformers.onnx --model={model_ckpt} --atol=1e-4 --feature={feature} {onnx_path}

# Test export with inputs
batch_size = 4
encoder_inputs = tokenizer(
    ["Studies have been shown that owning a dog is good for you"] * batch_size,
    return_tensors="np",
)
decoder_inputs = tokenizer(
    ["Studien haben gezeigt dass es hilfreich ist einen Hund zu besitzen"] * batch_size,
    return_tensors="np",
)

chaodreaming commented 2 years ago

How do I get the results?

kventinel commented 2 years ago

So... what exactly is your problem here?

chaodreaming commented 2 years ago

How do I get the translated text?

chaodreaming commented 2 years ago

The code is not exactly the same, but the problem is the same; the other model is Marian too.

chaodreaming commented 2 years ago

The result is a four-dimensional tensor, but no matter how I process it, I cannot decode it into the correct translation, so I think the result is incorrect.

kventinel commented 2 years ago

4 dims = number_of_outputs x batch_size x n_words x embedding
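As a small illustration (hypothetical session and inputs; session.run returns a plain Python list, one array per requested output name):

outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
print(len(outputs))      # number_of_outputs: 1 here
print(outputs[0].shape)  # (batch_size, n_words, embedding_dim)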

chaodreaming commented 2 years ago

The dimensions I can still work out, but the decoder input baffles me: this code obviously knows the result in advance, and when I translate I cannot know the result in advance.

chaodreaming commented 2 years ago

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from onnxruntime import InferenceSession

tokenizer = AutoTokenizer.from_pretrained("opus-mt-en-zh")
session = InferenceSession("opus-mt-en-zh-onnx-301/model.onnx")
inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="pt")
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))

chaodreaming commented 2 years ago

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xieyouxi/anaconda3/envs/HuggingFace-torch-gpu/lib/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 196, in run
    raise ValueError("Model requires {} inputs. Input Feed contains {}".format(num_required_inputs, num_inputs))
ValueError: Model requires 4 inputs. Input Feed contains 2

chaodreaming commented 2 years ago

https://github.com/huggingface/transformers/issues/18518

regisss commented 2 years ago

@CatchDr You did not provide decoder inputs, hence the error message. Have you tried to do what is suggested in #18518?

chaodreaming commented 2 years ago

He and I actually have the same problem, and his was not solved either.

regisss commented 2 years ago

He and I actually have the same problem, and his was not solved either.

The failing code snippet you shared does not do what is suggested in #18518. Could you try the following?

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from onnxruntime import InferenceSession

tokenizer = AutoTokenizer.from_pretrained("opus-mt-en-zh")
session = InferenceSession("opus-mt-en-zh-onnx-301/model.onnx")
inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="pt")
inputs["decoder_input_ids"] = torch.tensor([0], dtype=torch.long)
inputs["decoder_attention_mask"] = torch.tensor([1], dtype=torch.long)
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))

chaodreaming commented 2 years ago

Please wait, about 10 minutes.

chaodreaming commented 2 years ago

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers.models.marian import MarianOnnxConfig
import onnxruntime as ort

model_ckpt = "Helsinki-NLP/opus-mt-en-zh"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
ref_model = AutoModelForSeq2SeqLM.from_pretrained(model_ckpt)

# Export model
feature = "seq2seq-lm"
onnx_path = f"onnx/{model_ckpt}-{feature}/"

# Run this from a Jupyter notebook
!python -m transformers.onnx --model={model_ckpt} --atol=1e-4 --feature={feature} {onnx_path}

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from onnxruntime import InferenceSession

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
session = InferenceSession("onnx/Helsinki-NLP/opus-mt-en-zh/model.onnx")
inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="pt")
inputs["decoder_input_ids"] = torch.tensor([0], dtype=torch.long)
inputs["decoder_attention_mask"] = torch.tensor([1], dtype=torch.long)
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
outputs

[screenshot of the outputs]

Very sorry; as a beginner it took me several tries to describe this clearly. Here is all the code, you can run it.

chaodreaming commented 2 years ago

[screenshot] I tried to make some changes, but the dimensions seem to be incorrect again.

regisss commented 2 years ago

@CatchDr The result you get is correct. Some post-processing is necessary to generate the whole sentence. If you just want to convert your model to the ONNX format and translate sentences, I suggest you use Optimum. It will do all the generation work for you. For example:

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM

model = ORTModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh", from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

onnx_translation = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer)

result = onnx_translation("Using DistilBERT with ONNX Runtime!")

chaodreaming commented 2 years ago

I would like to know how to post-process the outputs to get the right result, because I have tried the solution you mention and the result is very poor.

chaodreaming commented 2 years ago

text="Vehicle detection technology is of great significance for realizing automatic monitoring and AI-assisted driving systems. The state-of-the-art object detection method, namely, a class of YOLOv5, has often been used to detect vehicles. However, it suffers some challenges, such as a high computational load and undesirable detection rate. To address these issues, an improved lightweight YOLOv5 method is proposed for vehicle detection in this paper. In the presented method, C3Ghost and Ghost modules are introduced into the YOLOv5 neck network to reduce the floating-point operations (FLOPs) in the feature channel fusion process and enhance the feature expression performance. A convolutional block attention module (CBAM) is introduced to the YOLOv5 backbone network to select the information critical to the vehicle detection task and suppress uncritical information, thus improving the detection accuracy of the algorithm. Furthermore, CIoU_Loss is considered the bounding box regression loss function to accelerate the bounding box regression rate and improve the localization accuracy of the algorithm. To verify the performance of the proposed approach, we tested our model via two case studies, i.e., the PASCAL VOC dataset and MS COCO dataset. The results show that the detection precision of the proposed model increased 3.2%, the FLOPs decreased 15.24%, and the number of model parameters decreased 19.37% compared with those of the existing YOLOv5. Through case studies and comparisons, the effectiveness and superiority of the presented approach are demonstrated." You can try to translate this text for comparison, the result is very poor

regisss commented 2 years ago

@CatchDr You can take a look at this example and change the arguments of the generate method if you want to decode your outputs in a different way (see here for the possible decoding strategies). But maybe this model is simply not good enough for what you are trying to achieve.
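For reference, a minimal greedy-decoding sketch against the raw ONNX session might look like the following. This is an untested sketch, assuming the seq2seq-lm export from earlier in the thread (four inputs: input_ids, attention_mask, decoder_input_ids, decoder_attention_mask; one output: logits); the model path is a placeholder.

import numpy as np
from transformers import AutoConfig, AutoTokenizer
from onnxruntime import InferenceSession

model_ckpt = "Helsinki-NLP/opus-mt-en-zh"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
config = AutoConfig.from_pretrained(model_ckpt)
session = InferenceSession("onnx/Helsinki-NLP/opus-mt-en-zh-seq2seq-lm/model.onnx")

inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
# Marian starts decoding from decoder_start_token_id (its pad token).
decoder_ids = np.array([[config.decoder_start_token_id]], dtype=np.int64)

for _ in range(128):  # crude max-length cap
    logits = session.run(
        output_names=["logits"],
        input_feed={
            "input_ids": inputs["input_ids"].astype(np.int64),
            "attention_mask": inputs["attention_mask"].astype(np.int64),
            "decoder_input_ids": decoder_ids,
            "decoder_attention_mask": np.ones_like(decoder_ids),
        },
    )[0]
    # Greedy step: take the most likely next token at the last position.
    next_token = int(logits[0, -1].argmax())
    decoder_ids = np.concatenate(
        [decoder_ids, np.array([[next_token]], dtype=np.int64)], axis=1
    )
    if next_token == config.eos_token_id:
        break

print(tokenizer.decode(decoder_ids[0], skip_special_tokens=True))

Note that generate uses beam search by default for Marian, which usually gives noticeably better translations than this greedy loop; that is another reason to prefer Optimum's ORTModelForSeq2SeqLM for real use.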

chaodreaming commented 2 years ago

outputs = session.run(output_names=["logits"], input_feed=dict(inputs))

There must be an error somewhere around here.

chaodreaming commented 2 years ago

I've tried every decoding method I can think of and can't get the results I want; the output is completely different from what I expect, so I'm asking for help here.

x4080 commented 10 months ago

@CatchDr The result you get is correct. Some post-processing is necessary to generate the whole sentence. If you just want to convert your model to the ONNX format and translate sentences, I suggest you use Optimum. It will do all the generation work for you. For example:

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM

model = ORTModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh", from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

onnx_translation = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer)

result = onnx_translation("Using DistilBERT with ONNX Runtime!")

Hi, I tried this code and it doesn't seem to work?

Edit: it does work after all; I had forgotten to print the result.

x4080 commented 10 months ago

@sgugger How do I convert an OPUS model to ONNX?

Never mind, I got it from the code above.

x4080 commented 10 months ago

I have a problem running inference after the conversion:

model = ORTModelForSeq2SeqLM.from_pretrained("/kaggle/working/onnx/Helsinki-NLP/opus-mt-en-zh-seq2seq-lm")

It seems it cannot find model.onnx? After the conversion there are two ONNX files, decoder_model.onnx and encoder_model.onnx.

How can I fix this?
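One thing to try (an untested sketch; recent Optimum versions save the encoder and decoder as separate ONNX files and resolve them automatically when loading a directory):

from optimum.onnxruntime import ORTModelForSeq2SeqLM

# Point from_pretrained at the directory containing encoder_model.onnx,
# decoder_model.onnx, and config.json; no explicit model.onnx is needed.
model = ORTModelForSeq2SeqLM.from_pretrained(
    "/kaggle/working/onnx/Helsinki-NLP/opus-mt-en-zh-seq2seq-lm"
)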