Closed chaodreaming closed 2 years ago
Also reproduced on mac
Fast fix: run with --atol 1e-4
With --atol 1e-4 the export runs, but the result seems incorrect.
I don't think that is a big difference for such a large network. Your outputs contain values much greater than 1e-4, so the problem is probably somewhere else.
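To make the tolerance argument concrete, here is a minimal sketch of an absolute-tolerance comparison like the one the error message reports. The arrays and the 1e-5 tolerance are made-up illustration values, and `max_abs_diff` and `within_atol` are hypothetical helper names, not functions from transformers.

```python
import numpy as np


def max_abs_diff(reference, exported):
    """Largest element-wise absolute difference between two output arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    exported = np.asarray(exported, dtype=np.float64)
    return float(np.max(np.abs(reference - exported)))


def within_atol(reference, exported, atol):
    """True if every element differs by at most `atol`."""
    return max_abs_diff(reference, exported) <= atol


# Illustration only: synthetic "reference" and "exported" outputs whose
# largest deviation is about 2.8e-05, similar in size to the reported one.
reference = np.array([12.5, -3.25, 0.75])
exported = reference + np.array([0.0, 2.8e-05, -1.0e-05])

print(max_abs_diff(reference, exported))
print(within_atol(reference, exported, atol=1e-5))  # assumed strict tolerance
print(within_atol(reference, exported, atol=1e-4))  # tolerance from --atol 1e-4
```

A deviation of this size is tiny relative to outputs whose magnitude is around 1 or more, which is why relaxing the tolerance is unlikely to explain a wrong translation.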
After exporting, the data dimension is incorrect. Can you give me a code? Thank you very much
Can you give me a code?
All code is here:)
Why do you think the problem is in the dimensions? They are really checked here
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers.models.marian import MarianOnnxConfig
import onnxruntime as ort  # needed for ort.InferenceSession below

model_ckpt = "Helsinki-NLP/opus-mt-en-de"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
ref_model = AutoModelForSeq2SeqLM.from_pretrained(model_ckpt)

feature = "seq2seq-lm"
onnx_path = f"onnx/{model_ckpt}-{feature}/"

!python -m transformers.onnx --model={model_ckpt} --atol=1e-4 --feature={feature} {onnx_path}

batch_size = 4
encoder_inputs = tokenizer(
    ["Studies have been shown that owning a dog is good for you"] * batch_size,
    return_tensors="np",
)
decoder_inputs = tokenizer(
    ["Studien haben gezeigt dass es hilfreich ist einen Hund zu besitzen"]

ort_session = ort.InferenceSession(f"{onnx_path}model.onnx")
onnx_config = MarianOnnxConfig(ref_model.config, task=feature)
onnx_named_outputs = list(onnx_config.outputs.keys())
onnx_outputs = ort_session.run(onnx_named_outputs, all_inputs)
How do I get the results?
So, what exactly is your problem here?
How do I get the text?
The code is not exactly the same, but the problem is the same. The other is Marian
The result is a four-dimensional tensor, but no matter how it is processed, it cannot be decoded to get the correct translation, so I think the result is incorrect
4 dims = number_of_outputs x batch_size x n_words x embedding
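A small sketch of where the fourth dimension comes from, using placeholder arrays rather than a real model. It assumes the session returns a list with one array per requested output; the sizes below are arbitrary illustration values. If the requested output is logits, the last axis would be the vocabulary size rather than an embedding size.

```python
import numpy as np

# Arbitrary illustration sizes, not taken from the actual model.
batch_size, n_tokens, last_dim = 4, 7, 16

# Stand-in for the list returned when one output name is requested:
# one array of shape (batch_size, n_tokens, last_dim).
outputs = [np.zeros((batch_size, n_tokens, last_dim), dtype=np.float32)]

# Wrapping the list in an array adds a leading "number of outputs" axis.
stacked = np.array(outputs)
print(stacked.shape)

# Indexing the list gives back the 3-D tensor for a single output.
first_output = outputs[0]
print(first_output.shape)
```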
The dimensions can be worked around, but the decoder input is the real problem. This code obviously knows the result in advance, and when I translate I cannot know the result in advance.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from onnxruntime import InferenceSession

tokenizer = AutoTokenizer.from_pretrained("opus-mt-en-zh")
session = InferenceSession("opus-mt-en-zh-onnx-301/model.onnx")
inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="pt")
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
Traceback (most recent call last):
File "
@CatchDr You did not provide decoder inputs, hence the error message. Have you tried to do what is suggested in #18518?
He and I actually have the same problem, and that suggestion did not solve it for him either.
The failing code snippet you shared does not do what is suggested in #18518. Could you try the following?
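import numpy as np
from transformers import AutoTokenizer
from onnxruntime import InferenceSession

tokenizer = AutoTokenizer.from_pretrained("opus-mt-en-zh")
session = InferenceSession("opus-mt-en-zh-onnx-301/model.onnx")
# ONNX Runtime takes NumPy arrays as inputs, so ask the tokenizer for NumPy tensors
inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
# Cast to int64 in case the platform's default NumPy integer type is narrower
inputs = {name: np.asarray(value, dtype=np.int64) for name, value in inputs.items()}
# Decoder inputs are 2-D: (batch_size, decoder_sequence_length)
inputs["decoder_input_ids"] = np.array([[0]], dtype=np.int64)
inputs["decoder_attention_mask"] = np.array([[1]], dtype=np.int64)
outputs = session.run(output_names=["last_hidden_state"], input_feed=inputs)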
Please wait, about 10 minutes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers.models.marian import MarianOnnxConfig
import onnxruntime as ort

model_ckpt = "Helsinki-NLP/opus-mt-en-zh"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
ref_model = AutoModelForSeq2SeqLM.from_pretrained(model_ckpt)

feature = "seq2seq-lm"
onnx_path = f"onnx/{model_ckpt}-{feature}/"

!python -m transformers.onnx --model={model_ckpt} --atol=1e-4 --feature={feature} {onnx_path}

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from onnxruntime import InferenceSession

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
session = InferenceSession("onnx/Helsinki-NLP/opus-mt-en-zh/model.onnx")
inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="pt")
inputs["decoder_input_ids"] = torch.tensor([0], dtype=torch.long)
inputs["decoder_attention_mask"] = torch.tensor([1], dtype=torch.long)
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
outputs

I am very sorry. As a beginner, it took me several attempts to describe this clearly. Here is all the code; please run it.
I tried to make some changes, but the dimensions seem to be incorrect again
@CatchDr The result you get is correct. Some post-processing is necessary to generate the whole sentence. If you just want to convert your model to the ONNX format and translate sentences, I suggest you use Optimum. It will do all the generation work for you. For example:
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM
model = ORTModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh", from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
onnx_translation = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer)
result = onnx_translation("Using DistilBERT with ONNX Runtime!")
I would like to know how to post-process to get the right result, because I have tried this solution you mentioned and the result is very poor
text="Vehicle detection technology is of great significance for realizing automatic monitoring and AI-assisted driving systems. The state-of-the-art object detection method, namely, a class of YOLOv5, has often been used to detect vehicles. However, it suffers some challenges, such as a high computational load and undesirable detection rate. To address these issues, an improved lightweight YOLOv5 method is proposed for vehicle detection in this paper. In the presented method, C3Ghost and Ghost modules are introduced into the YOLOv5 neck network to reduce the floating-point operations (FLOPs) in the feature channel fusion process and enhance the feature expression performance. A convolutional block attention module (CBAM) is introduced to the YOLOv5 backbone network to select the information critical to the vehicle detection task and suppress uncritical information, thus improving the detection accuracy of the algorithm. Furthermore, CIoU_Loss is considered the bounding box regression loss function to accelerate the bounding box regression rate and improve the localization accuracy of the algorithm. To verify the performance of the proposed approach, we tested our model via two case studies, i.e., the PASCAL VOC dataset and MS COCO dataset. The results show that the detection precision of the proposed model increased 3.2%, the FLOPs decreased 15.24%, and the number of model parameters decreased 19.37% compared with those of the existing YOLOv5. Through case studies and comparisons, the effectiveness and superiority of the presented approach are demonstrated." You can try to translate this text for comparison, the result is very poor
@CatchDr You can take a look at this example and change the arguments of the generate method if you want to decode your outputs in a different way (see here for the possible decoding strategies). But maybe this model is simply not good enough for what you are trying to achieve.
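For readers wondering what the post-processing amounts to, here is a rough sketch of greedy decoding, the simplest decoding strategy. It is not the Optimum or transformers implementation. `greedy_decode` and `toy_step` are hypothetical names, and `toy_step` is a toy stand-in for a real model call. With an exported model, the step function would presumably run the ONNX session on the encoder inputs plus the decoder ids generated so far and return the logits, and the resulting ids would then go through the tokenizer's decode method.

```python
import numpy as np


def greedy_decode(step_fn, start_id, eos_id, max_new_tokens=20):
    """Generate token ids one at a time, always taking the most likely token.

    step_fn(decoder_ids) must return logits of shape
    (len(decoder_ids), vocab_size) for the decoder ids produced so far.
    """
    decoder_ids = [start_id]
    for _ in range(max_new_tokens):
        logits = step_fn(np.array(decoder_ids, dtype=np.int64))
        # Only the logits at the last position predict the next token.
        next_id = int(np.argmax(logits[-1]))
        decoder_ids.append(next_id)
        if next_id == eos_id:
            break
    # Drop the start token; it is an input, not part of the output.
    return decoder_ids[1:]


VOCAB_SIZE = 6


def toy_step(decoder_ids):
    """Toy stand-in for a model: always predicts (previous token + 1)."""
    logits = np.zeros((len(decoder_ids), VOCAB_SIZE))
    for position, token in enumerate(decoder_ids):
        logits[position, (token + 1) % VOCAB_SIZE] = 1.0
    return logits


tokens = greedy_decode(toy_step, start_id=0, eos_id=4)
print(tokens)
```

This also shows why the decoder input does not need to be known in advance: it starts from a single start token and grows by one predicted token per step.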
outputs = session.run(output_names=["logits"], input_feed=dict(inputs))
There seems to be an error somewhere in this call.
I've tried every decoding method I can think of and can't get the results I want. The output is completely different from what I expect, so I'm asking for help here.
@CatchDr The result you get is correct. Some post-processing is necessary to generate the whole sentence. If you just want to convert your model to the ONNX format and translate sentences, I suggest you use Optimum. It will do all the generation work for you. For example:
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM
model = ORTModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-zh", from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
onnx_translation = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer)
result = onnx_translation("Using DistilBERT with ONNX Runtime!")
Hi, I tried this code and it does not seem to work.
Edit: It does work. I forgot to print the result.
@sgugger How do I convert an opus model to ONNX?
Never mind, I got it from the comments above.
I have a problem running inference after the conversion:
model = ORTModelForSeq2SeqLM.from_pretrained("/kaggle/working/onnx/Helsinki-NLP/opus-mt-en-zh-seq2seq-lm")
It seems it cannot find model.onnx. After the conversion there are two ONNX files, decoder_model.onnx and encoder_model.onnx.
How can I fix this?
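One quick way to confirm what the exporter actually wrote is to list the .onnx files in the export directory. This is only a diagnostic sketch using the standard library. `list_onnx_files` is a hypothetical helper, and the temporary directory below merely imitates the two-file layout described above.

```python
from pathlib import Path
import tempfile


def list_onnx_files(export_dir):
    """Return the sorted names of all .onnx files directly inside export_dir."""
    return sorted(path.name for path in Path(export_dir).glob("*.onnx"))


# Imitate an export directory that contains a split encoder/decoder export.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("encoder_model.onnx", "decoder_model.onnx"):
        (Path(tmp) / name).touch()
    found = list_onnx_files(tmp)
    has_single_graph = "model.onnx" in found

print(found)
print(has_single_graph)
```

If only the split files are present, code that expects a single model.onnx will not find it, which matches the error described above.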
System Info
transformers: 4.22.2, Python: 3.8.4, OS: Windows 10
raise ValueError(
ValueError: Outputs values doesn't match between reference model and ONNX exported model: Got max absolute difference of: 2.8133392333984375e-05
Who can help?
No response
Reproduction
python -m transformers.onnx --model=Helsinki-NLP/opus-mt-en-zh onnx/
Expected behavior
Export to ONNX and translate with the ONNX model: https://www.kaggle.com/code/catchlife/translate-opt
The custom export produces incorrect translation results.