Open jesusSant opened 1 year ago
Hey, thanks for opening an issue Jesus!
Let me move it to Optimum where I'm sure folks will know how to help you out.
Hi @jesusSant, could you please share the script or code used to generate the ONNX model for the RoBERTa encoder-decoder?
Hello, when I try a roberta2roberta encoder-decoder model with ORTModelForSeq2SeqLM, I get an error like "encoder-decoder is not supported yet". How can I resolve this error?
Hi @burakaytan, could you share a code snippet to reproduce this error please?
Hi @regisss, the EncoderDecoderModel line works properly, but the ORTModelForSeq2SeqLM line gives the error. This is a local model loaded from a local path:
roberta_shared = ORTModelForSeq2SeqLM.from_pretrained('Model/557500', from_transformers=True)
@burakaytan Could you share the complete error message here? It is likely that your custom model doesn't correspond to a model type that is supported in Optimum for ONNX export. You may need to convert it using a custom ONNX config.
@regisss I've uploaded the model to the Hugging Face Hub so you can test it, and I'm sharing the code snippet below. If you uncomment the EncoderDecoderModel line, you can see it working. The ORTModelForSeq2SeqLM line raises the error shown after the snippet.
from transformers import RobertaTokenizerFast
from transformers import EncoderDecoderModel
from optimum.onnxruntime import ORTModelForSeq2SeqLM
tokenizer = RobertaTokenizerFast.from_pretrained('burakaytan/encoder_decoder_test', max_len=128)
tokenizer.bos_token = tokenizer.cls_token
tokenizer.eos_token = tokenizer.sep_token
#roberta_shared = EncoderDecoderModel.from_pretrained('burakaytan/encoder_decoder_test')
roberta_shared = ORTModelForSeq2SeqLM.from_pretrained('burakaytan/encoder_decoder_test', from_transformers=True)
def generate_text(text, num_return=3):
    inputs = tokenizer([text], padding="max_length", truncation=True, max_length=128, return_tensors="pt")
    input_ids = inputs.input_ids  # .to("cuda")
    attention_mask = inputs.attention_mask  # .to("cuda")
    outputs = roberta_shared.generate(
        input_ids,
        attention_mask=attention_mask,
        num_beams=3,
        repetition_penalty=3.0,
        length_penalty=2.0,
        return_dict_in_generate=True,
        output_scores=True,
        num_return_sequences=num_return,
        pad_token_id=2,
    )
    outputs = outputs.get('sequences')
    output_str = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return output_str
print(generate_text('nlp'))
The error:
KeyError: "encoder-decoder is not supported yet. Only {'mobilenet-v2', 'marian', 'unispeech', 'squeezebert', 'roberta', 'groupvit', 'donut-swin', 'mpnet', 'deit', 'convbert', 'wav2vec2-conformer', 'opt', 'levit', 'wavlm', 'pegasus', 'ibert', 'bart', 'data2vec-vision', 'gpt2', 'm2m-100', 'sew-d', 'roformer', 'imagegpt', 'splinter', 'bert', 'speech-to-text', 'convnext', 'lilt', 'mobilebert', 'llama', 't5', 'xlm', 'gptj', 'sew', 'mt5', 'poolformer', 'pix2struct', 'regnet', 'hubert', 'owlvit', 'resnet', 'blenderbot', 'yolos', 'perceiver', 'swin', 'whisper', 'bloom', 'data2vec-text', 'unispeech-sat', 'mobilevit', 'clip', 'longt5', 'deberta', 'audio-spectrogram-transformer', 'vit', 'distilbert', 'nystromformer', 'gpt-neox', 'wav2vec2', 'gpt-neo', 'sam', 'mobilenet-v1', 'beit', 'mbart', 'vision-encoder-decoder', 'electra', 'segformer', 'layoutlmv3', 'data2vec-audio', 'layoutlm', 'blenderbot-small', 'flaubert', 'cvt', 'camembert', 'detr', 'codegen', 'albert', 'xlm-roberta', 'deberta-v2'} are supported. If you want to support encoder-decoder please propose a PR or open up an issue."
@burakaytan EncoderDecoderModel returns a model of type encoder-decoder with many potential encoders and decoders. You could try it with a custom ONNX config as presented here: https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#customize-the-export-of-official-transformers-models
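To picture why the error above appears, here is a minimal sketch of a model-type lookup. It is an illustration only and not Optimum's actual code: the `SUPPORTED_MODEL_TYPES` set (a small subset of the names in the error message), `check_export_support`, and `describe` are all hypothetical names. The point is that the lookup key is the model type of the wrapper (`encoder-decoder`), not the type of the RoBERTa encoder and decoder inside it.

```python
# Hypothetical, simplified registry for illustration only.
# This is NOT Optimum's real data structure; the names are a small subset
# of the model types listed in the KeyError message above.
SUPPORTED_MODEL_TYPES = {"bert", "roberta", "bart", "t5", "vision-encoder-decoder"}


def check_export_support(model_type: str) -> str:
    """Return model_type if it is registered, otherwise raise KeyError."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise KeyError(
            f"{model_type} is not supported yet. "
            f"Only {sorted(SUPPORTED_MODEL_TYPES)} are supported."
        )
    return model_type


def describe(model_type: str) -> str:
    """Summarise whether a model type would need a custom ONNX config."""
    try:
        check_export_support(model_type)
        return f"{model_type}: export config found"
    except KeyError:
        return f"{model_type}: needs a custom ONNX config"


if __name__ == "__main__":
    print(describe("roberta"))
    print(describe("encoder-decoder"))
```

In this sketch a plain `roberta` model passes the check, while the `encoder-decoder` wrapper does not, even though both of its halves are RoBERTa.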
@regisss Encoder-decoder models may come in many variations, but the EncoderDecoderModel class can handle all of them. Doesn't that mean it converts the underlying models to a standard format? Can't the ORTModelForSeq2SeqLM class do the ONNX conversion with the same logic?
I looked at the link you shared, but to convert a roberta2roberta model that way I would need to read and understand the ONNX infrastructure and apply it end to end. There is no readily available method like ORTModelForSeq2SeqLM.
Will there be a quick ONNX solution for EncoderDecoderModel in the future? And what is the fastest way to convert the roberta2roberta model to ONNX format?
Many thanks for your quick replies and directions.
First of all, thank you very much for making our lives easier with the work you do at Hugging Face, congratulations! We have a model based on the encoder-decoder architecture, made up of two RoBERTa models. The model works quite well, but its inference time is quite high (about 400 ms to generate a sentence of about 7 tokens). We would like to reduce that time, and have opted for ONNX and Optimum.
We have managed to export the model to ONNX, generating an encoder_model.onnx, a decoder_model.onnx, and a decoder_with_past_model.onnx. We can load this exported model using ORTModelForSeq2SeqLM.from_pretrained(·), which lets us use the generate(·) method. The problem is that the model in ONNX format is 2 times slower than the original model. We have seen some similar issues: #365, #362. We believe that the exported model is simply in another format, and therefore does not necessarily have to be faster than the base model.
Because of all this, our last step has been to use ORTOptimizer.from_pretrained(·) to apply graph optimization (operator fusion and constant folding) and reduce inference latency. Unfortunately, we have not managed to do so. The ORTOptimizer.from_pretrained(·) method expects, in addition to the ONNX model, a config.json file. We do not have this file, since it is not generated when exporting the model to ONNX. We have made several attempts to generate and save configuration files, even reusing the base model's config.json, but we have not succeeded, even with our fingers crossed 😢 We would appreciate it if someone (@lewtun?) could point us to some lines so we can continue... Thank you very much. Best, Jesus.
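As a quick sanity check before calling the optimizer, the sketch below lists which of the files mentioned in this post are missing from an export directory. It uses only the Python standard library and calls no Optimum API. The file names come from the post itself; the assumption that they all sit in one flat directory, and the helper name `missing_export_files`, are hypothetical.

```python
import tempfile
from pathlib import Path

# File names taken from the post above. That they all live in a single
# flat directory is an assumption made for this sketch.
EXPECTED_FILES = (
    "encoder_model.onnx",
    "decoder_model.onnx",
    "decoder_with_past_model.onnx",
    "config.json",
)


def missing_export_files(export_dir):
    """Return the expected file names that are absent from export_dir."""
    root = Path(export_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Simulate an export that produced the three ONNX files but no config.json.
    with tempfile.TemporaryDirectory() as tmp:
        for name in EXPECTED_FILES[:3]:
            (Path(tmp) / name).touch()
        print(missing_export_files(tmp))
```

An empty list means all four files are present. It says nothing about whether the contents of config.json match what the optimizer expects for this architecture.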