neuralmagic / sparseml

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Apache License 2.0

MT5 model conversion with sparseml #775

Closed OriAlpha closed 1 year ago

OriAlpha commented 2 years ago

Hello, I am trying to convert an mT5 model to ONNX Runtime, but I can see that the model is not supported. Is there any workaround? Error: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForSequenceClassification.

markurtz commented 2 years ago

Hi @OriAlpha, thanks for opening this, and we're looking into it. Putting what we talked over in Slack here for the record:

One note is that if you installed more than a few weeks ago, be sure to upgrade to the latest flows as we included an upgrade to transformers that may fix this. Separately, we'll look into this from our side to see why the T5 models are not loading.

If the upgrade does not work, you can try editing the following script to sub in your model and tokenizer to enable the export to work: https://github.com/neuralmagic/sparseml/blob/main/src/sparseml/transformers/export.py#L159

bfineran commented 2 years ago

Hi @OriAlpha, this error occurs because T5 models cannot be loaded through transformers.AutoModelForSequenceClassification. Support for generation models via the transformers auto model classes has not yet been added to sparseml.

As @markurtz suggested, you can try editing the script he linked to load from a generation auto model instead (see the sketch below). As a prototype, the generic transformers.AutoModel may also work, but task-specific layers may not be included. Happy to provide any code pointers necessary.
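
For illustration, a rough sketch of that kind of edit; the checkpoint path and the use of AutoModelForSeq2SeqLM are assumptions for the example, not part of the current export script:

from transformers import AutoModel, AutoModelForSeq2SeqLM, AutoTokenizer

model_path = "google/mt5-small"  # hypothetical path; substitute your trained mT5 checkpoint

# Load through a generation (seq2seq) auto class instead of
# AutoModelForSequenceClassification, which T5/mT5 configs do not support.
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Prototype fallback: the generic AutoModel loads the base encoder-decoder,
# but task-specific heads may be missing.
# model = AutoModel.from_pretrained(model_path)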

OriAlpha commented 2 years ago

I managed to load the model with AutoModel.from_pretrained(pretrained_model_name_or_path=model_path) and it works fine. Now, when trying to export the model to ONNX, it fails at this line: https://github.com/neuralmagic/sparseml/blob/main/src/sparseml/transformers/export.py#L225 with the error: ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

bfineran commented 2 years ago

The sample inputs provided to the ONNX export trace must include all of the inputs required to run the T5 model. It looks like decoder_input_ids is not provided by the tokenizer; you can try adding it to the inputs dict as a dummy tensor of ones or zeros to make the trace go through.

OriAlpha commented 2 years ago

If possible, can you provide an example?

bfineran commented 2 years ago

Sure, after inputs are defined in export.py, you can add:

import torch
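# dummy decoder input ids (all ones) so the export trace can run the full encoder-decoder forward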
inputs["decoder_input_ids"] = torch.ones(1, sequence_length, dtype=torch.long)

OriAlpha commented 2 years ago

After adding decoder_input_ids, I hit the following error: exporting model exceeds maximum protobuf size of 2GB, please call torch.onnx.export with use_external_data_format=True. I added use_external_data_format=True at this line: https://github.com/neuralmagic/sparseml/blob/main/src/sparseml/pytorch/utils/exporter.py#L473.

But it still fails with: ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 3894105618

OriAlpha commented 2 years ago

I looked into it further and was able to convert the model with the following change at https://github.com/neuralmagic/sparseml/blob/main/src/sparseml/pytorch/utils/exporter.py#L495, adding: onnx.save(onnx_model, file_path, save_as_external_data=True). You could look into adding this in a future release.
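
For reference, the change looks roughly like this; the all_tensors_to_one_file and location arguments are illustrative defaults, not part of the change described above:

import onnx

# onnx_model and file_path come from the surrounding exporter code.
# Writing tensors to external files keeps the ModelProto itself under the
# 2GB protobuf limit; the data files are placed next to the model file.
onnx.save(
    onnx_model,
    file_path,
    save_as_external_data=True,
    all_tensors_to_one_file=True,  # one .data file instead of one file per tensor
    location="model.data",         # file name relative to the model's directory
)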

But I have one question: after conversion, it creates multiple binary files. Does ONNX Runtime automatically read the data from those files during inference?

bfineran commented 1 year ago

Hi @OriAlpha, support for external data saving is on our near-term roadmap. The external data format saves model parameters as separate binary files in the same directory as the model protobuf. For onnxruntime, you can just pass the path to the model protobuf; deepsparse will require additional feature support, which is also on the near-term roadmap.
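
For illustration, a minimal sketch of running the exported model with onnxruntime; the file names and input shapes are hypothetical, and onnxruntime resolves the external data files relative to the model protobuf automatically:

import numpy as np
import onnxruntime as ort

# The external-data binary files must stay in the same directory as model.onnx;
# onnxruntime loads them transparently when given the protobuf path.
session = ort.InferenceSession("deployment/model.onnx")

# Hypothetical input names and shapes; check session.get_inputs() for the real ones.
feed = {
    "input_ids": np.ones((1, 128), dtype=np.int64),
    "attention_mask": np.ones((1, 128), dtype=np.int64),
    "decoder_input_ids": np.ones((1, 128), dtype=np.int64),
}
outputs = session.run(None, feed)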