microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Optimize Albert HuggingFace model #10304

Closed danielbellhv closed 2 years ago

danielbellhv commented 2 years ago

Based on a Stack Overflow post.

Goal: Amend this notebook to work with the albert-base-v2 model.

Kernel: conda_pytorch_p36.

Section 2.1 exports the finalised model. It also uses a BERT-specific function, and I cannot find an equivalent for Albert.

I've successfully implemented alternatives for Albert up until this section.

Code:

# optimize transformer-based models with onnxruntime-tools
from onnxruntime_tools import optimizer
from onnxruntime_tools.transformers.onnx_model_bert import BertOptimizationOptions

# disable embedding layer norm optimization for better model size reduction
opt_options = BertOptimizationOptions('bert')
opt_options.enable_embed_layer_norm = False
...

Do functions for optimizing and quantizing an Albert model exist?

Update: You can run quantization in the notebook without running optimization first. Just remove '.opt.' from the filenames in the code; it only marks files produced by the optimization step.
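A rough sketch of that update, in case it helps someone else. It assumes the notebook quantizes with `quantize_dynamic` from `onnxruntime.quantization`. The helper names and the example filenames are hypothetical, not taken from the notebook.

```python
def unoptimized_name(filename: str) -> str:
    """Drop the '.opt.' marker used for optimized model filenames."""
    return filename.replace(".opt.", ".")


def quantize_without_optimizing(onnx_path: str, quantized_path: str) -> None:
    """Quantize an exported ONNX model directly, skipping the optimizer step."""
    # Imported lazily so the filename helper works without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    # Dynamic quantization of the weights to signed 8-bit integers.
    quantize_dynamic(onnx_path, quantized_path, weight_type=QuantType.QInt8)


# Hypothetical filenames, shown only to illustrate the renaming.
source_path = unoptimized_name("albert.opt.onnx")
target_path = unoptimized_name("albert.opt.quant.onnx")
```

With these names, `quantize_without_optimizing(source_path, target_path)` would read the plain exported model and write the quantized one, provided the exported file exists on disk.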

danielbellhv commented 2 years ago

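You can train any PyTorch model with an optimizer from torch_optimizer. Note that these are training optimizers (they update the weights); they do not perform the ONNX graph optimization that the notebook does.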

Installation:

pip install torch_optimizer

Implementation:

import torch_optimizer as optim

# model = ...
optimizer = optim.DiffGrad(model.parameters(), lr=0.001)
# call only after loss.backward() has computed the gradients
optimizer.step()

Source

Saving:

torch.save(model.state_dict(), PATH)

Source
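For completeness, here is a minimal end-to-end sketch of the two snippets above: one training step followed by saving and restoring the weights. The tiny `nn.Linear` model, the random batch, and the file path are hypothetical stand-ins for a real model and dataset. If torch_optimizer is not installed, the sketch falls back to `torch.optim.Adam`, which is a different optimizer from DiffGrad.

```python
import os
import tempfile

import torch
from torch import nn

try:
    # Third-party package from the comment above (pip install torch_optimizer).
    import torch_optimizer as optim
    optimizer_cls = optim.DiffGrad
except ImportError:
    # Fallback so the sketch still runs; Adam is not the same algorithm as DiffGrad.
    optimizer_cls = torch.optim.Adam

torch.manual_seed(0)

# Hypothetical stand-in for the real model.
model = nn.Linear(4, 2)
optimizer = optimizer_cls(model.parameters(), lr=0.001)

# Hypothetical toy batch: 8 samples, 4 features, 2 classes.
inputs = torch.randn(8, 4)
targets = torch.randint(0, 2, (8,))

weights_before = model.weight.detach().clone()

optimizer.zero_grad()  # clear gradients left over from any previous step
loss = nn.functional.cross_entropy(model(inputs), targets)
loss.backward()        # compute gradients for every parameter
optimizer.step()       # apply the update using those gradients

# Save only the learned parameters, then load them into a fresh model.
path = os.path.join(tempfile.mkdtemp(), "model_state.pt")  # hypothetical path
torch.save(model.state_dict(), path)

restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
```

Saving the `state_dict` rather than the whole model object keeps the file independent of the class definition's import path, at the cost of having to construct the model yourself before loading.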