Ki6an / fastT5

⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
Apache License 2.0

the method get_onnx_model() should not need the path to original model #38

Open piegu opened 2 years ago

piegu commented 2 years ago

Hi,

After having obtained our ONNX model thanks to the export_and_get_onnx_model() method, we want to load it in order to use it in production.

There is a method called get_onnx_model() which takes the following arguments: model_name_or_path, onnx_models_path=saved_models_path, and quantized=True.

This is conceptually strange: why do I need to access the original model (via model_name_or_path) when I want to use the ONNX one?

I explored the code and found that model_name_or_path is needed for two things: 1) to get the model name that was used to build the ONNX file names, and 2) to get the model configuration.

To avoid passing the model_name_or_path argument to get_onnx_model(), the export_and_get_onnx_model() method could save the model configuration in the ONNX folder and use a standard model name for the ONNX file names, since we can now (in the latest version of fastt5) customize the ONNX folder path.
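To make the idea concrete, here is a minimal sketch of what export could write into the ONNX folder so that loading later needs only the folder path. The helper names and file layout are hypothetical, not fastT5's actual API; `_name_or_path` is the key Hugging Face configs usually carry for the model name.

```python
import json
from pathlib import Path


def save_onnx_bundle(onnx_dir: str, model_name: str, config: dict) -> None:
    """Hypothetical helper: export_and_get_onnx_model() would call this to
    store the model's config and name next to the exported ONNX files."""
    folder = Path(onnx_dir)
    folder.mkdir(parents=True, exist_ok=True)
    # config.json carries everything get_onnx_model() later needs,
    # so the original model never has to be touched again.
    bundle = {**config, "_name_or_path": model_name}
    (folder / "config.json").write_text(json.dumps(bundle))


def load_onnx_bundle(onnx_dir: str) -> dict:
    """Hypothetical counterpart used by get_onnx_model(): read the config
    back from the ONNX folder instead of from model_name_or_path."""
    return json.loads((Path(onnx_dir) / "config.json").read_text())
```

With this in place, production code would only ever see the ONNX folder.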

What do you think?

Ki6an commented 2 years ago

yes, it is a bit confusing; it should be model_name, not model_name_or_path. I'll make this change in the next update.

the purpose of model_name (the current model_name_or_path) is to select a particular model from a custom folder when you have stored more than one type of model in that folder.
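For context, a shared folder can hold the exported files of several models, and the name prefix is what tells them apart. A rough sketch of that lookup follows; the `<name>-<part>[-quantized].onnx` naming pattern here is an assumption for illustration, not necessarily fastT5's exact scheme:

```python
from pathlib import Path


def get_model_paths(model_name: str, onnx_models_path: str, quantized: bool) -> dict:
    """Build the expected ONNX file paths for one model inside a shared
    folder. The file-name pattern used here is an assumption."""
    suffix = "-quantized" if quantized else ""
    folder = Path(onnx_models_path)
    # a T5 export produces three graphs: encoder, decoder, init-decoder
    return {part: folder / f"{model_name}-{part}{suffix}.onnx"
            for part in ("encoder", "decoder", "init-decoder")}
```

With both t5-small and t5-base exported into the same folder, only the model_name prefix distinguishes their files.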

the export_and_get_onnx_model() method could save the model configuration in the ONNX folder and use a standard model name for the ONNX file names, since we can now (in the latest version of fastt5) customize the ONNX folder path.

can you elaborate more on this.

piegu commented 2 years ago

it should be model_name, not model_name_or_path

I agree. This is exactly my point: what matters for getting back our ONNX model with get_onnx_model() is the model name, not the path to the model.

My second point is that get_onnx_model() calls the class OnnxT5 in step 4, and this class is initialized with the following code:

config = AutoConfig.from_pretrained(
    model_or_model_path, use_auth_token=get_auth_token()
)

I think export_and_get_onnx_model() could save the config file. Otherwise, we need model_or_model_path as an argument of get_onnx_model(), which is exactly what we want to avoid.
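Putting both points together, the loader could then look something like the sketch below. The signature and file layout are hypothetical, assuming export_and_get_onnx_model() saved a config.json into the ONNX folder and that `_name_or_path` (the key Hugging Face configs usually carry) holds the model name:

```python
import json
from pathlib import Path


def get_onnx_model(onnx_models_path: str, quantized: bool = True):
    """Hypothetical revised signature: no model_name_or_path argument.
    Both the config and the model name are recovered from the ONNX folder
    itself, so the original model is never needed at load time."""
    folder = Path(onnx_models_path)
    config = json.loads((folder / "config.json").read_text())
    # the model name travels inside the saved config
    model_name = config["_name_or_path"]
    suffix = "-quantized" if quantized else ""
    # file-name pattern is an assumption, mirroring the export side
    paths = {part: folder / f"{model_name}-{part}{suffix}.onnx"
             for part in ("encoder", "decoder", "init-decoder")}
    return model_name, config, paths
```

Production code would then pass only the folder path, which is the behaviour the issue asks for.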