huggingface/optimum

πŸš€ Accelerate training and inference of πŸ€— Transformers and πŸ€— Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

export `audio-classification` Whisper to TFLite #1672

Open · Gabriel-Kissin opened 9 months ago

Gabriel-Kissin commented 9 months ago

Feature request

Export an audio-classification model based on OpenAI's Whisper to other formats (TFLite, ONNX).

Motivation

After loading OpenAI's Whisper:

import transformers

# n_labels, labels2id, and id2labels are the label count and label/id
# mappings defined for the fine-tuning task.
model = transformers.WhisperForAudioClassification.from_pretrained(
    "openai/whisper-tiny",
    num_labels=n_labels,
    label2id=labels2id,
    id2label=id2labels,
)

and fine-tuning it for audio classification, I'd like to export the saved model.safetensors checkpoint to TFLite.
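For reference, the checkpoint is produced along these lines (the output directory name is just an example):

# After fine-tuning, save_pretrained writes the checkpoint,
# including model.safetensors, to the given directory.
model.save_pretrained("whisper-tiny-audio-cls")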

Ways I've tried:

1) Directly: using the command optimum-cli export tflite --task audio-classification ..., which raises the error:

ValueError: Unrecognized configuration class 
<class 'transformers.models.whisper.configuration_whisper.WhisperConfig'> 
for this kind of AutoModel: TFAutoModelForAudioClassification. 
Model type should be one of Wav2Vec2Config

2) Via ONNX: I've previously had difficulties exporting directly to TFLite, but was able to export to ONNX and then convert to TensorFlow and finally TFLite (see the sketch after the error below). Trying that route this time, however (optimum-cli export onnx --task audio-classification ...), failed with the error:

ValueError: Asked to export a whisper model for the task audio-classification, 
but the Optimum ONNX exporter only supports the tasks 
feature-extraction, feature-extraction-with-past, automatic-speech-recognition, 
automatic-speech-recognition-with-past for whisper. 
Please use a supported task. 
Please open an issue at https://github.com/huggingface/optimum/issues 
if you would like the task audio-classification to be supported 
in the ONNX export for whisper
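For reference, the ONNX -> TensorFlow -> TFLite route mentioned above looks roughly like this (a sketch using the onnx and onnx-tf packages; file names are hypothetical):

import onnx
import tensorflow as tf
from onnx_tf.backend import prepare

# Convert the ONNX model to a TensorFlow SavedModel.
onnx_model = onnx.load("model.onnx")  # hypothetical file name
tf_rep = prepare(onnx_model)
tf_rep.export_graph("saved_model")

# Convert the SavedModel to TFLite.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)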

Any chance this can be added? Both exporting to TFLite and to ONNX would be useful.

Many thanks!!!

Your contribution

Apologies, I cannot contribute.

AnonymUnsichtbar commented 9 months ago

Having the same problem. Even saving the model in TensorFlow format is difficult, because I cannot save the encoder and decoder separately, and converting from ONNX to TFLite adds custom ops that I don't want.

fxmarty commented 8 months ago

@AnonymUnsichtbar @Gabriel-Kissin To be fair, TFLite support is currently quite minimal; only a few simple architectures are supported ('albert', 'bert', 'camembert', 'convbert', 'deberta', 'deberta_v2', 'distilbert', 'electra', 'flaubert', 'mobilebert', 'mpnet', 'resnet', 'roberta', 'roformer', 'xlm', 'xlm_roberta').

We haven't added support for decoders/encoder-decoder yet.
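For one of the supported encoder-only architectures, the export works along these lines (following the example in the Optimum documentation; the output directory name is arbitrary):

optimum-cli export tflite --model bert-base-uncased --sequence_length 128 bert_tflite/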

> tflite adds custom ops

I wonder if this is related to the fact that we use subgraphs in ONNX to handle past key values (the KV cache) in a single ONNX model. Have you tried exporting to TFLite from e.g. decoder_with_past_model.onnx instead of decoder_model_merged.onnx? A quick way to check a file for such subgraphs is sketched below.
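For reference, the merged decoder wraps the with-past and without-past branches in an ONNX If node, so the check looks like this (file name hypothetical):

import onnx

# Control-flow ops (If/Loop/Scan) carry nested subgraphs, which some
# downstream converters handle poorly.
model = onnx.load("decoder_model_merged.onnx")
has_subgraphs = any(node.op_type in ("If", "Loop", "Scan") for node in model.graph.node)
print("contains control-flow subgraphs:", has_subgraphs)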

WeiXiaoSummer commented 8 months ago

I'm also interested in being able to export audio-classification Whisper to ONNX! This would be a huge help!!!

fxmarty commented 8 months ago

Hi, export of Transformers Whisper to ONNX for audio-classification was merged in https://github.com/huggingface/optimum/pull/1727. For example:

optimum-cli export onnx --model shhossain/whisper-tiny-bn-emo whisper_onnx
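Once exported, the model can be loaded back through Optimum's ONNX Runtime integration, along these lines (a sketch; assumes the whisper_onnx directory produced by the command above and 16 kHz audio input):

import numpy as np
from transformers import AutoFeatureExtractor
from optimum.onnxruntime import ORTModelForAudioClassification

feature_extractor = AutoFeatureExtractor.from_pretrained("whisper_onnx")
model = ORTModelForAudioClassification.from_pretrained("whisper_onnx")

# One second of silence at 16 kHz as a dummy waveform (replace with real audio).
audio = np.zeros(16000, dtype=np.float32)
inputs = feature_extractor(audio, sampling_rate=16000, return_tensors="pt")

logits = model(**inputs).logits
predicted_id = logits.argmax(-1).item()
print(model.config.id2label[predicted_id])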

For TFLite, there are no short-term plans, but I am happy to review PRs.