huggingface / optimum


Support onnx export of microsoft/speecht5_hifigan and microsoft/speecht5_vc #1573

Open matbee-eth opened 11 months ago

matbee-eth commented 11 months ago

Feature request

Exporting microsoft/speecht5_vc fails with:

ValueError: Asked to export a speecht5 model for the task audio-to-audio (auto-detected), but the Optimum ONNX exporter only supports the tasks text-to-audio for speecht5. Please use a supported task. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the task audio-to-audio to be supported in the ONNX export for speecht5.

Exporting microsoft/speecht5_hifigan fails with:

KeyError: "The task could not be automatically inferred. Please provide the argument --task with the relevant task from token-classification, zero-shot-image-classification, image-to-image, zero-shot-object-detection, semantic-segmentation, text2text-generation, feature-extraction, mask-generation, depth-estimation, stable-diffusion, image-to-text, image-segmentation, audio-classification, text-classification, automatic-speech-recognition, conversational, audio-xvector, stable-diffusion-xl, question-answering, multiple-choice, text-generation, audio-frame-classification, image-classification, fill-mask, object-detection, text-to-audio, masked-im. Detailed error: 'Could not find the proper task name for SpeechT5HifiGan.'"
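
For anyone trying to reproduce the two errors above, here is a minimal sketch. It assumes they came from the `optimum-cli export onnx` command, since the second message mentions the `--task` argument; the exact invocation is not given in this report. The helper name `build_export_command` and the output directory names are hypothetical. The script only builds and prints the commands, it does not run them.

```python
import shlex


def build_export_command(model_id, output_dir, task=None):
    """Return the argv list for an `optimum-cli export onnx` call.

    Hypothetical helper for illustration; it is not part of Optimum.
    """
    cmd = ["optimum-cli", "export", "onnx", "--model", model_id]
    if task is not None:
        # --task overrides the exporter's automatic task detection.
        cmd += ["--task", task]
    # The output directory is the positional argument of the command.
    cmd.append(output_dir)
    return cmd


# Output directory names are assumptions, not taken from the report.
commands = [
    build_export_command("microsoft/speecht5_vc", "speecht5_vc_onnx"),
    build_export_command("microsoft/speecht5_hifigan", "speecht5_hifigan_onnx"),
]

for cmd in commands:
    # Print a copy-pasteable shell line for each export attempt.
    print(shlex.join(cmd))
```

To actually attempt an export, each list can be passed to `subprocess.run(cmd, check=True)` in an environment with Optimum installed. Whether any `--task` value lets either checkpoint export is not established here.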

Motivation

I am trying to run AudioLDM in ONNX format so that it can be used with onnxruntime-web.

Your contribution

I am a novice with ONNX, so any contribution I could make would likely be of limited use.

matbee-eth commented 11 months ago

Possibly related? https://github.com/huggingface/optimum/pull/1552