huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

[ONNX export] Musicgen for text-to-audio #1297

Closed xenova closed 5 months ago

xenova commented 1 year ago

Feature request

Musicgen was recently added to 🤗 Transformers (model doc) and it would be great to be able to export those models to ONNX with Optimum.

Motivation

This will allow me to support music generation models in Transformers.js.

Your contribution

I will integrate it into transformers.js once it is available in Optimum.

kanger45 commented 1 year ago

Hi, I'm also interested in converting the Musicgen model to ONNX format so I can try deploying it on a device. May I know whether it is supported in Optimum now?

MaiZhiHao commented 1 year ago

It would be great if this feature were implemented. Btw, how can I get transformers.js?

xenova commented 1 year ago

> May I know whether it is supported in Optimum now?

Not yet 😇 cc @fxmarty

> Btw, how can I get transformers.js?

You can check out the repo here or the documentation here. However, since Musicgen is not yet supported in Optimum, it won't be available in transformers.js until it is.

kanger45 commented 1 year ago

Hi @xenova,

May I know if there is a plan or schedule for Optimum to support converting it to an ONNX model?

zeke-john commented 8 months ago

Any update?

fxmarty commented 5 months ago

Hi @kanger45 @MaiZhiHao @zeke-john, https://github.com/huggingface/optimum/pull/1779 is merged. It exports Musicgen in several parts to generate audio samples conditioned on a text prompt (reference: https://huggingface.co/docs/transformers/model_doc/musicgen#text-conditional-generation). The export uses the decoder KV cache. The following subcomponents are exported:

The exported models are usable in, e.g., transformers.js; for now, there is no runtime implementation for them in Optimum.
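
For reference, an export like the one described above can be started through the `optimum-cli export onnx` command. The sketch below only assembles (and optionally runs) that command from Python. `facebook/musicgen-small` is the checkpoint linked elsewhere in this thread; the helper names and the output directory are made up for illustration, and it assumes an Optimum release that already contains the merged PR.

```python
import subprocess
from typing import List


def build_export_command(model_id: str, output_dir: str) -> List[str]:
    """Assemble the `optimum-cli export onnx` invocation as an argument list."""
    # Hypothetical helper: it only builds the command and does not run it.
    return ["optimum-cli", "export", "onnx", "--model", model_id, output_dir]


def run_export(model_id: str, output_dir: str) -> None:
    """Run the export.

    Assumes `optimum` is installed and that the checkpoint can be fetched
    from the Hub (or that `model_id` is a local Transformers-style folder).
    """
    subprocess.run(build_export_command(model_id, output_dir), check=True)
```

Calling `run_export("facebook/musicgen-small", "musicgen_onnx")` should then leave the exported `.onnx` files in the `musicgen_onnx` directory.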

zeke-john commented 5 months ago

@fxmarty Would this work for models fine-tuned from Musicgen? I used this repo to fine-tune the medium model, and the output is a .pt model.

fxmarty commented 5 months ago

@zeke-john Yes, it should work as long as the checkpoint (and model repo) follows the Transformers format (e.g. https://huggingface.co/facebook/musicgen-small/tree/main). .bin and .safetensors are supported; I'm not sure about .pt.
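
A quick way to sanity-check a local folder before attempting the export is sketched below. It is only a heuristic, not something Optimum itself does: it assumes a Transformers-style checkpoint has a `config.json` next to at least one `.safetensors` or `.bin` weight file, and the function name is hypothetical.

```python
from pathlib import Path

# Weight formats mentioned above as supported; `.pt` is deliberately left out.
SUPPORTED_WEIGHT_SUFFIXES = (".safetensors", ".bin")


def looks_like_transformers_checkpoint(checkpoint_dir: str) -> bool:
    """Heuristic check: config.json plus at least one supported weight file."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return False
    has_config = (root / "config.json").is_file()
    has_weights = any(
        p.is_file() and p.suffix in SUPPORTED_WEIGHT_SUFFIXES
        for p in root.iterdir()
    )
    return has_config and has_weights
```

A folder that only contains a `.pt` file would fail this check, which matches the uncertainty about `.pt` support mentioned above.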

zeke-john commented 5 months ago

Are there any supported ways to fine-tune Musicgen besides the way I did it, so that it stays a Transformers model? Or can you convert a .pt model into the Transformers model format?

fxmarty commented 5 months ago

@zeke-john You could try https://github.com/huggingface/transformers/blob/main/src/transformers/models/musicgen/convert_musicgen_transformers.py, which should allow you to do the conversion (audiocraft format to Transformers format).

Dannyjhl commented 3 months ago

@fxmarty After exporting the several ONNX models, how can we run them locally?
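
As a possible starting point, and not a full generation loop (the thread notes that Optimum has no runtime implementation for these models), the sketch below opens each exported file with ONNX Runtime and lists its input and output names. It assumes `onnxruntime` is installed and that the `.onnx` files sit directly in the export directory; the function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def find_onnx_files(export_dir: str) -> List[Path]:
    """Return the .onnx files found directly in `export_dir`, sorted by name."""
    return sorted(Path(export_dir).glob("*.onnx"))


def describe_sessions(export_dir: str) -> Dict[str, Dict[str, List[str]]]:
    """Load every exported model on CPU and collect its input/output names."""
    # Imported lazily so that find_onnx_files works without onnxruntime.
    import onnxruntime as ort

    summary: Dict[str, Dict[str, List[str]]] = {}
    for path in find_onnx_files(export_dir):
        session = ort.InferenceSession(
            str(path), providers=["CPUExecutionProvider"]
        )
        summary[path.name] = {
            "inputs": [node.name for node in session.get_inputs()],
            "outputs": [node.name for node in session.get_outputs()],
        }
    return summary
```

Knowing each sub-model's inputs and outputs is the first step toward chaining them by hand; the actual text-to-audio loop (token sampling, KV cache handling) would still have to be written separately.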