declare-lab / tango

A family of diffusion models for text-to-audio generation.
https://tango2-web.github.io/

Can this model be used for music generation? #15

Open gandolfxu opened 1 year ago

gandolfxu commented 1 year ago

Quality is first!

We have no text input. How can I use this repo to train?

deepanwayx commented 1 year ago

You need to create a JSON file with the music file locations and captions, similar to the JSON files we have provided in our data directory. You can then pass those JSON files to the train.py script for training.
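For illustration, here is a minimal sketch of building such a file. It assumes one JSON object per line with the keys `dataset`, `location`, and `captions`; those key names, the `build_manifest` helper, and the `my_music` dataset label are assumptions, so compare against the files in the data directory before training.

```python
import json
from pathlib import Path


def build_manifest(audio_dir, captions, out_path, dataset_name="my_music"):
    """Write one JSON object per line for every .wav file in audio_dir.

    `captions` maps a file name to its caption; files without an entry
    get an empty caption. The key names below are assumed, not confirmed.
    """
    rows = []
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        rows.append({
            "dataset": dataset_name,
            "location": str(wav),
            "captions": captions.get(wav.name, ""),
        })
    with open(out_path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    return rows
```

Calling `build_manifest("music/", {"track1.wav": "slow piano ballad"}, "train_music.json")` would write one line per audio file found under `music/`; the paths and caption here are placeholders.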

gandolfxu commented 1 year ago

@deepanwayx No captions are available for our music data. How should I prepare the dataset?

deepanwayx commented 1 year ago

Do you have any metadata available about the music files? I think those can be used as captions for training TANGO for music generation. Otherwise, you can try using empty strings as captions, but then the model would be trained for unconditional music generation, and you would not have any control over the generated outputs.
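As a rough sketch of the metadata idea, the helper below joins whatever fields are present into a short caption and falls back to an empty string when nothing usable exists. The field names (`genre`, `mood`, `instruments`, `tempo_bpm`) and the `caption_from_metadata` function are hypothetical; adapt them to whatever metadata your files actually carry.

```python
def caption_from_metadata(meta):
    """Turn a metadata dict into a caption; return "" if nothing is usable."""
    genre = str(meta.get("genre") or "").strip()
    mood = str(meta.get("mood") or "").strip()
    instruments = [
        str(name).strip()
        for name in (meta.get("instruments") or [])
        if str(name).strip()
    ]
    tempo = meta.get("tempo_bpm")

    # Lead with mood and genre, e.g. "calm jazz music".
    head = " ".join(part for part in (mood, genre) if part)
    parts = [f"{head} music"] if head else []
    if instruments:
        parts.append("featuring " + ", ".join(instruments))
    if tempo:
        parts.append(f"at {int(tempo)} BPM")

    # With no metadata this is the empty caption, i.e. unconditional training.
    return " ".join(parts)


example = caption_from_metadata(
    {"genre": "jazz", "mood": "calm", "instruments": ["piano", "double bass"], "tempo_bpm": 90}
)
print(example)  # calm jazz music featuring piano, double bass at 90 BPM
```

Tracks with richer metadata get more specific captions, which should give more control at generation time than the empty-string fallback.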