declare-lab / tango

A family of diffusion models for text-to-audio generation.
https://tango2-web.github.io/

Producing audio in different Sample Rate #29

Open cvillela opened 1 year ago

cvillela commented 1 year ago

Hey!

I was wondering whether it is possible to train the model on 48 kHz audio and then generate audio directly at 48 kHz. Has anyone attempted this?

deepanwayx commented 1 year ago

That is definitely possible and would be really great to have! We could not try this due to computational constraints.

cvillela commented 1 year ago

Awesome! Will try it out. How much VRAM do you think is necessary for attempting it?

cvillela commented 1 year ago

@deepanwayx Also, I see that the "Tango Prompt Bank" is all at 16,000 Hz. Would you guys have the raw, non-resampled dataset available?
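
For context on why the raw data matters here: audio stored at 16 kHz only holds content below 8 kHz (its Nyquist limit), so upsampling it back to 48 kHz cannot bring back the band that was discarded. Below is a minimal NumPy sketch of that effect, assuming an idealized FFT-based resampler. `fft_resample` and `band_energy` are hypothetical helpers written for this illustration; they are not part of Tango, and a real pipeline would more likely use a dedicated resampling library.

```python
import numpy as np


def fft_resample(x, num):
    """Resample x to `num` samples by truncating or zero-padding its spectrum.

    Illustrative helper (an assumption, not Tango code): it acts as an
    ideal low-pass filter at the lower of the two Nyquist frequencies.
    """
    spectrum = np.fft.rfft(x)
    target_bins = num // 2 + 1
    out = np.zeros(target_bins, dtype=complex)
    n = min(len(spectrum), target_bins)
    out[:n] = spectrum[:n]
    # irfft scales by 1/num, so rescale to keep amplitudes unchanged.
    return np.fft.irfft(out, n=num) * (num / len(x))


def band_energy(x, sr, lo, hi):
    """Sum of spectral power of x (sampled at sr Hz) in the band [lo, hi) Hz."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    power = np.abs(np.fft.rfft(x)) ** 2
    return power[(freqs >= lo) & (freqs < hi)].sum()


sr_hi, sr_lo = 48_000, 16_000
t = np.arange(sr_hi) / sr_hi  # one second of audio at 48 kHz

# Synthetic test signal: a 1 kHz tone plus a 12 kHz tone.
original = np.sin(2 * np.pi * 1_000 * t) + np.sin(2 * np.pi * 12_000 * t)

# Downsample to 16 kHz (as in a resampled dataset), then upsample back.
low = fft_resample(original, sr_lo)
restored = fft_resample(low, sr_hi)

# Energy below and above 8 kHz, before and after the round trip.
orig_low = band_energy(original, sr_hi, 0, 8_000)
orig_high = band_energy(original, sr_hi, 8_000, 24_001)
rest_low = band_energy(restored, sr_hi, 0, 8_000)
rest_high = band_energy(restored, sr_hi, 8_000, 24_001)

print(f"below 8 kHz: original={orig_low:.3e}, restored={rest_low:.3e}")
print(f"above 8 kHz: original={orig_high:.3e}, restored={rest_high:.3e}")
```

In this sketch the 1 kHz tone survives the round trip while the 12 kHz tone does not, which is why training at 48 kHz would need source audio that was never reduced to 16 kHz.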