bytedance / Make-An-Audio-2

A text-conditional diffusion probabilistic model capable of generating high-fidelity audio.
https://make-an-audio-2.github.io
MIT License

stereo and higher sampling rates #2

Open wlf0322 opened 3 weeks ago

wlf0322 commented 3 weeks ago

The model currently generates 16 kHz mono audio. Does it support stereo output or higher sampling rates?

Darius-H commented 3 weeks ago

The pre-trained model supports neither higher sample rates nor stereo audio generation. Generating higher-sample-rate or stereo audio requires retraining the model on data in the corresponding format.
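
If a downstream tool only needs a stereo file at a higher sample rate, a plain format conversion may be enough as a stopgap. The sketch below is not part of this repository: `to_stereo_at_rate` is a hypothetical helper, the 48 kHz target is an assumed example value, and linear interpolation stands in for a proper band-limited resampler. It adds no content above 8 kHz (the Nyquist limit of 16 kHz audio) and no real stereo image, since both channels are identical copies.

```python
import numpy as np


def to_stereo_at_rate(mono, src_rate=16000, dst_rate=48000):
    """Convert a 1-D mono signal to a (n_samples, 2) array at dst_rate.

    Hypothetical helper for illustration only. Linear interpolation is a
    crude resampler; a polyphase or sinc-based resampler would give
    better quality.
    """
    mono = np.asarray(mono, dtype=np.float32)
    if mono.ndim != 1:
        raise ValueError("expected a 1-D mono signal")

    # Number of output samples covering the same duration.
    n_out = int(round(len(mono) * dst_rate / src_rate))

    # Time stamps (in seconds) of the input and output samples.
    t_in = np.arange(len(mono)) / src_rate
    t_out = np.arange(n_out) / dst_rate

    # Interpolate onto the denser time grid; values past the last input
    # sample are held at that sample's value.
    upsampled = np.interp(t_out, t_in, mono)

    # Duplicate the single channel into left and right.
    return np.stack([upsampled, upsampled], axis=1).astype(np.float32)


# One second of a 440 Hz tone standing in for 16 kHz mono model output.
t = np.arange(16000) / 16000
tone = np.sin(2 * np.pi * 440 * t).astype(np.float32)
stereo = to_stereo_at_rate(tone)
```

Here `stereo` has one row per output sample and one column per channel. Genuinely wider bandwidth or spatial content would still require the retraining described above.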