gemelo-ai / vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
https://gemelo-ai.github.io/vocos/
MIT License
714 stars, 83 forks

Combine with super-resolution #33

Open Flux9665 opened 8 months ago

Flux9665 commented 8 months ago

Thank you for open sourcing this great work!

One of the great advantages I see in vocoders operating in the time domain is how easy it is to combine the vocoding task with super-resolution. You just upsample some more and use audio with a higher sampling rate as the target signal. Is the same somehow possible with Vocos? Could I train a model that takes 16 kHz spectrograms as input but produces a 24 kHz waveform?

lexkoro commented 6 months ago

@Flux9665 Yes, it's possible. Just as you said, you would feed 16 kHz spectrograms as input and use 24 kHz audio as the target.
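
For illustration, here is a minimal sketch of the one constraint this setup seems to imply: the input spectrogram and the output waveform have to share a frame rate, so each 16 kHz input frame must map to a whole number of 24 kHz output samples. The function name `matching_hop_length` and the hop values are hypothetical and are not taken from the Vocos code or configs. The sketch assumes the model emits a fixed number of output samples per input frame.

```python
from fractions import Fraction


def matching_hop_length(input_sr: int, input_hop: int, target_sr: int) -> int:
    """Return the output hop length (in samples at target_sr) that gives the
    same frame rate as input_hop gives at input_sr.

    Hypothetical helper for illustration only; not part of Vocos.
    """
    # Frame rate = sample rate / hop length, so equal frame rates require
    # target_hop = input_hop * target_sr / input_sr.
    hop = Fraction(input_hop * target_sr, input_sr)
    if hop.denominator != 1:
        raise ValueError(
            f"hop {input_hop} at {input_sr} Hz does not map to a whole "
            f"number of samples at {target_sr} Hz (got {float(hop)})"
        )
    return int(hop)


# Assumed example values: a 10 ms hop on the 16 kHz input side.
target_hop = matching_hop_length(input_sr=16000, input_hop=160, target_sr=24000)
print(target_hop)
```

With these assumed values, the output side would need a hop of 240 samples at 24 kHz, so both sides run at 100 frames per second. An input hop that does not scale to a whole number of output samples (for example 161) raises a `ValueError` instead of silently misaligning the target audio.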

sankar-mukherjee commented 5 months ago

Which settings in the config do I need to change for this?

bzp83 commented 1 month ago

I'm interested in this as well. Did you find out how to do it, @sankar-mukherjee?