Make the default mel spec compatible with vocos

Vocos is a small, fast vocoder. We previously used it for Voicebox because it handles both mel spec and Encodec-based vocoding.

This change makes the Vocos mel spec recipe the default (it seems to be a common spec for TTS systems), so you can pip install vocos, pass vocos.decode to the sampling function, and get audio output.
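As a rough illustration of what "compatible with vocos" means in practice, here is a minimal sketch. The parameter values (24 kHz, n_fft 1024, hop 256, 100 mel bins) and the checkpoint name `charactr/vocos-mel-24khz` are assumptions based on the publicly released Vocos mel model, not something stated in this PR, so check them against the Vocos config. `MelSpecConfig` and `load_vocos_decoder` are hypothetical helper names used only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MelSpecConfig:
    # Assumed values, believed to match the charactr/vocos-mel-24khz
    # checkpoint; verify against the Vocos config before relying on them.
    sample_rate: int = 24_000
    n_fft: int = 1024
    hop_length: int = 256
    n_mels: int = 100

    def num_frames(self, num_samples: int) -> int:
        """Frame count of a center-padded STFT over num_samples samples."""
        return num_samples // self.hop_length + 1


def load_vocos_decoder(checkpoint: str = "charactr/vocos-mel-24khz"):
    """Return a callable that maps a (batch, n_mels, frames) mel tensor to audio.

    Requires `pip install vocos`. The import is lazy so the rest of this
    sketch runs without the package installed.
    """
    from vocos import Vocos

    return Vocos.from_pretrained(checkpoint).decode


config = MelSpecConfig()
# One second of 24 kHz audio under the assumed recipe.
frames_per_second = config.num_frames(config.sample_rate)
```

The callable returned by `load_vocos_decoder()` is what would be handed to the sampling function as the vocoder. The exact argument name on the sampling function is not shown in this description, so it is left out here.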

This isn't strictly necessary to merge, but I thought it might be useful for folks who want to train a working network 'out of the box' without having to think about the transform!

lucidrains / e2-tts-pytorch

Make the default mel spec compatible with vocos #13