vvolhejn / thesis

ETH Zürich MSc Thesis: Accelerating Neural Audio Synthesis
Apache License 2.0

Implement multiband decomposition #13

Closed vvolhejn closed 2 years ago

vvolhejn commented 2 years ago

Use a filter bank (e.g. PQMF) to decompose a 16 kHz signal into N sub-band signals, each sampled at (16/N) kHz.

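RAVE uses this (as do some earlier papers), and it helps both the encoder and the decoder. For the encoder, it expands the receptive field of CNNs (and, to some extent, RNNs): each timestep of the decomposed signal covers N samples of the original, so the same stack of layers sees N times more audio. For the decoder, it makes generation a lot faster, since each output timestep yields N samples of the final signal instead of one.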
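To make the shapes concrete, here is a minimal sketch of a critically sampled decomposition and its inverse. It is only an illustration, not what RAVE or this repo does: it assumes N is a power of two and uses a recursive two-band Haar split instead of PQMF, because Haar gives exact reconstruction in a few lines of plain Python. The function names (`haar_split`, `haar_merge`, `decompose`, `reconstruct`) are made up for this example, and Haar filters separate frequencies far less sharply than a PQMF bank would.

```python
import math

_S = 1 / math.sqrt(2)  # orthonormal Haar scaling factor


def haar_split(x):
    """Split an even-length signal into a low and a high band at half the rate."""
    low = [(x[i] + x[i + 1]) * _S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * _S for i in range(0, len(x), 2)]
    return low, high


def haar_merge(low, high):
    """Exact inverse of haar_split: interleave the recovered sample pairs."""
    x = []
    for l, h in zip(low, high):
        x.append((l + h) * _S)
        x.append((l - h) * _S)
    return x


def decompose(x, n_bands):
    """Decompose x into n_bands signals of length len(x) // n_bands.

    Assumption for this sketch: n_bands is a power of two and divides len(x).
    The bands come out in tree order, which is not strictly sorted by frequency.
    """
    if n_bands < 1 or n_bands & (n_bands - 1):
        raise ValueError("n_bands must be a power of two")
    if len(x) % n_bands:
        raise ValueError("signal length must be divisible by n_bands")
    bands = [list(x)]
    while len(bands) < n_bands:
        # Split every current band in two, doubling the band count.
        bands = [half for band in bands for half in haar_split(band)]
    return bands


def reconstruct(bands):
    """Undo decompose by merging neighbouring bands pairwise."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1]) for i in range(0, len(bands), 2)]
    return bands[0]


# One second of a 440 Hz sine at 16 kHz, split into 4 bands of 4 kHz each.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
bands = decompose(signal, n_bands=4)
restored = reconstruct(bands)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

With N = 4, the model sees 4 channels of 4000 timesteps instead of 1 channel of 16000, which is where both the larger receptive field and the faster decoding come from.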

vvolhejn commented 2 years ago

Done. Maybe we can implement the fast version later.