lmnt-com / diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Apache License 2.0

Class-conditional waveform generation on the SC09 dataset #29

Closed YIN95 closed 2 years ago

YIN95 commented 2 years ago

Hi,

I was wondering how the class-conditional generation is implemented.

Thanks!

sharvil commented 2 years ago

Class-conditional generation hasn't been implemented in this codebase.

It should be straightforward to modify the code to support it, since the architecture change is small (see Sec 3.2 / Global Conditioner in the DiffWave paper). Instead of conditioning on a mel spectrogram, you'd use an nn.Embedding layer to map each discrete label to a 128-dim vector, then expand that vector across the time dimension to match the length of the target signal.
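
A rough sketch of that idea in PyTorch, for illustration only. It is not code from this repository: the `ClassConditioner` name, the 10-class default (SC09 covers the spoken digits zero through nine), and the 16000-sample length are assumptions made for the example. How the resulting tensor gets wired into the residual layers in place of the mel spectrogram is left out.

```python
import torch
import torch.nn as nn


class ClassConditioner(nn.Module):
    """Hypothetical helper: turns class labels into a per-timestep conditioning tensor."""

    def __init__(self, n_classes: int = 10, embed_dim: int = 128):
        super().__init__()
        # One learned 128-dim vector per discrete class label.
        self.embedding = nn.Embedding(n_classes, embed_dim)

    def forward(self, labels: torch.Tensor, n_samples: int) -> torch.Tensor:
        # labels: [B] (int64) -> emb: [B, embed_dim]
        emb = self.embedding(labels)
        # Add a time axis and repeat the same vector at every timestep:
        # [B, embed_dim] -> [B, embed_dim, 1] -> [B, embed_dim, n_samples]
        return emb.unsqueeze(-1).expand(-1, -1, n_samples)


conditioner = ClassConditioner(n_classes=10, embed_dim=128)
labels = torch.tensor([3, 7])                # a batch of two class labels
cond = conditioner(labels, n_samples=16000)  # assumed: 1 second at 16 kHz
print(cond.shape)                            # torch.Size([2, 128, 16000])
```

The tensor has the layout [batch, channels, time], so a layer that previously consumed an 80-channel upsampled spectrogram would need its input channel count changed to 128 (or whatever embedding size is chosen). Because the label is already constant over time, no upsampling step is needed for this conditioner.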