veqtor opened 8 years ago
I am also interested in using other time series (e.g. audio features) as local conditioning input. It seems we would only need to modify the operation at https://github.com/ibab/tensorflow-wavenet/blob/master/wavenet/model.py#L210 to incorporate a second time series `y` (as it is named in the paper, page 5).
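To make the idea concrete, here is a minimal numpy sketch of what that change to the gated activation might look like. This is an assumption-laden illustration, not the repo's actual code: the shapes, the weight names `W_f`/`W_g`/`V_f`/`V_g` (following the paper's notation), and the plain matrix multiplies standing in for 1x1 convolutions are all hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Hypothetical shapes: T time steps, C channels for the audio path x
# and the local-conditioning series y (e.g. extracted audio features),
# assumed already aligned sample-by-sample with x.
T, C = 16, 4
rng = np.random.default_rng(0)
x = rng.standard_normal((T, C))   # dilated-conv output for the audio
y = rng.standard_normal((T, C))   # second time series (local conditioning)

# Stand-ins for 1x1 convolution weights; V_f and V_g are the extra
# terms the paper adds for local conditioning.
W_f, W_g = rng.standard_normal((C, C)), rng.standard_normal((C, C))
V_f, V_g = rng.standard_normal((C, C)), rng.standard_normal((C, C))

# Unconditioned gate, roughly what model.py computes today:
#   z = tanh(x W_f) * sigmoid(x W_g)
z_uncond = np.tanh(x @ W_f) * sigmoid(x @ W_g)

# Locally conditioned gate: add the y terms inside both nonlinearities,
#   z = tanh(x W_f + y V_f) * sigmoid(x W_g + y V_g)
z_cond = np.tanh(x @ W_f + y @ V_f) * sigmoid(x @ W_g + y @ V_g)
print(z_cond.shape)  # (16, 4)
```

The only structural change is the two added `y @ V` terms; the rest of the residual block would stay as-is.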
With audio as the local input, we could explore using WaveNets for audio processing: could they simulate reverbs, or non-linear, time-varying effects such as chorus and flanger? Furthermore, given a built-in sample-rate converter, we could probably train a network to upsample low-rate audio (8 kHz or 16 kHz) to 44.1 kHz by estimating the missing high frequencies; this probably would not require a very long receptive field. Perhaps the same idea could be applied to bit-depth estimation (converting from 8-bit to 16-bit).
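A cheap way to get training data for that upsampling idea is to build (input, target) pairs from the high-rate audio itself. A rough numpy sketch, assuming a synthetic sine as a stand-in for real 44.1 kHz audio and a 4x decimation factor as a stand-in for the 8/16 kHz rates; a real pipeline would low-pass filter before decimating to avoid aliasing:

```python
import numpy as np

factor = 4                                  # hypothetical rate ratio
t = np.arange(0, 4096)
hi = np.sin(2 * np.pi * t / 64.0)           # stand-in for high-rate target
lo = hi[::factor]                           # crude decimation to the low rate
lo_up = np.repeat(lo, factor)               # zero-order hold back to full length

# (lo_up, hi) is one aligned input/target pair; the network would learn
# to recover the high-frequency content that decimation removed.
assert lo_up.shape == hi.shape
```

Since input and target have the same length after the naive upsample, the pair drops straight into a WaveNet-style architecture with `lo_up` as the local conditioning series and `hi` as the prediction target.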
Flipboard uses deep networks to upscale images, and it would make sense that we could do the same for audio. That way we could stick to low sample rates and bit depths for generation, and attach a cheaper upscaling network to get better-quality sound.