Generic Global Conditioning

ibab / tensorflow-wavenet

A TensorFlow implementation of DeepMind's WaveNet paper

MIT License


A suggestion: add an option to load a JSON file alongside each audio file (foo.wav + foo.json) containing an array of gc_channels floats, which is then used as the global conditioning vector for that file. That way, one can devise whatever GC scheme is appropriate for the application, and the same approach could also work with other types of data readers. At generation time, one would likewise specify a JSON file to condition on.
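To make the proposal concrete, here is a minimal sketch of how such a sidecar file might be read. Everything in it is an assumption for illustration: the function names `sidecar_path` and `load_gc_vector` are hypothetical, nothing like them exists in the repository as far as this issue states, and the only convention taken from the suggestion itself is the foo.wav + foo.json pairing with gc_channels floats per file.

```python
import json
import os


def sidecar_path(wav_path):
    """Map 'dir/foo.wav' to 'dir/foo.json' (assumed naming convention)."""
    root, _ = os.path.splitext(wav_path)
    return root + ".json"


def load_gc_vector(wav_path, gc_channels):
    """Read the conditioning vector stored next to an audio file.

    Hypothetical helper: expects the JSON file to hold a flat array
    of exactly `gc_channels` numbers.
    """
    with open(sidecar_path(wav_path)) as f:
        values = json.load(f)
    if not isinstance(values, list) or len(values) != gc_channels:
        raise ValueError(
            "expected a JSON array of %d floats in %s"
            % (gc_channels, sidecar_path(wav_path))
        )
    # Convert ints such as 1 to floats so the vector has a uniform type.
    return [float(v) for v in values]
```

A data reader could call `load_gc_vector` once per audio file and feed the result wherever a conditioning vector is expected; the same function could read the file given at generation time.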

Use cases: Music (the WaveNet paper describes using global conditioning with tags and music descriptors to train and generate music of specific genres), Speech generation (map voice descriptors such as formant content and average fundamental frequency to scalars and see whether those can be interpolated at generation time), Sound experiments (can we use WaveNet to emulate specific synthesizers?)
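For the interpolation idea in the speech use case, a rough sketch of blending two conditioning vectors before generation might look like this. The function name `interpolate_gc` is hypothetical, and plain linear interpolation is only one possible choice; whether the model responds smoothly to in-between values is exactly the open question.

```python
def interpolate_gc(vec_a, vec_b, t):
    """Linearly blend two conditioning vectors of equal length.

    t = 0.0 returns vec_a, t = 1.0 returns vec_b.
    Hypothetical helper, not part of the repository.
    """
    if len(vec_a) != len(vec_b):
        raise ValueError("conditioning vectors must have the same length")
    return [(1.0 - t) * a + t * b for a, b in zip(vec_a, vec_b)]
```

Writing the blended vector to a JSON file would then give a conditioning input that lies between the two original descriptors.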

Generic Global Conditioning #185