PyTorch implementation of Synthesizing Audio with Generative Adversarial Networks (Chris Donahue, Feb 2018).
Before running, make sure you have the sc09 dataset and place it under your current working directory.
Installation
sudo apt-get install libav-tools
Download dataset
sc09: sc09 raw WAV files, utterances of spoken English words '0'-'9'
piano: Piano raw WAV files
Run
For the sc09 task, make sure the sc09 dataset is under your current project directory before running the code.
$ python train.py
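The requirement that the dataset sit under the project directory can be checked before starting a long training run. The following is a minimal sketch, not part of this repo: find_wav_files is a hypothetical helper, and it assumes the dataset folder is literally named sc09 and contains files with a .wav extension.

```python
from pathlib import Path


def find_wav_files(dataset_dir):
    """Return all .wav files under dataset_dir (searched recursively), sorted."""
    root = Path(dataset_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {root.resolve()}")
    # Match the extension case-insensitively so .WAV files are also picked up.
    wav_files = sorted(p for p in root.rglob("*") if p.suffix.lower() == ".wav")
    if not wav_files:
        raise FileNotFoundError(f"No .wav files found under {root.resolve()}")
    return wav_files


# "sc09" is an assumed folder name, relative to the current working directory.
try:
    files = find_wav_files("sc09")
    print(f"Found {len(files)} WAV files")
except FileNotFoundError as err:
    print(f"Dataset check failed: {err}")
```

If the check fails, move or symlink the dataset into the project directory before running train.py.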
SC09 dataset: 4 x Tesla P40 takes nearly 2 days to reach reasonable results.
Piano dataset: 2 x Tesla P40 takes 3-6 hours to reach reasonable results.
Increasing BATCH_SIZE from 10 to 32 or 64 shortens per-epoch time on multiple GPUs, but each epoch then makes fewer gradient updates, so learning progresses more slowly per epoch.
Generated "0-9": https://soundcloud.com/mazzzystar/sets/dcgan-sc09
Generated piano: https://soundcloud.com/mazzzystar/sets/wavegan-piano
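As a rough illustration of the BATCH_SIZE note in the training-time remarks above, the number of batches per epoch falls as the batch size grows. The sketch below is plain arithmetic and does not use any code from this repo. NUM_EXAMPLES is a made-up dataset size chosen only to give round numbers, and it assumes each batch drives roughly one optimizer step.

```python
import math


def batches_per_epoch(num_examples, batch_size):
    """Number of batches needed for one full pass over the data."""
    return math.ceil(num_examples / batch_size)


# Hypothetical dataset size, for illustration only.
NUM_EXAMPLES = 16_000

for batch_size in (10, 32, 64):
    print(batch_size, batches_per_epoch(NUM_EXAMPLES, batch_size))
# prints:
# 10 1600
# 32 500
# 64 250
```

With fewer batches per epoch, each epoch finishes sooner on multiple GPUs but applies fewer updates to the model, which is one plausible reading of why larger batches appear to learn more slowly per epoch.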
Loss curve:
This repo is based on chrisdonahue's and jtcramer's implementations.