Open dengyan opened 7 years ago
Using a p3.16xlarge AWS instance (NVIDIA Tesla V100, CUDA 9).
This appears to run 10x the speed of dengyan's setup.
It takes us 1000 seconds to generate 4-minute audio files.
If we generate 100 of these in parallel, that's 24 seconds of generated audio for every 1 second of processing.
If we generate only 1 of these, that's 0.24 seconds of generated audio for every 1 second of processing.
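The arithmetic behind those two figures, as a small sketch. The function name is just for illustration, and it assumes (as the numbers above do) that a batch of 100 finishes in the same 1000 seconds as a single file.

```python
def audio_seconds_per_compute_second(num_clips, clip_seconds, wall_seconds):
    """Seconds of audio produced per second of wall-clock processing."""
    return num_clips * clip_seconds / wall_seconds


CLIP_SECONDS = 4 * 60   # one 4-minute file
WALL_SECONDS = 1000     # reported generation time

# One file at a time.
single = audio_seconds_per_compute_second(1, CLIP_SECONDS, WALL_SECONDS)

# 100 files in parallel, assuming the wall-clock time stays at 1000 s.
batch = audio_seconds_per_compute_second(100, CLIP_SECONDS, WALL_SECONDS)

print(f"single: {single:.2f} s of audio per s of processing")
print(f"batch of 100: {batch:.2f} s of audio per s of processing")
```

Anything below 1.0 is slower than real time, so the single-file case is roughly 4x slower than real time while the batched case is well above it.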
I found that SampleRNN needs to be run in parallel to get fast generation speed. It takes only about 500 seconds to generate 200 utterances, each 8 seconds of speech. But it is very time-consuming when generating only one sentence: more than 40 seconds for 1 second of speech. It seems it's not faster than WaveNet. Does anyone have ideas on speeding it up?
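A rough sketch of what those numbers imply, using only the figures from the post. The linear cost model (a fixed per-step overhead plus a small per-stream cost) is an assumption made for illustration, not something measured, and "more than 40 seconds" is treated as exactly 40, so the single-stream figures are best-case.

```python
# Figures quoted in the post.
batch_size = 200
utterance_seconds = 8
batch_wall_seconds = 500
single_wall_per_audio_second = 40  # lower bound ("more than 40 seconds")

# Throughput: seconds of speech produced per second of processing.
batch_throughput = batch_size * utterance_seconds / batch_wall_seconds
single_throughput = 1 / single_wall_per_audio_second
speedup = batch_throughput / single_throughput

# Wall-clock cost of advancing the audio timeline by one second.
batch_cost = batch_wall_seconds / utterance_seconds  # all 200 streams together
single_cost = single_wall_per_audio_second           # one stream

# Assumed model: cost(B) = fixed + per_stream * B, fitted to the two points.
per_stream = (batch_cost - single_cost) / (batch_size - 1)
fixed = single_cost - per_stream

print(f"batch throughput:  {batch_throughput:.3f} s speech per s")
print(f"single throughput: {single_throughput:.3f} s speech per s")
print(f"speedup from batching: {speedup:.0f}x")
print(f"fixed share of single-stream cost: {fixed / single_cost:.1%}")
```

Under that assumption almost all of the single-stream time is a fixed per-step cost that does not grow with the batch, which fits the observation that generation is only fast when many utterances are run in parallel.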