vincentherrmann / pytorch-wavenet

An implementation of WaveNet with fast generation
MIT License

CUDNN Dilation #5

Open Algomancer opened 6 years ago

Algomancer commented 6 years ago

What is the motivation for having a custom dilation function rather than using PyTorch's built-in dilation?
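(For reference, by built-in dilation I mean the `dilation` argument of `torch.nn.Conv1d`, along these lines:)

```python
import torch
import torch.nn as nn

# Built-in dilation: the convolution itself skips samples,
# no reshaping of the input tensor is needed.
conv = nn.Conv1d(in_channels=32, out_channels=32, kernel_size=2, dilation=4)

x = torch.randn(1, 32, 1000)   # (batch, channels, time)
y = conv(x)                    # time axis shrinks by dilation * (kernel_size - 1)
print(y.shape)                 # torch.Size([1, 32, 996])
```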

vincentherrmann commented 6 years ago

The dilation gets set back to 1 after each block. This means that the block needs to have multiple outputs with an interval of one sample. We could achieve this by using multiple dilated convolutions in parallel and concatenating the results, but I think it's clearer to reshape the input of each layer so that we move all odd indices from the time dimension to new indices in the batch dimension - which is exactly what dilate() does. This also has the advantage that we can replace dilate() with a dilation function that buffers previous results in queues and then get fast generation almost for free.
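Roughly, the reshaping works like this (a simplified sketch of the idea, not necessarily the exact `dilate()` in this repo; it only handles increasing the dilation):

```python
import torch
import torch.nn.functional as F

def dilate_sketch(x, dilation, init_dilation=1):
    """Fold time steps into the batch dimension so that a convolution with
    dilation 1 on the result acts like a dilated convolution on the original.

    x: tensor of shape (batch * init_dilation, channels, time)
    """
    n, c, l = x.size()
    factor = dilation // init_dilation
    if factor == 1:
        return x

    # left-pad the time axis so its length divides evenly by the factor
    if l % factor != 0:
        pad = factor - l % factor
        x = F.pad(x, (pad, 0))
        l += pad

    # (n, c, l) -> (c, l, n) -> (c, l // factor, n * factor) -> (n * factor, c, l // factor)
    x = x.permute(1, 2, 0).contiguous()
    x = x.view(c, l // factor, n * factor)
    x = x.permute(2, 0, 1).contiguous()
    return x
```

Applying a kernel_size=2, dilation=1 convolution to the folded tensor then combines each sample with the one `dilation` steps before it in the original signal.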

For the last block we could indeed use dilated convolutions because we need only one output. But this would make the code unnecessarily complex, I think. And as it is we can calculate multiple successive outputs at once, although I don't know how useful this is because successive samples are usually very highly correlated.
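And regarding the queue-based fast generation mentioned above, the gist is roughly this (a hypothetical, simplified sketch, not the actual implementation):

```python
import collections
import torch

class GenerationQueue:
    """Per-layer FIFO for fast sampling (simplified sketch).

    Each layer keeps its most recent `dilation` outputs, so producing the next
    sample only needs one new value per layer instead of re-running the whole
    dilated convolution over the receptive field.
    """

    def __init__(self, dilation, channels):
        # pre-fill with zeros so the first pops are well-defined
        self.queue = collections.deque(
            [torch.zeros(channels) for _ in range(dilation)], maxlen=dilation
        )

    def push_pop(self, new_value):
        # the front of the queue is the value from `dilation` steps ago,
        # which is exactly the second input the dilated filter needs
        old_value = self.queue[0]
        self.queue.append(new_value)  # oldest entry drops out automatically
        return old_value
```

During generation each layer pushes its newest output and pops the value from `dilation` steps back, so one sampling step costs only a handful of small matrix multiplies per layer.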

f90 commented 5 years ago

I am wondering if you have any data or intuition on whether using PyTorch's built-in dilated convolutions would speed up training significantly? I am noticing that the network is extremely slow to train even with just 1-2M parameters...