Algomancer opened this issue 6 years ago
The dilation gets set back to 1 after each block. This means that the block needs to have multiple outputs with an interval of one sample. We could achieve this by using multiple dilated convolutions in parallel and concatenating the results, but I think it's clearer to reshape the input of each layer so that all odd indices are moved from the time dimension to new indices in the batch dimension, which is exactly what dilate() does. This also has the advantage that we can replace dilate() with a dilation function that buffers previous results in queues and then get fast generation almost for free.
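To make the reshaping concrete, here is a minimal sketch of the idea for a single dilation step (the actual dilate() also handles padding and undoing a previous dilation; the function and variable names below are just for illustration):

```python
import torch

def dilate_sketch(x, factor):
    """Fold `factor` interleaved sub-sequences of the time axis into the batch dimension.

    x: tensor of shape (batch, channels, time), with time divisible by factor.
    Returns a tensor of shape (batch * factor, channels, time // factor), where
    sub-sequence k contains time steps k, k + factor, k + 2*factor, ...
    A kernel-2 convolution with dilation 1 on the result then acts like a
    convolution with dilation `factor` on the original sequence.
    """
    n, c, t = x.size()
    assert t % factor == 0, "pad the time axis first"
    # group consecutive samples: (n, c, t) -> (n, c, t // factor, factor)
    x = x.view(n, c, t // factor, factor)
    # move the phase index in front of the batch dimension
    x = x.permute(3, 0, 1, 2).contiguous()
    # merge phase and batch: (factor, n, c, t // factor) -> (factor * n, c, t // factor)
    return x.view(factor * n, c, t // factor)

# Example: for factor 2, even time steps go to one batch entry, odd ones to another.
x = torch.arange(8.).view(1, 1, 8)          # [[0, 1, 2, 3, 4, 5, 6, 7]]
y = dilate_sketch(x, 2)
# y[0, 0] == [0., 2., 4., 6.],  y[1, 0] == [1., 3., 5., 7.]
```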
For the last block we could indeed use dilated convolutions, because there we need only one output. But I think this would make the code unnecessarily complex. And as it is, we can calculate multiple successive outputs at once, although I don't know how useful this is, because successive samples are usually very highly correlated.
I am wondering if you have any data/intuition on whether using PyTorch's built-in dilated convolutions would speed up training time a lot? I am noticing that the network is immensely slow to train even with just 1-2M parameters...
What is the motivation for having a custom dilation function rather than using PyTorch's built-in dilation?
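For reference, the built-in dilation I mean is just the dilation argument of nn.Conv1d, e.g. (a sketch with placeholder channel counts and kernel size):

```python
import torch.nn as nn

# One layer of a WaveNet-style stack using PyTorch's own dilation support.
# Causality is usually enforced by left-padding the input with
# (kernel_size - 1) * dilation zeros before the convolution.
conv = nn.Conv1d(in_channels=32, out_channels=32, kernel_size=2, dilation=4)
```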