Closed: ironflood closed this issue 6 years ago
I think you're guessing correctly. WaveNet can only ever predict one sample in advance, so all but the last one have to be available as input. The default case, as explained in the paper, is `output_length=1`. Using a greater `output_length` is just a trick for more efficient training, because we can reuse most of the values calculated in the hidden layers to compute the next predicted sample. It looks something like this (`o` is the predicted sample):
```
|----receptive_field----|o
 |----receptive_field----|o
  |----receptive_field----|o
   |----receptive_field----|o
    |----receptive_field----|o
     |----receptive_field----|o
      |----receptive_field----|o
       |----receptive_field----|o
        |----receptive_field----|o
                         |--output_length--|
```
```
example: | | | | | | | | | | | | | | | | | | | | |
target:                          | | | | | | | | | |
```
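The windowing above can be sketched in plain NumPy (a minimal sketch; the sizes, variable names, and slicing here are illustrative assumptions, not the repository's actual `WavenetDataset` code):

```python
import numpy as np

# Hypothetical sizes, chosen small for illustration.
receptive_field = 8
output_length = 4

# A toy "audio" signal of integer sample values.
data = np.arange(20)

# One training example: the input must contain a full receptive
# field for *each* of the output_length predicted samples, so it is
# receptive_field + output_length - 1 samples long.
idx = 0
example = data[idx : idx + receptive_field + output_length - 1]

# Each target sample is the value one step after the window that
# predicts it, so the target is the tail of the input shifted by
# one, plus one new sample at the end.
target = data[idx + receptive_field : idx + receptive_field + output_length]

print(example)  # samples 0..10 (receptive_field + output_length - 1 = 11 values)
print(target)   # samples 8..11 (output_length = 4 values)
```

Note that `target[:-1]` coincides with the last `output_length - 1` samples of `example`; only the final target value lies outside the input window.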
Thank you @vincentherrmann, your explanation is much appreciated!
Hi @vincentherrmann. Thanks a lot for sharing, learning a great deal through your code! This isn't an issue, only a question about the code.
Visually, you describe in the code the model input (= receptive_field?) and the target as follows:
You also said a few days ago in a similar thread:
However, from my observation the target data for `output_length=16` shares the same values as the end of the input sequence generated by `WavenetDataset`, apart from the last value. Shouldn't the target sequence be the next sequence of data following the input instead? Or, put the other way around, I don't understand why the target sequence has an `output_length - 1` overlap with the end of the input; shouldn't it be the future data to be predicted, of length `output_length`? And shouldn't the one_hot input sequence be of length `model.receptive_field`?

To keep it visual like in the code, I observe the following:
Any pointers would be greatly appreciated :) If I had to guess, I'd say I'm missing something within the training loop; maybe it includes a moving window of size `receptive_field` to predict, one by one, the sample at index `last_value + 1`?
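That "moving window" intuition can be sketched as follows (a toy stand-in: a generic causal function replaces the real WaveNet stack, and all names and sizes are hypothetical). It shows why the input is `receptive_field + output_length - 1` samples long and why the targets overlap its tail:

```python
import numpy as np

receptive_field = 8   # hypothetical size
output_length = 4     # hypothetical size

def causal_predict(window):
    # Stand-in for the WaveNet stack: any function that looks only
    # at past samples. Here we just take the mean of the window.
    return window.mean()

# Input long enough to give every predicted sample a full
# receptive field: receptive_field + output_length - 1 samples.
x = np.arange(receptive_field + output_length - 1, dtype=float)

# One prediction per sliding receptive-field window. In the real
# model, all of these come out of a single forward pass, because
# overlapping windows share most of their hidden activations.
preds = np.array([causal_predict(x[i : i + receptive_field])
                  for i in range(output_length)])

print(preds.shape)  # one prediction per window: (output_length,)
```

Each prediction `preds[i]` is compared against the sample at index `i + receptive_field`, which for all but the last `i` is still inside the input; that is exactly the `output_length - 1` overlap observed above.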