Hi there!
Thanks for your awesome work and for sharing it.
As I understand the Deep Voice paper, the layer inference part (section 5.1.2.(d)) biases the skip connection only once. Since they implement the skip projection through accumulation, they set q^(0) to the bias and add it only once, at the beginning.
However, at line 87 in nv_wavenet_reference.cpp, q^(j) is biased by Bskip at every layer. Is this a small bug, or am I missing something?
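To make the question concrete, here is a minimal scalar sketch of the two accumulation schemes as I understand them. This is not the actual nv-wavenet code: the function names, the scalar `skip_out` values (standing in for each layer's skip projection without bias), and the bias arguments are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar stand-in for the skip accumulation; not the nv-wavenet code.
// skip_out[j] plays the role of layer j's skip projection, bias excluded.

// Scheme A (my reading of the paper): q^(0) is the bias, then layers accumulate
// their skip outputs with no further bias.
float accumulate_bias_once(const std::vector<float>& skip_out, float bias) {
    float q = bias;
    for (float s : skip_out) {
        q += s;
    }
    return q;
}

// Scheme B: q starts at zero and every layer adds a bias along with its skip output.
float accumulate_bias_per_layer(const std::vector<float>& skip_out,
                                const std::vector<float>& layer_bias) {
    assert(skip_out.size() == layer_bias.size());
    float q = 0.0f;
    for (std::size_t j = 0; j < skip_out.size(); ++j) {
        q += skip_out[j] + layer_bias[j];
    }
    return q;
}
```

If the same bias value is reused at each of L layers, scheme B ends up larger than scheme A by (L - 1) times that bias. If instead each layer has its own bias, the two schemes give the same result whenever the single bias in scheme A equals the sum of the per-layer biases in scheme B.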
Best,
Xuan