This PR adds nn.TemporalConvolution() support to your wonderful initialization implementation, plus a small correction for the nn.Linear layer: as far as I can tell, weight:size(1) is the fan-out and weight:size(2) is the fan-in. Am I right? Please check whether this is OK to use and merge.
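To make the fan-in/fan-out question concrete, here is a rough shape-only sketch in Python (not the Lua code in this PR). It assumes nn.Linear stores its weight as (outputSize, inputSize) and nn.TemporalConvolution as (outputFrameSize, inputFrameSize * kW). The helper names `fans_from_weight_shape` and `xavier_std` are made up for illustration, and treating the row count as the fan-out for the convolution is just one possible convention.

```python
import math

def fans_from_weight_shape(shape):
    """Return (fan_in, fan_out) for a 2-D weight laid out as (rows, cols).

    Assumption: rows = output units, cols = input units, so Lua's
    weight:size(1) maps to shape[0] and weight:size(2) to shape[1].
    """
    fan_out, fan_in = shape
    return fan_in, fan_out

def xavier_std(fan_in, fan_out):
    """Standard deviation used by Xavier/Glorot normal initialization."""
    return math.sqrt(2.0 / (fan_in + fan_out))

# nn.Linear(20, 10): weight assumed to be 10 x 20
linear_fans = fans_from_weight_shape((10, 20))

# nn.TemporalConvolution(4, 8, 3): weight assumed to be 8 x (4 * 3)
conv_fans = fans_from_weight_shape((8, 4 * 3))
```

If the real layout differs from these assumptions, only `fans_from_weight_shape` would need to change.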