Status: Closed (karpathy closed this issue 6 years ago)
karpathy:

Enjoyed reading the paper! I had a question about a possible paper-code discrepancy. In DilatedParllelResidualBlockB, the activation seems to be missing, although the paper claims that "All layers (convolution and ESP modules) are followed by a batch normalization [49] and a PReLU [50] non-linearity except for the last point-wise convolution". Is the code correct in stacking conv blocks with only batch norms in between?

Random note: instead of netParams, I like to use the following, which you may also prefer to save LOC :)
Maintainer:

Thanks for your code snippet; I will update my code with this tiny snippet.

If you look at the BR class, it has both batch normalization and activation:

self.bn = BR(nOut)

The variable name (self.bn) is confusing, so I will add a comment next to it to make this clear. Thanks for pointing it out.