Closed: CauchyComplete closed this issue 4 years ago.
Hi, yes, you are right. The model was trained about two years ago, so I cannot say for sure, but I suppose this was an implementation detail I did not mention in the paper for lack of space. If I remember correctly, the models performed slightly worse with plain ReLU; switching to LeakyReLU was a way to gain one or two points of accuracy. As for the general difference between ReLU and LeakyReLU: as you may know, LeakyReLU is a way to avoid dead neurons when training with high-variance gradients.
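To illustrate the dead-neuron point: with plain ReLU, a unit whose pre-activation is pushed negative (e.g. by a large, high-variance gradient update) gets zero gradient and can never recover, whereas LeakyReLU keeps a small slope on the negative side. This is a minimal NumPy sketch of that difference, not the actual MesoNet code; the slope value `alpha=0.1` here is illustrative, not taken from the repo.

```python
import numpy as np

def relu(x):
    """Standard ReLU: output and gradient are both zero for x < 0."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """ReLU derivative: 1 for x > 0, 0 otherwise."""
    return (x > 0).astype(float)

def leaky_relu(x, alpha=0.1):
    """LeakyReLU: small slope alpha for x < 0 keeps gradients flowing."""
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.1):
    """LeakyReLU derivative: 1 for x > 0, alpha otherwise."""
    return np.where(x > 0, 1.0, alpha)

# Pre-activations, including strongly negative ones:
x = np.array([-3.0, -0.5, 0.0, 2.0])

# For the negative entries, ReLU's gradient is exactly 0 ("dead"),
# while LeakyReLU's gradient stays at alpha, so the unit can recover.
print(relu_grad(x))        # [0. 0. 0. 1.]
print(leaky_relu_grad(x))  # [0.1 0.1 0.1 1. ]
```

In a network, this means a fully connected layer followed by LeakyReLU still receives a (scaled) error signal through its negative-output units during backpropagation, which matters most when gradient updates are noisy.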
Thank you :D
Hi, it seems that your implementation has a LeakyReLU between the two fully connected layers in both Meso4 and MesoInception4, but Figure 4 in your paper does not show a LeakyReLU there. Why the difference? Thanks!