heuritech / convnets-keras

MIT License

Divide alpha by n in LRN layer #6

Closed AlexanderFabisch closed 8 years ago

AlexanderFabisch commented 8 years ago

Hi,

your implementation of the LRN layer seems to be based on this implementation of the local response normalization layer. Compared to Caffe, it contains a bug: alpha is not divided by n as it should be. This pull request fixes that.
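
For clarity, here is a minimal NumPy sketch of the two conventions as I understand them (the function names and default values are only illustrative, not taken from either code base): the paper-style version uses alpha directly, while the Caffe-style version divides alpha by the window size n.

import numpy as np

def lrn_paper(a, k=2.0, alpha=1e-4, beta=0.75, n=5):
    # Paper-style LRN on a float array of shape (channels, height, width):
    # scale = k + alpha * (sum of squares over up to n neighbouring channels).
    ch = a.shape[0]
    half = n // 2
    sq = np.square(a)
    scale = np.full_like(a, k)
    for i in range(ch):
        lo, hi = max(0, i - half), min(ch, i + half + 1)
        scale[i] += alpha * sq[lo:hi].sum(axis=0)
    return a / scale ** beta

def lrn_caffe_style(a, k=2.0, alpha=1e-4, beta=0.75, n=5):
    # Same sum, but with alpha divided by n, which is what this PR proposes.
    return lrn_paper(a, k=k, alpha=alpha / n, beta=beta, n=n)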

leonardblier commented 8 years ago

Hi,

Thank you for pointing this out. In fact, we reproduced this implementation: https://github.com/uoguelph-mlrg/theano_alexnet/. It uses this method from pylearn2 for the CrossChannelNormalization: https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/expr/normalize.py

import theano.tensor as T  # needed for the tensor ops used below

class CrossChannelNormalization(object):
    def __init__(self, alpha=1e-4, k=2, beta=0.75, n=5):
        self.__dict__.update(locals())
        del self.self
        if n % 2 == 0:
            raise NotImplementedError("Only works with odd n for now")

    def __call__(self, c01b):
        # c01b has shape (channels, rows, cols, batch)
        half = self.n // 2
        sq = T.sqr(c01b)
        ch, r, c, b = c01b.shape
        # zero-pad the channel axis so every channel has n neighbours
        extra_channels = T.alloc(0., ch + 2*half, r, c, b)
        sq = T.set_subtensor(extra_channels[half:half+ch, :, :, :], sq)
        # note: alpha is used as-is here, it is not divided by n
        scale = self.k
        for i in xrange(self.n):
            scale += self.alpha * sq[i:i+ch, :, :, :]
        scale = scale ** self.beta
        return c01b / scale

I don't think that alpha is divided by n in this implementation. Do you agree? Have you tested it with the normalization? Are the results better?

AlexanderFabisch commented 8 years ago

I am sorry, that was my mistake.

I have a network architecture similar to AlexNet, trained on another dataset, and I tried to transfer the weights learned in Caffe to Keras. I don't know whether the results are better, but they are closer to those produced by Caffe. So it's probably not relevant for you.

I checked how LRN is described in the original paper (Section 3.3), and that seems to be consistent with your implementation. Caffe, then, seems to behave differently. I think you can close the pull request.
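
In case it helps anyone else loading Caffe models into a paper-style implementation like this one, here is a rough sketch of the parameter translation (the helper name and keyword names are hypothetical, not from this repo or from Caffe's API): fold Caffe's division by local_size into alpha before passing the parameters on.

def caffe_lrn_params_to_paper_style(local_size, alpha, beta, k):
    # Caffe effectively computes k + (alpha / local_size) * sum(sq),
    # while a paper-style layer computes k + alpha * sum(sq),
    # so dividing alpha by local_size makes the two match.
    return dict(n=local_size, alpha=alpha / local_size, beta=beta, k=k)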

Thanks!