ShichenLiu / CondenseNet

CondenseNet: Light weighted CNN for mobile devices
MIT License
694 stars 131 forks

Question on clamp #26

Closed: lizhenstat closed this issue 5 years ago

lizhenstat commented 5 years ago

Hi, I have a question about the clamp applied to the weights at https://github.com/ShichenLiu/CondenseNet/blob/master/layers.py#L125

weight = weight.sum(0).clamp(min=1e-6).sqrt()

I don't understand the purpose of the clamp function here. I trained CondenseNet-86 on CIFAR-10 with and without the clamp and got these results:

with clamp: error rate = 95.06

without clamp: error rate = 94.96

Thanks in advance

ShichenLiu commented 5 years ago

Hi,

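The clamp function here is there to prevent numerical issues. The derivative of x^p is p*x^(p-1). When p < 1, the exponent p-1 is negative, so x^(p-1) grows without bound as x approaches 0. Here p = 1/2 (the sqrt), so the gradient is 1/(2*sqrt(x)). Clamping x to at least 1e-6 keeps that gradient from becoming too large, which would otherwise cause numerical instability.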
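As a rough illustration of the point about the derivative of x^p for p < 1, here is a minimal pure-Python sketch for p = 1/2. It is not code from the repository. The helper names `sqrt_grad` and `clamped_sqrt_grad` are made up for this example, and the clamp is modeled as max(x, eps), whose gradient is taken to be zero below the floor.

```python
EPS = 1e-6  # same floor as clamp(min=1e-6) in the line under discussion


def sqrt_grad(x):
    """Analytic derivative of sqrt(x) = x**0.5, i.e. 0.5 * x**(-0.5)."""
    return 0.5 * x ** -0.5


def clamped_sqrt_grad(x, eps=EPS):
    """Derivative of sqrt(max(x, eps)) with respect to x (hypothetical helper).

    Below the floor the output is constant, so the gradient is 0.
    At or above the floor the usual sqrt derivative applies.
    """
    return 0.0 if x < eps else 0.5 * x ** -0.5


# x = 0 is left out on purpose: 0.0 ** -0.5 raises ZeroDivisionError in Python.
for x in (1.0, 1e-6, 1e-12, 1e-30):
    print(f"x={x:g}  plain={sqrt_grad(x):.3g}  clamped={clamped_sqrt_grad(x):.3g}")
```

Under these assumptions the unclamped gradient keeps growing as x shrinks, while the clamped one never exceeds about 1/(2*sqrt(1e-6)) = 500. This is a stability safeguard, so it would not necessarily change the final accuracy of a run in which the sums never get that close to zero.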

lizhenstat commented 5 years ago

@ShichenLiu Oh, I got it. Thanks a lot!