forresti / SqueezeNet

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters
BSD 2-Clause "Simplified" License

Small changes for faster convergence #35

Closed NikolasMarkou closed 2 years ago

NikolasMarkou commented 7 years ago

Removed redundant ReLUs. Replaced the remaining ReLUs with PReLUs using a very small constant multiplier. I haven't tested on ImageNet, but on all my classification tests it converged faster, with slightly higher accuracy (usually > 0.5%).
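For reference, a PReLU layer in Caffe prototxt with a small constant-initialized slope might look like the sketch below. The layer name and the 0.01 initial slope are illustrative assumptions; the PR does not state the constant used.

```
layer {
  name: "prelu_conv1"   # hypothetical name, not from the PR
  type: "PReLU"
  bottom: "conv1"
  top: "conv1"
  prelu_param {
    # small constant initial slope for the negative part;
    # 0.01 is an illustrative value (Caffe's default is 0.25)
    filler {
      type: "constant"
      value: 0.01
    }
    channel_shared: false   # one learnable slope per channel
  }
}
```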

forresti commented 7 years ago

Interesting! Nice work.

In order to upstream this, I would like:

NikolasMarkou commented 7 years ago

That is ImageNet 2011, right?

forresti commented 7 years ago

2012


NikolasMarkou commented 7 years ago

Great, I'll get on it. In the meantime, it just occurred to me after moving the ReLUs/PReLUs that this can go a step further: since the pooling is MAX everywhere, there is no problem moving the ReLUs/PReLUs after the pools, saving FLOPs there too while remaining equivalent to the original SqueezeNet v1.1 (a short justification follows the snippets below).

this =>

```
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu_conv1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
```

transforming to this (note the moved ReLU must now read/write pool1; left in-place on conv1, it would run after pool1 was already computed and the pooled blob would never be rectified):

```
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu_conv1"
  type: "ReLU"
  # moved after the pool, so it operates on pool1 now
  bottom: "pool1"
  top: "pool1"
}
```
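Why the move is legal (a one-line justification not spelled out in the thread): ReLU, and PReLU with a nonnegative slope, are nondecreasing, and any nondecreasing activation commutes with a max:

```latex
% f nondecreasing (ReLU, or PReLU with slope a >= 0) commutes with max:
f\!\left(\max_i x_i\right) = \max_i f(x_i)
\quad\Longrightarrow\quad
f(\mathrm{maxpool}(x)) = \mathrm{maxpool}(f(x))
```

Applying the activation after a stride-2 pool touches roughly a quarter as many elements, which is where the FLOP saving comes from.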

TechnikEmpire commented 5 years ago

Has this been trained on ImageNet yet?