faustomilletari / VNet


Why not try with BatchNorm? #43

Closed · John1231983 closed this issue 7 years ago

John1231983 commented 7 years ago

BatchNorm-Scale-ReLU is a common setting now. I looked at your VNet prototxt and found that you did not use a BatchNorm layer (UNet uses it); instead, you used PReLU. Did you compare the performance of three settings: (1) BatchNorm-Scale-ReLU, (2) BatchNorm-Scale-PReLU, and (3) PReLU only? I want to try BatchNorm-Scale-PReLU, but I am not sure it would bring any benefit. Thanks.
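
For reference, a minimal sketch of what setting (2) might look like in Caffe prototxt, assuming the Caffe fork used here provides the standard `BatchNorm`, `Scale`, and `PReLU` layers (layer and blob names below are placeholders, not taken from the VNet prototxt):

```
# BatchNorm normalizes each channel using minibatch statistics during training.
layer {
  name: "conv1_bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param { use_global_stats: false }
}
# Scale restores a learnable per-channel scale and bias after normalization.
layer {
  name: "conv1_scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param { bias_term: true }
}
# PReLU applies rectification with a learnable negative slope.
layer {
  name: "conv1_prelu"
  type: "PReLU"
  bottom: "conv1"
  top: "conv1"
}
```

Setting (1) would replace the `PReLU` layer with a `ReLU` layer, and setting (3) keeps only the `PReLU` layer.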

faustomilletari commented 7 years ago

Batches are very small here. I was afraid of instability. You can surely try!

Regards,

Fausto Milletarì

John1231983 commented 7 years ago

You are right about the instability point. I am using a batch size of 4, and sometimes the loss spikes sharply; I think the reason is as you mention. I am currently using the BN-Scale-ReLU setting. In summary, if I had a larger batch size, such as 16, I could benefit from the BN-Scale-PReLU setting. Am I right?

faustomilletari commented 7 years ago

Yes. Otherwise, there are newer batch normalization papers that propose ways to deal with small batches. There is also a weight normalization paper by Tim Salimans from OpenAI that achieves a similar effect by normalizing the weights instead.
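
For reference, weight normalization (Salimans & Kingma, 2016) reparameterizes each weight vector as

$$\mathbf{w} = \frac{g}{\lVert \mathbf{v} \rVert}\,\mathbf{v},$$

learning the scalar norm $g$ and the direction $\mathbf{v}$ separately, so the normalization does not depend on minibatch statistics at all.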

Regards,

Fausto Milletarì

John1231983 commented 7 years ago

Could you give me the title of the BN paper for small batches? Is it https://arxiv.org/abs/1702.03275?

faustomilletari commented 7 years ago

Yes
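
For reference, that paper (Batch Renormalization, Ioffe 2017) keeps using minibatch statistics but corrects them toward the moving averages, which is what makes it usable with very small batches:

$$\hat{x} = \frac{x - \mu_B}{\sigma_B}\, r + d, \qquad r = \operatorname{clip}\!\left(\frac{\sigma_B}{\sigma}\right), \quad d = \operatorname{clip}\!\left(\frac{\mu_B - \mu}{\sigma}\right),$$

where $\mu_B, \sigma_B$ are the minibatch statistics, $\mu, \sigma$ are the moving averages, and $r, d$ are clipped to a trust region and treated as constants during backpropagation.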

Fausto Milletarì
