google / prettytensor

Pretty Tensor: Fluent Networks in TensorFlow

on templates and batch_normalize #32

Open thjashin opened 8 years ago

thjashin commented 8 years ago

Hi,

I'm really happy with prettytensor's templates and unbound variables, which set it apart from other wrappers built on TensorFlow, so I'm using it a lot in my project. There are a few questions I couldn't figure out by reading through the source, so I'm asking them here.

1) When using templates, are the defaults_scope parameters applied when the template is built or when construct() is called? Do I need to put the template.construct() call inside the defaults scope as well? (A sketch of the two placements I mean follows question 3 below.)

2) What's the recommended way to use batch_normalize in prettytensor? I know of several ways to do it.

3) The commonly recommended placement of batch normalization is between the linear transformation and the activation function, but from reading the code I believe prettytensor adds BN after the activation function. Is that right?
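To make question 1 concrete, here is a minimal sketch of the two placements I mean (layer sizes and parameter values are only illustrative):

```python
import tensorflow as tf
import prettytensor as pt

images = tf.placeholder(tf.float32, [None, 28, 28, 1])

# Placement A: defaults_scope wraps the template *definition*.
with pt.defaults_scope(activation_fn=tf.nn.relu, batch_normalize=True):
    template = (pt.template('images')
                .conv2d(kernel=5, depth=32)
                .flatten()
                .fully_connected(100))

# Placement B: defaults_scope wraps the construct() call instead.
with pt.defaults_scope(activation_fn=tf.nn.relu, batch_normalize=True):
    output = template.construct(images=images)
```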

eiderman commented 8 years ago

Hi, thanks for the question.

The typical way of applying batch normalization is through the defaults_scope, but sometimes you need more control, so the other ways are provided as well. In particular, you may decide to apply it only to certain layers for efficiency.

The default parameters are supplied at the time of template construction, but any of the defaults can be UnboundVariables.
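As a rough sketch of what that means in practice (the exact parameter names here are from memory and may need adjusting):

```python
import tensorflow as tf
import prettytensor as pt

# Defaults from this scope are captured when the template is *built*,
# even though construct() is called later, outside the scope.
with pt.defaults_scope(activation_fn=tf.nn.relu, batch_normalize=True):
    template = (pt.template('images')
                .conv2d(kernel=5, depth=32)
                .flatten()
                # An UnboundVariable defers this value until construct().
                .fully_connected(pt.UnboundVariable('width')))

images = tf.placeholder(tf.float32, [None, 28, 28, 1])

# The deferred value is supplied here, together with the template's input.
logits = template.construct(images=images, width=128)
```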

I found the problem in fully_connected and you are completely correct that it is placed in the wrong place. I will create a quick patch for it. conv2d doesn't have this bug.

thjashin commented 8 years ago

Thanks for the reply. And what about question 3)? Should I use .conv2d(..., activation_fn=None).batch_normalize().apply(tf.nn.relu) instead of .conv2d(...).batch_normalize()?

eiderman commented 8 years ago

What you outlined will work, but the answer to 3 was:

I found the problem in fully_connected and you are completely correct that it is placed in the wrong place. I will create a quick patch for it. conv2d doesn't have this bug.
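So the chain you outlined is the right shape. Spelled out (the sizes here are only illustrative), batch normalization goes between the linear transform and the nonlinearity:

```python
import tensorflow as tf
import prettytensor as pt

images = tf.placeholder(tf.float32, [None, 28, 28, 1])

net = (pt.wrap(images)
       .conv2d(kernel=5, depth=32, activation_fn=None)  # linear transform only
       .batch_normalize()                               # normalize the pre-activations
       .apply(tf.nn.relu)                               # nonlinearity applied last
       .flatten()
       .fully_connected(10, activation_fn=None))
```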