google / prettytensor

Pretty Tensor: Fluent Networks in TensorFlow

Bug with template, phase=Unbound and batch_normalize=True #50

Open RuohanW opened 7 years ago

RuohanW commented 7 years ago

Hi,

there appears to be a bug in the combination of template, batch_normalize=True, and phase=Unbound. I believe this is a valid use case for constructing graphs. I understand a similar issue has been raised before, but I'm fairly confident this one is a genuine bug.

Code to reproduce the bug:

```python
import prettytensor as pt
import tensorflow as tf

with pt.defaults_scope(activation_fn=tf.nn.relu,
                       phase=pt.UnboundVariable('phase'),
                       batch_normalize=True):
    out = pt.template('input').conv2d(4, 32, stride=2, name='conv1', bias=None)

test = out.construct(phase=pt.Phase.train,
                     input=tf.placeholder(tf.float32, [20, 16, 16, 1]))
```

After stepping through the execution, I believe the bug originates from the following interaction. As the conv layer is being constructed, conv2d calls:

```python
y = pretty_tensor_normalization_methods.batch_normalize_with_arguments(
    y, batch_normalize)
```

Because there is still a default unbound variable (the phase) in the chain dict at this point, the layer constructed by the batch normalization method is a deferred layer rather than a Layer object. This causes any subsequent activation_fn to fail.
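To illustrate the failure mode in isolation, here is a toy sketch with made-up class names (`Layer`, `DeferredLayer`, `apply_activation` are illustrative stand-ins, not PrettyTensor's actual internals): a deferred layer carries no concrete tensor yet, so code written against a fully constructed layer breaks when handed one.

```python
class Layer(object):
    """Toy stand-in for a fully constructed layer wrapping a tensor."""
    def __init__(self, tensor):
        self.tensor = tensor

class DeferredLayer(object):
    """Toy stand-in for a deferred layer: construction is postponed
    because an argument (here, the phase) is still unbound, so there
    is no .tensor attribute yet."""
    def __init__(self, unbound_args):
        self.unbound_args = unbound_args

def apply_activation(layer, activation_fn):
    # conv2d-style code assumes a concrete layer; this raises
    # AttributeError when handed a DeferredLayer instead.
    return Layer(activation_fn(layer.tensor))
```

With a `Layer` this works as expected; with a `DeferredLayer` it fails with an AttributeError, which mirrors what happens when activation_fn is applied after the deferred batch-normalization layer.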

I have implemented a local fix that adds the conv layer's phase information to the batch_normalize argument. Essentially, if batch_normalize is True, or is a BatchNormalizationArguments instance without phase info, the fix passes in the phase from the conv layer's own arguments.
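A minimal, library-independent sketch of that logic (`BatchNormArgs` and `resolve_batch_normalize` are hypothetical stand-ins; the field names and the 0.0003 default are illustrative, not PrettyTensor's exact internals):

```python
from collections import namedtuple

# Hypothetical stand-in for PrettyTensor's BatchNormalizationArguments.
BatchNormArgs = namedtuple('BatchNormArgs',
                           ['learned_moments_update_rate', 'phase'])

def resolve_batch_normalize(batch_normalize, layer_phase):
    """Fill in a missing phase from the enclosing layer's arguments.

    If batch_normalize is True, build an arguments object using the
    layer's phase; if it is an arguments object whose phase is unset,
    copy the layer's phase in; otherwise leave it untouched.
    """
    if batch_normalize is True:
        return BatchNormArgs(learned_moments_update_rate=0.0003,
                             phase=layer_phase)
    if isinstance(batch_normalize, BatchNormArgs) and batch_normalize.phase is None:
        return batch_normalize._replace(phase=layer_phase)
    return batch_normalize
```

The idea is simply that by the time conv2d runs inside construct(), its own phase argument is already bound, so it can be forwarded to the batch-normalization call instead of leaving that call deferred.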

Could anyone verify the issue and my proposed fix? I can create a pull request once the issue is confirmed.

Thanks a lot