AlexanderMath / itf

Library for Normalizing Flow in TensorFlow 2.0.

AffineCouplingLayer breaks under some shapes #1

Open anshuln opened 5 years ago

anshuln commented 5 years ago

In the `call` function of the affine coupling layer:

    def call(self, X):
        in_shape = tf.shape(X)
        n, h, w, c = X.shape

        for layer in self.layers:
            X = layer.call(X)  # residual

        X = tf.reshape(X, (-1, h, w, c*2))
        s = X[:, :, w//2:, :]
        t = X[:, :, :w//2, :]

        s = tf.reshape(s, in_shape)
        t = tf.reshape(t, in_shape)

        return s, t

Shouldn't s and t be `X[:,:,:,c:]` and `X[:,:,:,:c]` respectively? My code:

    data = np.random.normal(0,1,(1,6,6,4)).astype('f')
    a = FlowSequential()      # This is similar to a sequential model
    b = AffineCoupling(part=0)
    b.add(Conv2D(64, kernel_size=(3,3), activation="relu"))
    b.add(Flatten())
    b.add(Dense(50,activation='relu'))
    b.add(Dense(6*6*4,activation='relu'))
    a.add(Squeeze())
    a.add(b)
    a.call(data)

This gives the following error in call_, at `s = tf.reshape(s, in_shape)`:

    tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 96 values, but the requested shape has 72
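
The numbers in this message can be reproduced without the library. The sketch below uses NumPy only and rests on two assumptions inferred from the 72 in the error, not confirmed from the library source: that `Squeeze` turns the (1, 6, 6, 4) input into (1, 3, 3, 16), and that the coupling layer passes half of those channels, shape (1, 3, 3, 8), into `call`.

```python
import numpy as np

# Assumed shape of the tensor entering call(): (n, h, w, c) = (1, 3, 3, 8).
n, h, w, c = 1, 3, 3, 8
in_size = n * h * w * c            # 72 elements requested by the reshape

# The final Dense layer emits 6*6*4 = 144 values, reshaped to (-1, h, w, 2*c).
X = np.zeros(6 * 6 * 4, dtype=np.float32).reshape(-1, h, w, 2 * c)

# Width-wise split as in the quoted call(): w // 2 == 1 when w == 3.
s = X[:, :, w // 2:, :]            # keeps 2 of the 3 columns
t = X[:, :, :w // 2, :]            # keeps 1 of the 3 columns

print(s.size, t.size, in_size)     # 96 48 72
```

Under these assumptions `s` holds 96 values and `t` holds 48, so neither can be reshaped to the 72-element input shape.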

AlexanderMath commented 5 years ago

> Shouldn't s and t be X[:,:,:,c:] and X[:,:,:,:c] respectively?

Good question. In short: not necessarily, but I think it would be preferable. I'll update it after I finish cleaning up the variational dequantization code later today. I might add a strategy class that allows different ways of choosing s and t; see the full answer below for the reasons.

The full answer. Consider the table from Glow describing their affine coupling layer:

[image: table from Glow describing the affine coupling layer]

In Glow the split is done channel-wise. The AffineCoupling layer takes an argument `strategy`, which should be a class that implements the splitting functionality. By default AffineCoupling uses the strategy from Glow. This means that x_a and x_b both have size h x w x c//2, assuming x has size h x w x c.

The outputs s and t of the neural network then also need to have the same size h x w x c//2, so that the shapes match in $y_a = s \odot x_a + t$. The straightforward way of doing this would be to select s and t as you suggest. However, it is only required that they each have the right number of elements before we reshape them to the right shape.
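
To make the shape requirement concrete, here is a minimal NumPy sketch of the coupling step with a channel-wise split. Everything in it is illustrative: `fake_nn`, `split_st`, `coupling_forward` and `coupling_inverse` are hypothetical names, not part of the library, and taking `exp` of the raw output is an assumption made here only to keep s nonzero so the inverse exists.

```python
import numpy as np

def fake_nn(x_b):
    """Stand-in for the conv net: maps (n, h, w, c_half) to (n, h, w, 2 * c_half)."""
    return np.concatenate([np.tanh(x_b), 0.5 * x_b], axis=-1)

def split_st(out, c_half):
    """Channel-wise choice of s and t, as proposed in the question."""
    t = out[..., :c_half]
    s = np.exp(out[..., c_half:])  # exp keeps s nonzero so the layer is invertible
    return s, t

def coupling_forward(x):
    c_half = x.shape[-1] // 2      # assumes an even number of channels
    x_a, x_b = x[..., :c_half], x[..., c_half:]
    s, t = split_st(fake_nn(x_b), c_half)
    y_a = s * x_a + t              # elementwise: y_a = s * x_a + t
    return np.concatenate([y_a, x_b], axis=-1)

def coupling_inverse(y):
    c_half = y.shape[-1] // 2
    y_a, y_b = y[..., :c_half], y[..., c_half:]
    s, t = split_st(fake_nn(y_b), c_half)
    x_a = (y_a - t) / s
    return np.concatenate([x_a, y_b], axis=-1)

# Odd spatial width (3) is fine because the split is along the channel axis.
x = np.random.default_rng(0).normal(size=(1, 3, 3, 16))
y = coupling_forward(x)
print(y.shape, np.allclose(coupling_inverse(y), x))
```

Because s and t are cut along the channel axis, both have shape h x w x c//2 regardless of whether h or w is odd.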

I think I was experimenting with a few different ways of doing it, which yielded the code you saw. Currently, I'm actually thinking of delegating this responsibility to another strategy class. The main argument for this is that it is not clear to me which way of splitting the output of the NN into s and t is most desirable. In particular, I cannot rule out that the choice affects performance, so I want to run experiments with different techniques when the code is done.
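
One way such a strategy class could look is sketched below. Both class names are hypothetical and do not exist in the library; the sketch only illustrates that different ways of cutting the network output can each yield an s and a t with the required number of elements.

```python
import numpy as np

class HalfChannelSplit:
    """Hypothetical strategy: first half of the channels is t, second half is s."""
    def __call__(self, out):
        c_half = out.shape[-1] // 2
        return out[..., c_half:], out[..., :c_half]   # (s, t)

class ReshapeSplit:
    """Hypothetical strategy: flatten per example, cut in two, reshape each half."""
    def __call__(self, out):
        n, c_half = out.shape[0], out.shape[-1] // 2
        target = (n,) + out.shape[1:-1] + (c_half,)
        flat = out.reshape(n, -1)
        half = flat.shape[1] // 2
        return flat[:, half:].reshape(target), flat[:, :half].reshape(target)

# Network output for an odd-width grid: (n, h, w, 2 * c_half) = (1, 3, 3, 16).
out = np.arange(1 * 3 * 3 * 16, dtype=np.float32).reshape(1, 3, 3, 16)
for strategy in (HalfChannelSplit(), ReshapeSplit()):
    s, t = strategy(out)
    print(type(strategy).__name__, s.shape, t.shape)
```

Both strategies return tensors of shape (1, 3, 3, 8), but they assign different network outputs to s and t, which is the kind of difference the planned experiments would compare.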

Surprisingly, I could not reproduce the bug; the code below works for me. Does it work for you?

from invtf import Generator
from invtf.layers import AffineCoupling, Squeeze
from tensorflow.keras.layers import Conv2D, Flatten, Dense

a = Generator()

b = AffineCoupling(part=0)
b.add(Conv2D(64, kernel_size=(3,3), activation="relu"))
b.add(Flatten())
b.add(Dense(50,activation='relu'))
b.add(Dense(6*6*4,activation='relu'))

a.add(Squeeze())

a.add(b)

anshuln commented 5 years ago

The code still doesn't work. Constructing the network succeeds, but once I add `a.call(data)` the call fails with the same error as above. The main issue is that when w is odd, `b*h*(w//2)*(2*c)` is not equal to `b*h*w*c`.
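
The arithmetic can be checked in plain Python. This is only a sketch with hypothetical helper names; it compares the element counts of s and t for a width-wise cut at `w // 2` against a channel-wise cut, using the shape (1, 3, w, 8) as an example.

```python
def width_split_sizes(n, h, w, c):
    """Element counts of s and t when a (n, h, w, 2*c) tensor is cut at w // 2."""
    s_size = n * h * (w - w // 2) * (2 * c)   # s = X[:, :, w//2:, :]
    t_size = n * h * (w // 2) * (2 * c)       # t = X[:, :, :w//2, :]
    return s_size, t_size

def channel_split_sizes(n, h, w, c):
    """Element counts of s and t when the same tensor is cut at channel c."""
    return n * h * w * c, n * h * w * c

for w in (2, 3, 4, 5):
    target = 1 * 3 * w * 8                    # elements in the (1, 3, w, 8) input
    print(w, target, width_split_sizes(1, 3, w, 8), channel_split_sizes(1, 3, w, 8))
```

For even w both cuts match the target, while for odd w only the channel-wise cut does.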

AlexanderMath commented 5 years ago

I changed the code to reflect the strategy of Glow, which takes odd/even channels instead of splitting halfway. I'll refactor later to allow both strategies.

# approach in glow https://github.com/openai/glow/blob/eaff2177693a5d84a1cf8ae19e8e0441715b82f8/model.py#L376
h = f("f1", z1, hps.width, n_z) # this is the output of the conv net
shift = h[:, :, :, 0::2] # this is our t
scale = tf.nn.sigmoid(h[:, :, :, 1::2] + 2.) # this is our s
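
The same slicing can be tried in NumPy to see the resulting shapes. This is a sketch, not the library's code: the all-zero array is a stand-in for the conv-net output, and the `sigmoid` helper is written out by hand.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-in for the conv-net output h with 2 * 8 channels on an odd-width grid.
h = np.zeros((1, 3, 3, 16), dtype=np.float32)

shift = h[:, :, :, 0::2]                 # even channels -> t
scale = sigmoid(h[:, :, :, 1::2] + 2.0)  # odd channels -> s, squashed into (0, 1)

print(shift.shape, scale.shape)          # (1, 3, 3, 8) (1, 3, 3, 8)
print(float(scale[0, 0, 0, 0]))          # about 0.88 for a zero network output
```

Both halves have the input's spatial size and half the channels, so no reshape is needed and odd widths pose no problem.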

The code below works on my pc with tf2.0.0beta (gpu variant) and python3.7.3.

from invtf import Generator
from invtf.layers import AffineCoupling, Squeeze
from tensorflow.keras.layers import Conv2D, Flatten, Dense
import numpy as np
import tensorflow.keras as keras

data = np.random.normal(0,1,(1,6,6,4)).astype(np.float32)

a = Generator()

a.add(keras.layers.InputLayer(data.shape[1:]))

b = AffineCoupling(part=0)
b.add(Conv2D(64, kernel_size=(3,3), activation="relu"))
b.add(Flatten())
b.add(Dense(50,activation='relu'))
b.add(Dense(6*6*4,activation='relu'))

a.add(Squeeze())

a.add(b)

a.predict(data)