tadax / srgan

SRGAN implementation with TensorFlow

How to get the VGG54 content loss described in the paper #4

Open ruiann opened 7 years ago

ruiann commented 7 years ago

I still don't really understand what the phi_i_j described in the paper is, so I searched your code and took a look at your loss function & VGG definition. I don't really understand what you mean by the phi defined in the VGG build_model function, or rather, I don't understand what the paper means by:

indicate the feature map obtained by the j-th convolution (after activation) before the i-th maxpooling layer within the VGG19 network

even though I know they want to compute the content loss from feature maps.

So could you please help me? Thanks anyway for implementing the paper's network.

ruiann commented 7 years ago

Or to put it another way: since VGG19 contains 16 convolutional layers & 5 max-pooling layers, does the paper want to use the convolved activations to compute the content loss? So you need to feed the input image through VGG19 and extract the intermediate features to compute it?

ruiann commented 7 years ago

Also, I can see that your VGG19 doesn't exactly follow the conv net configuration given in

VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION

Could you please tell me the reason?

tadax commented 7 years ago

You are right. SRGAN uses VGG19 to compute the content loss. It feeds the real and fake images into the VGG19 network and compares the feature maps obtained within it.
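As a hedged illustration (not this repo's code), a minimal sketch of that content loss using the pretrained Keras VGG19, comparing the feature maps of the high-resolution and generated images with an MSE:

    import tensorflow as tf

    # phi_5_4 is the 4th conv (after ReLU) in the 5th block, which the
    # Keras VGG19 names 'block5_conv4'
    vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
    phi = tf.keras.Model(vgg.input, vgg.get_layer('block5_conv4').output)
    phi.trainable = False  # the VGG is a fixed feature extractor

    def content_loss(hr, sr):
        # hr/sr assumed to be float batches in [0, 255]; VGG19 wants its
        # own preprocessing (BGR channel order, mean subtraction)
        hr = tf.keras.applications.vgg19.preprocess_input(hr)
        sr = tf.keras.applications.vgg19.preprocess_input(sr)
        return tf.reduce_mean(tf.square(phi(hr) - phi(sr)))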

Also, I can see that your VGG19 doesn't exactly follow the conv net configuration given in VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION. Could you please tell me the reason?

I'm sorry, it's my mistake. I built the conv network roughly; you should build the VGG19 exactly. But I suppose it doesn't matter much, since the conv network is only used to obtain the feature maps.

ruiann commented 7 years ago

It looks like you take all the feature maps generated by every layer of your 6-layer VGG19, but since the paper says to obtain "the feature map obtained by the j-th convolution (after activation) before the i-th maxpooling layer within the VGG19 network", I suppose you may have made a mistake in generating phi, or it's my mistake in misunderstanding what phi means. In fact, I don't really understand the meaning of i and j in the paper's definition of phi, but as the paper says, you can choose different i and j, for example phi_2_2 & phi_5_4, so I still wonder what phi_i_j means.

tadax commented 7 years ago

As you said, I take all the feature maps generated by every layer (i.e. phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4 within the VGG19 network). I also tried training with only one feature map, phi_5_4, but it did not go well.
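For reference, a sketch of how those five phi_i_j map onto layer names, assuming the pretrained Keras VGG19 rather than the hand-built one: phi_i_j is the j-th conv (after activation) in block i, i.e. 'block{i}_conv{j}'.

    import tensorflow as tf

    vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
    phi_names = ['block1_conv2', 'block2_conv2',  # phi_1_2, phi_2_2
                 'block3_conv4', 'block4_conv4',  # phi_3_4, phi_4_4
                 'block5_conv4']                  # phi_5_4
    extractor = tf.keras.Model(
        vgg.input, [vgg.get_layer(n).output for n in phi_names])

    images = tf.random.uniform([1, 96, 96, 3]) * 255.0  # dummy batch
    feats = extractor(tf.keras.applications.vgg19.preprocess_input(images))
    # feats is a list of five feature-map tensors, one per phi_i_j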

ruiann commented 7 years ago

Yeah, I finally understand what phi_i_j means. Why doesn't it work? Also, you could use a pre-trained VGG19 model instead, like this; I think it may work, and then there is no need to re-train the VGG19.

tadax commented 7 years ago

I tried to train the VGG19 and SRGAN again, but it doesn't work well. The output is blurry (like simple scaling) and the edges have no color.

[image: result with 1 feature map (phi_5_4)]

[image: result with 5 feature maps (phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4)]

ruiann commented 7 years ago

I guess for small pictures you'd better use phi_2_2, since phi_5_4, which describes global features, is a very small tensor for low-resolution pictures.

just a guess

ruiann commented 7 years ago

It looks like something is wrong with your loss function. Why do you use an L2 loss? The content loss & generator loss seem to differ a lot from the paper.

tadax commented 7 years ago

This implementation adopts the least squares loss function instead of the sigmoid cross-entropy loss function for the discriminator. cf. Least Squares Generative Adversarial Networks

The results don't seem bad, but I haven't evaluated them yet.
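For context, a minimal sketch of the least-squares losses as the LSGAN paper defines them (real label 1, fake label 0), assuming real_output/fake_output are the discriminator's raw, un-squashed outputs:

    import tensorflow as tf

    def lsgan_losses(real_output, fake_output):
        # discriminator: push real outputs toward 1 and fake outputs toward 0
        d_loss_real = 0.5 * tf.reduce_mean(tf.square(real_output - 1.0))
        d_loss_fake = 0.5 * tf.reduce_mean(tf.square(fake_output))
        # generator: push fake outputs toward 1
        g_loss = 0.5 * tf.reduce_mean(tf.square(fake_output - 1.0))
        return g_loss, d_loss_real + d_loss_fake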

ruiann commented 7 years ago

I haven't read that paper, but I think something may be wrong with your d_loss.

d_loss should try to make true_output near 1 & fake_output near 0, so I cannot understand what you mean by

d_loss_fake = tf.reduce_mean(tf.nn.l2_loss(fake_output + tf.ones_like(fake_output)))

I think for least squares, it would be

d_loss_fake = tf.reduce_mean(tf.nn.l2_loss(tf.zeros_like(fake_output) - fake_output))

I use

    alpha = 1e-3
    # reduce the per-example losses to scalars
    g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_output, labels=tf.ones_like(fake_output)))
    d_loss_true = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=true_output, labels=tf.ones_like(true_output)))
    d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_output, labels=tf.zeros_like(fake_output)))
    d_loss = (d_loss_true + d_loss_fake) / 2

Since the output tensor of shape [batch_size] hasn't gone through a sigmoid, I think it should work. But maybe I'm wrong, as I haven't read that paper.

tadax commented 7 years ago

Thank you. I've made a mistake. I'll deal with it within a few days.

ruiann commented 7 years ago

@tadax can you tell me your content loss range? I tried to use the pretrained VGG19 linked in the posts above; the content loss can go up to 1e6 and completely dominates the gradient updates.

tadax commented 7 years ago

The content loss should be less than 1e4 in the beginning.

It may be a good idea to use batch normalization in the VGG19 network.
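For what it's worth, the SRGAN paper mentions rescaling the VGG feature maps by a factor of 1/12.75 so the VGG losses are comparable in scale to an MSE loss. A sketch of that rescaling (my reading of the paper, assuming phi is a fixed VGG19 feature extractor and hr/sr are preprocessed image batches):

    import tensorflow as tf

    VGG_RESCALE = 1.0 / 12.75  # rescaling factor from the SRGAN paper

    def rescaled_content_loss(phi, hr, sr):
        return tf.reduce_mean(tf.square(VGG_RESCALE * (phi(hr) - phi(sr))))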

ruiann commented 7 years ago

Thanks. Can you tell me what hardware your training machine uses & how long it takes to converge? My training seems to be struggling; I've run 5 epochs but cannot get a good result.

DunguLock commented 7 years ago

My result is not good either; there are some bad patches in the output, just like the images the author posted above.

tadax commented 7 years ago

@ruiann I fixed the bugs and trained it again. I ran it on a GeForce GTX 1070 with CUDA 8/TensorFlow 1.1 (Ubuntu 16.04). Running 20 epochs takes about 100 minutes.

joydeepdas commented 7 years ago

Can you please tell me how you decided on the alpha multiplier in the loss functions? Also, the losses do not seem to decrease over the training iterations. Can you give an estimate of how many epochs to run to get a good, low generator loss?

tadax commented 7 years ago

Unfortunately, it is a rule of thumb, and I haven't settled on how to do learning rate decay for Adam.
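One possible schedule (just a suggestion, not what this repo does) is TF1-style exponential decay driven by a global step:

    import tensorflow as tf

    w = tf.Variable(1.0)
    g_loss = tf.square(w)  # stand-in for the generator loss

    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(
        1e-4, global_step, decay_steps=10000,  # numbers are arbitrary
        decay_rate=0.5, staircase=True)
    train_op = tf.train.AdamOptimizer(learning_rate).minimize(
        g_loss, global_step=global_step)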

guker commented 5 years ago

I find that you have not fixed the mistake that @ruiann mentioned.

guker commented 5 years ago

I tried to train the VGG19 and SRGAN again, but it doesn't work well. The output is blurry (like simple scaling) and the edges have no color.

[image: result with 1 feature map (phi_5_4)]

[image: result with 5 feature maps (phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4)]

I think this code has some issues with color:

    def save_img(imgs, label, epoch):
        for i in range(batch_size):
            fig = plt.figure()
            for j, img in enumerate(imgs):
                im = np.uint8((img[i] + 1) * 127.5)  # pixel values may exceed 255 here and wrap around
                im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)

Using a tanh activation on the generator output, together with preprocessing that normalizes inputs to [-1, 1], solves the color problem:

    with tf.variable_scope('deconv5'):
        x = deconv_layer(x, [3, 3, 3, 16], [self.batch_size, 96, 96, 3], 1)
        x = tf.nn.tanh(x)
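The matching pre/post-processing would then look something like this (my sketch, not the repo's code): normalize inputs to [-1, 1] going in, and clip before the uint8 cast coming out so out-of-range values cannot wrap around:

    import numpy as np

    def preprocess(img_uint8):
        # map [0, 255] -> [-1, 1], matching the generator's tanh range
        return img_uint8.astype(np.float32) / 127.5 - 1.0

    def postprocess(img_float):
        # map [-1, 1] -> [0, 255]; clip first so uint8 casting cannot wrap
        return np.uint8(np.clip((img_float + 1.0) * 127.5, 0, 255))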

Also, I found a mistake and modified it as follows:

    def inference_adversarial_loss_with_sigmoid(real_output, fake_output):
        alpha = 1e-3
        g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.ones_like(fake_output), logits=fake_output))
        d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.ones_like(real_output), logits=real_output))
        d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.zeros_like(fake_output), logits=fake_output))
        d_loss = d_loss_real + d_loss_fake
        return (g_loss * alpha, d_loss * alpha)

wqz960 commented 4 years ago

Hi @tadax @guker, after downloading the pretrained VGG19 model, where should I put it, and how do I load it to restore the VGG19? Thank you!!!