ruiann opened this issue 7 years ago
Or should I say: since VGG19 contains 16 convolutional layers and 5 max-pooling layers, does the paper want to use the convolutional feature maps to compute the content loss? So you need to feed the input image through VGG19 and take the intermediate features for the comparison?
I can also see that your VGG19 doesn't exactly follow the conv net configurations given in VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION. Could you please tell me the reason?
You are right. SRGAN uses VGG19 to compute the content loss. It feeds the real and fake images into the VGG19 network and compares the feature maps obtained inside it.
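In code, the idea is roughly the following sketch (not this repository's exact implementation; vgg_features is a hypothetical helper that returns the chosen intermediate VGG19 activations for a batch of images):

import tensorflow as tf

def vgg_content_loss(real_images, fake_images, vgg_features):
    # vgg_features(images) is assumed to return a list of intermediate VGG19
    # feature maps, e.g. [phi_1_2, phi_2_2, phi_3_4, phi_4_4, phi_5_4].
    # Both batches must be preprocessed the same way before the forward pass.
    real_maps = vgg_features(real_images)
    fake_maps = vgg_features(fake_images)
    loss = 0.0
    for real, fake in zip(real_maps, fake_maps):
        # MSE between corresponding feature maps, summed over the chosen layers
        loss += tf.reduce_mean(tf.square(real - fake))
    return loss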
And I can also see your VGG19 doesn't exactly follow the conv net configurations given in VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION. Could you please tell me the reason?
I'm sorry, but that's my mistake. I built the conv network only roughly; you should build VGG19 exactly. But I suppose that doesn't matter much, since the conv network is used only for obtaining the feature maps.
It looks like you take all the feature maps generated by every layer of your 6-layer VGG19. But the paper says to take the feature map obtained by the j-th convolution (after activation) before the i-th max-pooling layer within the VGG19 network, so I suppose you may have made a mistake in the generation of phi, or it may be my mistake in misunderstanding the meaning of phi. In fact I don't really understand the meaning of i and j in the definition of phi in that paper, but the paper says you can choose different i and j, for example phi_2_2 and phi_5_4, so I still wonder what phi_i_j means.
As you said, I take all the feature maps generated by every block (i.e. phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4 within the VGG19 network). I also tried training with only one feature map, phi_5_4, but it does not go well.
Yeah, I finally understand what phi_i_j means. Why doesn't it work? Also, you could use a pre-trained VGG19 model instead, like this one; I think it may work, and then there would be no need to re-train the VGG19.
I tried to train the VGG19 and SRGAN again, but it doesn't work. The output is blurry (like plain upscaling) and the edges have no color.
with 1 feature map (phi_5_4)
with 5 feature maps (phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4)
I guess for small pictures you'd better use phi_2_2, since phi_5_4, which describes a more global feature, is a very small tensor for low-resolution pictures.
Just a guess.
It looks like something is wrong in your loss function. Why do you use an L2 loss? The content loss and generator loss seem to differ a lot from the paper.
This implementation adopts the least-squares loss function instead of the sigmoid cross-entropy loss function for the discriminator. cf. Least Squares Generative Adversarial Networks.
The results don't seem bad, but I haven't evaluated them yet.
I haven't read that paper, but I think something may be wrong with your d_loss.
d_loss should push true_output towards 1 and fake_output towards 0, so I cannot understand what you mean by
d_loss_fake = tf.reduce_mean(tf.nn.l2_loss(fake_output + tf.ones_like(fake_output)))
I think for least squares it would be (see also the sketch below):
d_loss_fake = tf.reduce_mean(tf.nn.l2_loss(tf.zeros_like(fake_output) - fake_output))
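For reference, the full least-squares objective from the LSGAN paper would look roughly like this sketch (assuming true_output and fake_output are the raw, pre-sigmoid discriminator outputs):

# Least-squares GAN losses (a sketch, not the repository's code):
# the discriminator pushes real outputs towards 1 and fake outputs towards 0,
# while the generator pushes fake outputs towards 1.
d_loss_real = tf.reduce_mean(tf.square(true_output - tf.ones_like(true_output)))
d_loss_fake = tf.reduce_mean(tf.square(fake_output))
d_loss = (d_loss_real + d_loss_fake) / 2
g_loss = tf.reduce_mean(tf.square(fake_output - tf.ones_like(fake_output))) / 2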
I use:
alpha = 1e-3
# reduce each per-sample cross-entropy vector to a scalar loss
g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_output, labels=tf.ones_like(fake_output)))
d_loss_true = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=true_output, labels=tf.ones_like(true_output)))
d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_output, labels=tf.zeros_like(fake_output)))
d_loss = (d_loss_true + d_loss_fake) / 2
Since the output tensor of shape [batch_size] hasn't gone through a sigmoid, I think it should work. But maybe I'm wrong, as I haven't read that paper.
Thank you. I've made a mistake. I'll deal with it within a few days.
@tadax can you tell me your content loss range? I tried to use the pretrained VGG19 linked in the posts above, but the content loss can go up to 1e6 and completely dominates the gradient.
The content loss should be less than 1e4 in the beginning.
It may be a good idea to use batch normalization in the VGG19 network.
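A VGG-style conv layer with batch normalization could look roughly like the sketch below in TF 1.x (conv_bn_relu and its parameters are placeholders, not names from this repository):

def conv_bn_relu(x, out_channels, is_training, name):
    # 3x3 convolution followed by batch normalization and ReLU, as one
    # possible building block for a VGG19-like feature extractor.
    with tf.variable_scope(name):
        in_channels = x.get_shape().as_list()[-1]
        w = tf.get_variable('w', [3, 3, in_channels, out_channels],
                            initializer=tf.truncated_normal_initializer(stddev=0.02))
        x = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
        x = tf.layers.batch_normalization(x, training=is_training)
        return tf.nn.relu(x)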
Thanks. Can you tell me the specs of your training machine and how long it takes to converge? My training doesn't seem to be going well; I've run 5 epochs but cannot get a good result.
My result is not good either; there are some bad patches in the result, just like the images shown on the author's web page.
@ruiann I fixed the bugs and trained it again. I ran it on a GeForce GTX 1070 with CUDA 8 / TensorFlow 1.1 (Ubuntu 16.04). Running 20 epochs takes about 100 minutes.
Can you please tell me how you decided the alpha multiplier in the loss functions? Also, during the iteration steps the losses do not seem to decrease. Can you give an estimate of how many epochs to run to get a good, low generator loss?
Unfortunately, the alpha value is just a rule of thumb, and I haven't decided whether to use learning rate decay with Adam.
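If one did want to try learning rate decay with Adam, a minimal TF 1.x sketch might be the following (the initial rate and decay schedule are arbitrary placeholder values):

global_step = tf.Variable(0, trainable=False, name='global_step')
# Exponential decay: start at 1e-4 and multiply by 0.9 every 10000 steps.
learning_rate = tf.train.exponential_decay(1e-4, global_step,
                                           decay_steps=10000, decay_rate=0.9,
                                           staircase=True)
g_train_op = tf.train.AdamOptimizer(learning_rate).minimize(g_loss, global_step=global_step)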
I find that you have not fixed the mistake that @ruiann mentioned.
I tried to train the VGG19 and SRGAN again, but it doesn't work. The output is blurry (like plain upscaling) and the edges have no color.
with 1 feature map (phi_5_4)
with 5 feature maps (phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4)
I think this code has an issue with color:
def save_img(imgs, label, epoch):
    for i in range(batch_size):
        fig = plt.figure()
        for j, img in enumerate(imgs):
            im = np.uint8((img[i]+1)*127.5)  # image pixel value can perhaps go over 255 here
            im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
Using a tanh activation at the generator output, together with preprocessing that normalizes the input to [-1, 1], can solve the color problem:
with tf.variable_scope('deconv5'):
    x = deconv_layer(x, [3, 3, 3, 16], [self.batch_size, 96, 96, 3], 1)
    x = tf.nn.tanh(x)
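The matching pre/post-processing could be something like this sketch (preprocess and postprocess are illustrative names; img is assumed to be a uint8 array in [0, 255]):

import numpy as np

def preprocess(img):
    # Map uint8 pixels in [0, 255] to float32 in [-1, 1],
    # matching the tanh range of the generator output.
    return img.astype(np.float32) / 127.5 - 1.0

def postprocess(img):
    # Inverse mapping back to uint8, clipped to avoid overflow.
    return np.uint8(np.clip((img + 1.0) * 127.5, 0, 255))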
Also, I found a mistake and modified it as follows:
def inference_adversarial_loss_with_sigmoid(real_output, fake_output):
    alpha = 1e-3
    g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.ones_like(fake_output), logits=fake_output))
    d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.ones_like(real_output), logits=real_output))
    d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.zeros_like(fake_output), logits=fake_output))
    d_loss = d_loss_real + d_loss_fake
    return (g_loss * alpha, d_loss * alpha)
Hi @tadax @guker, after downloading the pretrained VGG19 model, where should I put it and how do I load it to restore the VGG19? Thank you!
As I still don't really understand how to choose phi_i_j as described in the paper, I searched your code and took a glimpse at your loss function and VGG definition. I don't really understand what you mean by the phi defined in the VGG build_model function, or rather I don't understand what the paper means by
indicate the feature map obtained by the j-th convolution (after activation) before the i-th maxpooling layer within the VGG19 network
even though I know they want to compute the content loss from the feature maps.
So could you please help me? Thanks anyway, as you have implemented the paper's network.
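For what it's worth, one way to read phi_i_j in the standard VGG19 layout (blocks of 2, 2, 4, 4, 4 conv layers, each block followed by a max-pooling) is sketched below; the conv<block>_<index> names follow the usual VGG naming convention, not necessarily the names used in this repository:

# phi_i_j = output of the j-th conv layer (after ReLU) inside the i-th block,
# i.e. the j-th convolution before the i-th max-pooling layer.
PHI = {
    (1, 2): 'conv1_2',  # 2nd conv before the 1st pooling
    (2, 2): 'conv2_2',  # 2nd conv before the 2nd pooling (VGG22 in the SRGAN paper)
    (3, 4): 'conv3_4',  # 4th conv before the 3rd pooling
    (4, 4): 'conv4_4',  # 4th conv before the 4th pooling
    (5, 4): 'conv5_4',  # 4th conv before the 5th pooling (VGG54 in the SRGAN paper)
}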