jiny2001 / deeply-recursive-cnn-tf

Test implementation of Deeply-Recursive Convolutional Network for Image Super-Resolution
Apache License 2.0

About Skip connection? #7

Closed XiaotianM closed 6 years ago

XiaotianM commented 7 years ago

Hi Jin, I am sorry to bother you again, but I have run into a question recently.

In your super_resolution.py file, lines 227-230, your code is:

for i in range(0, self.inference_depth + 1):
    self.Y1_conv[i] = util.conv2d_with_bias(self.H_conv[i], self.WD1_conv, self.cnn_stride, self.BD1_conv,
                                            add_relu=not self.residual, name="Y%d_1" % i)
    self.Y2_conv[i] = util.conv2d_with_bias(self.Y1_conv[i], self.WD2_conv, self.cnn_stride, self.BD2_conv,
                                            add_relu=not self.residual, name="Y%d_2" % i)

However, self.H_conv[0] is in the embedding network, and according to Figure 3(c) in the authors' paper, the skip connections should start from self.H_conv[1].

And in lines 263-271, your code is:

for i in range(0, self.inference_depth):
    if self.residual:
        self.Y2_conv[i] = self.Y2_conv[i] + self.x
    inference_sub = tf.subtract(self.y, self.Y2_conv[i], name="Loss1_%d_sub" % i)
    inference_square = tf.square(inference_sub, name="Loss1_%d_squ" % i)
    loss1_mse[i] = tf.reduce_mean(inference_square, name="Loss1_%d" % i)

loss1 = loss1_mse[0]
for i in range(1, self.inference_depth):
    if i == self.inference_depth:
        loss1 = tf.add(loss1, loss1_mse[i], name="loss1")
    else:
        loss1 = tf.add(loss1, loss1_mse[i], name="loss1_%d_add" % i)

This only calculates the loss from H[0] to H[self.inference_depth-1]. In fact, the loss should be calculated from H[1] to H[self.inference_depth].

I think it should be changed like this, for lines 220-230:

self.Y1_conv = self.inference_depth * [None]
self.Y2_conv = self.inference_depth * [None]
self.W = tf.Variable(
    np.full(fill_value=1.0 / self.inference_depth, shape=[self.inference_depth], dtype=np.float32),
    name="layer_weight")
W_sum = tf.reduce_sum(self.W)

for i in range(0, self.inference_depth):
    self.Y1_conv[i] = util.conv2d_with_bias(self.H_conv[i+1], self.WD1_conv, self.cnn_stride, self.BD1_conv,
                                            add_relu=True, name="Y%d_1" % i)
    self.Y2_conv[i] = util.conv2d_with_bias(self.Y1_conv[i], self.WD2_conv, self.cnn_stride, self.BD2_conv,
                                            add_relu=not self.residual, name="Y%d_2" % i)

And nothing needs to change in lines 263-271.
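
(As a side note, the per-recursion MSE terms in that block could also be summed in one step; a minimal sketch, assuming the loss1_mse list built above:)

# equivalent, more compact accumulation of the per-recursion losses
loss1 = tf.add_n(loss1_mse, name="loss1")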

I got these results (Set91 with x4 augmentation, with residual learning): Set5: 37.04, Set14: 32.57, Urban100: 29.58, BSD100: 31.41

And the convergence is faster; it took only about 2.5 hours to get these results on an NVIDIA 1080 GPU.

XiaotianM commented 7 years ago

Besides, I think self.Y1_conv needs a ReLU, and the ReLU on self.Y2_conv should depend on whether residual learning is used. A convolution without a ReLU is a linear operation; even if you stack several such convolutions, the result is still a single linear operation, so the extra layers may not make any difference.
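
A minimal 1-D NumPy sketch of that linearity point (a toy example, not code from this repo): stacking two convolutions with no nonlinearity in between collapses into a single convolution.

import numpy as np

x = np.random.randn(32)    # toy 1-D signal
k1 = np.random.randn(3)    # kernel of the first "conv layer"
k2 = np.random.randn(3)    # kernel of the second "conv layer"

# conv(conv(x, k1), k2) == conv(x, conv(k1, k2)): the two-layer stack is
# the same linear filter as one convolution with the combined kernel.
two_step = np.convolve(np.convolve(x, k1), k2)
one_step = np.convolve(x, np.convolve(k1, k2))
print(np.allclose(two_step, one_step))  # True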

What's more, your code style is really beautiful. I have learned a lot :).

jiny2001 commented 7 years ago

Hi, sorry for the late reply. I'm on a business trip and am very busy this week.

A quick reply:

However, self.H_conv[0] is in the embedding network, and according to Figure 3(c) in the authors' paper, the skip connections should start from self.H_conv[1].

Yeah, this point confused me the most. Please look at Figure 3(a): there is a skip connection from the input.

That is why I included H_conv[0] and H[0] in the output and the loss function.

Now I'm thinking the skip connection from the input may be concatenated to every H_conv[1..depth+1] output. How do you feel about it?
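
A rough TensorFlow sketch of that idea (H_conv, input_skip, and the loop bounds here are just placeholders, not the actual variables in this repo):

import tensorflow as tf

def concat_input_skip(H_conv, input_skip, depth):
    # Concatenate the skip features from the input onto every recursion
    # output along the channel axis, so each reconstruction step sees both.
    merged = []
    for i in range(1, depth + 1):
        merged.append(tf.concat([H_conv[i], input_skip], axis=-1))
    return merged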

This only calculates the loss from H[0] to H[self.inference_depth-1]. In fact, the loss should be calculated from H[1] to H[self.inference_depth].

Thank you, I think you may well be right. Let me double-check the graph output with TensorBoard later.

Anyway, I will check it next week.

Thank you for your feedback and kind words. This code is almost my first Python code and is very messy, but I'm really happy to have your feedback. Thx!

XiaotianM commented 7 years ago

Ehhh, I have read a paper from CVPR 2017 called 《Image Super-Resolution via Deep Recursive Residual Network》. In this paper the authors re-describe DRCN; in their opinion, DRCN looks like the following: [image: DRCN structure as described in the DRRN paper]

How do you feel about it?

jiny2001 commented 7 years ago

Hi,

Yep.

I believe the purple arrow (from the input) would be fed into every black input.

That is, as I already mentioned, the purple input is concatenated to every black input.

I want to implement that; however, I'm working at another company and have very little time to work on DL.

Wanted to try it this weekend but I couldn't. :(

Best,

Jin



XiaotianM commented 7 years ago

Oh, never mind. I have learned a lot by discussing with you. In fact, this implementation may be just the same as your implementation with the residual connection.

When you feed the input to each black input, it means:

w1(y+y1) + w2(y+y2) + ... + wn(y+yn) = (w1+w2+...+wn)y + w1y1 + w2y2 + ... + wnyn

Because you have normalized the weights, w1+w2+...+wn = 1, so the equation becomes:

y + w1y1 + w2y2 + ... + wnyn
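
A quick NumPy check of that identity with toy values (not from the repo):

import numpy as np

n = 5
w = np.random.rand(n); w /= w.sum()              # normalized ensemble weights
y = np.random.randn(8, 8)                        # the (upscaled) input
ys = [np.random.randn(8, 8) for _ in range(n)]   # per-recursion outputs

lhs = sum(w[i] * (y + ys[i]) for i in range(n))  # input fed into every branch
rhs = y + sum(w[i] * ys[i] for i in range(n))    # residual form
print(np.allclose(lhs, rhs))  # True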

jiny2001 commented 7 years ago

Uhm, no, I don't think so. Does the + operator in your equation mean adding the values like a residual? My answer is no. (Also, the dimensions are different, so you can't add the values directly?)

I think not adding the values but adding the features will be the correct way.

Well, I will try again this weekend since my current experiment will be done by the weekend. :)

XiaotianM commented 7 years ago

Uhmm, no. The y1..n in my equation are the outputs of the reconstruction for each skip connection. Thus, the dimension of y1..n is equal to y (they are all 1-channel).

I don't think it needs to concatenate the original input (y) and the intermediate features. Of course, it's just my opinion, and I look forward to your new implementation. :)

jiny2001 commented 6 years ago

Oh no, actually, you don't need to close this.

I just refactored my code and removed the Y0 node from the loss. Now I put the input data into the first reconstruction CNN and am running some experiments.

Let's see what will happen. :)

jiny2001 commented 6 years ago

Hi,

I tested 2 types of new models.

A: put the input signal into the first CNN of each reconstruction layer
B: put the input signal into the last CNN of each reconstruction layer

And the network graph looks like the one below (x is the input and Y1-Y9 are the reconstruction layers).

[screenshot: network graphs of models A and B]
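
A rough sketch of how variant B might be wired (purely illustrative; the function and variable names are placeholders, not the repo's actual code, and pixel-wise addition instead of concatenation would also fit the description):

import tensorflow as tf

def reconstruct_variant_B(H_conv, x, depth, first_conv, last_conv):
    # Variant B: the input x is also fed into the *last* reconstruction CNN
    # of each recursion; first_conv / last_conv stand in for the two shared
    # reconstruction convolutions.
    outputs = []
    for i in range(1, depth + 1):
        y1 = first_conv(H_conv[i])
        y2 = last_conv(tf.concat([y1, x], axis=-1))
        outputs.append(y2)
    return outputs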

The PSNR results of my experiments are:

Before update: Set5 36.914, Set14 32.577, BSD100 31.462
A: Set5 36.853, Set14 32.464, BSD100 31.323
B: Set5 37.054, Set14 32.572, BSD100 31.451

We should do at least 3 to 5 runs to evaluate the performance properly; however, B looks good enough, and I believe B has the same structure as the one you posted in this thread.

So I updated the GitHub repo. Later, I will upload the weights trained with the x4-augmented (Yang91+General100) dataset. Any comments are appreciated. Thx!

jiny2001 commented 6 years ago

Hi,

Thank you for your comments.

I trained a new model with a larger training dataset ((Yang91+General100), x4 augmented) and found I could get better results.

Set5: 37.24 -> 37.37
Set14: 32.77 -> 32.88
BSD100: 31.64 -> 31.73
Urban100: 29.78 -> 30.03

I updated the README and uploaded the new weights. Thank you so much, again!

Jin