taki0112 / Densenet-Tensorflow

Simple Tensorflow implementation of Densenet using Cifar10, MNIST
MIT License
506 stars 196 forks

A bug in transition layer #10

Open NatalieZou opened 6 years ago

NatalieZou commented 6 years ago

The original filter count in transition_layer is equal to growth_k, which is too small, so the results are poor and the network is hard to converge. I changed it as shown below, referring to another implementation, and the results are now normal and much better.

def transition_layer(self, x, scope):
    with tf.name_scope(scope):
        x = Batch_Normalization(x, training=self.training, scope=scope+'_batch1')
        x = Relu(x)
        shape = x.get_shape().as_list()
        in_channel = shape[3]

        # Original line:
        # x = conv_layer(x, filter=self.filters, kernel=[1,1], layer_name=scope+'_conv1')
        # Changed to:
        x = conv_layer(x, filter=in_channel*0.5, kernel=[1,1], layer_name=scope+'_conv1')
        x = Drop_out(x, rate=dropout_rate, training=self.training)
        x = Average_pooling(x, pool_size=[2,2], stride=2)
Abdelpakey commented 6 years ago

Hi NatalieZou, I didn't notice that. I will use it and see the results. Could we open a discussion about this code here?

Abdelpakey commented 6 years ago

NatalieZou could you please post your training loss curve here after fixing that bug?

xiongpeng78 commented 6 years ago

Thank you for the information. I will change the parameter and try again.

NatalieZou commented 6 years ago

@Abdelpakey OK, of course you can. These are my training accuracy and loss curves:

[image: densenet_acc]

[image: densenet_loss]

GyuminDev commented 6 years ago

@NatalieZou Thank you for your post. Can you tell me the K, L, dropout rate, and other values behind your loss curve?

MAGI003769 commented 6 years ago

@GyuminDev The original paper has a hyper-parameter called the compression factor. It is used to reduce the number of feature maps in the tensor that is fed from a dense block into a transition layer. In my opinion, you can try the K and L values mentioned in the paper. The problem in this issue is not really related to those two values. Good luck, I hope your model works well.
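As a rough illustration of the compression factor mentioned above: in the DenseNet paper, a transition layer that receives m feature maps outputs floor(theta * m) of them, with 0 < theta <= 1. The sketch below is only arithmetic. The function name compressed_channels is made up for this example and is not part of the repository.

```python
import math


def compressed_channels(in_channels, theta=0.5):
    """Feature maps a transition layer outputs: floor(theta * m)."""
    if not 0 < theta <= 1:
        raise ValueError("theta must be in (0, 1]")
    return int(math.floor(theta * in_channels))


print(compressed_channels(72))       # 36
print(compressed_channels(120))      # 60
print(compressed_channels(72, 1.0))  # 72, i.e. no compression
```

With theta = 0.5 this matches the in_channel*0.5 change proposed at the top of this issue, provided the result is converted to an integer before it is used as a filter count.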

yuffon commented 5 years ago

Yes, I ran into this problem too. Thank you.

yuffon commented 5 years ago

@NatalieZou How many dense blocks do you use? What about the depth and growth rate?

NatalieZou commented 5 years ago

@yuffon Everything is the same as in the original code. The only part I changed is "x = conv_layer(x, filter=in_channel*0.5, kernel=[1,1], layer_name=scope+'_conv1')"

yuffon commented 5 years ago

Then why can't I get the same result? I use TensorFlow 1.10, CUDA 9.0, and a 1080 Ti. What optimizer do you use?

NatalieZou commented 5 years ago

I used MomentumOptimizer; it's annotated in the code.

yuffon commented 5 years ago

Thank you very much. How about the learning-rate decay? I have been using Adam as the optimizer with an initial learning rate of 1e-4. This is a bad configuration.

NatalieZou commented 5 years ago

I used MomentumOptimizer, and init_learning_rate is 1e-1.

yuffon commented 5 years ago

Could you please send me a copy of your training script? I really don't know why.

yuffon commented 5 years ago

I see that your training acc can reach 90% in less than 10 epochs. The test acc also reaches 85% in 10 epochs. This is far beyond the results on my computer.

NatalieZou commented 5 years ago

Could you give me your email? The .py file can't be sent through GitHub.

yuffon commented 5 years ago

My email is yuffonzhang@163.com. Thanks a lot. The accuracy really sucks on my computer.

taki0112 commented 5 years ago

@NatalieZou As you said, I fixed the code. @yuffon Please check.

yuffon commented 5 years ago

Thank you. I have tried a deep DenseNet on my computer, and it reaches 93.78%. But when I use densenet40-12, the accuracy is not good. Have you tried a shallower DenseNet (such as densenet40-12)?

yuffon commented 5 years ago

@taki0112 Thank you very much. I checked the data processing code. The standardization is performed over the whole data set: np.mean(x_train[:, :, :, 0]) computes the mean of the first channel across the entire data set. I tried per-image standardization using the tf.data API instead, and the accuracy cannot reach 91%. Why?
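To make the two preprocessing variants in the comment above concrete, here is a minimal NumPy sketch. It uses random data shaped like a small image batch, and the function names are invented for this example. It only contrasts the two ways of computing the statistics. It does not reproduce the repository's preprocessing or explain the accuracy gap.

```python
import numpy as np

rng = np.random.default_rng(0)
# Fake batch in (N, H, W, C) layout with pixel values in [0, 255].
x_train = rng.uniform(0, 255, size=(8, 32, 32, 3)).astype(np.float32)


def standardize_dataset(x):
    """Per-channel mean/std computed over the whole data set."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    std = x.std(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / std


def standardize_per_image(x):
    """Mean/std computed separately for each image (over H, W, C)."""
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    std = x.std(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / std


a = standardize_dataset(x_train)
b = standardize_per_image(x_train)
print(a[..., 0].mean())         # close to 0: channel 0 over the whole set
print(b[0].mean(), b[0].std())  # close to 0 and 1 for a single image
```

In the first variant every image is shifted and scaled by the same constants. In the second, each image gets its own constants, so global brightness and contrast differences between images are removed.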

mbkotori commented 5 years ago

@NatalieZou @taki0112 When I use this code, the program reports the following error:

    Traceback (most recent call last):
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 184, in <module>
        logits = DenseNet(x=batch_images, nb_blocks=nb_block, filters=growth_k, training=training_flag).model
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 84, in __init__
        self.model = self.Dense_net(x)
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 146, in Dense_net
        x = self.transition_layer(x, scope='trans_'+str(i))
      File "C:/Users/library/Desktop/densenet/net from git/Densenet-Tensorflow-master/MNIST/Densenet_MNIST.py", line 113, in transition_layer
        x = conv_layer(x, filter=in_channel*0.5, kernel=[1,1], layer_name=scope+'_conv1')
    TypeError: unsupported operand type(s) for *: 'Dimension' and 'float'

I think this may be because in_channel is no longer an integer after *0.5. When I change the line to x = conv_layer(x, filter=in_channel*1, kernel=[1,1], layer_name=scope+'_conv1'), the network runs.

Then I used print(x) and print(in_channel) to check their sizes. I found that x and in_channel each have two values:

    Tensor("trans_0/Relu:0", shape=(?, 6, 6, 72), dtype=float32)
    72
    Tensor("trans_1/Relu:0", shape=(?, 3, 3, 120), dtype=float32)
    120

From the above it seems that *0.5 should also work, and shouldn't x and in_channel each have only one value? I want to know why this problem happens.

Demons-git commented 5 years ago

Yes, I have this problem too. I ran Densenet_Cifar10.py and the program reports the following:

    Traceback (most recent call last):
      File "Densenet_Cifar10.py", line 221, in <module>
        logits = DenseNet(x=input_x, nb_blocks=nb_block, filters=growth_k, training=training_flag).model
      File "Densenet_Cifar10.py", line 111, in __init__
        self.model = self.Dense_net(x)
      File "Densenet_Cifar10.py", line 180, in Dense_net
        x = self.transition_layer(x, scope='trans_1')
      File "Densenet_Cifar10.py", line 140, in transition_layer
        x = conv_layer(x, filter=in_channel*0.5, kernel=[1,1], layer_name=scope+'_conv1')
    TypeError: unsupported operand type(s) for *: 'Dimension' and 'float'

Did you solve the problem?

wwptrdo commented 5 years ago

@Demons-git Hello, I also encountered this problem. I changed this line of code so that it runs:

    in_channel = x.shape[-1]

changed to

    in_channel = x.get_shape().as_list()[-1]

mbkotori commented 5 years ago

@Demons-git I also changed the code at the same place to make it run, but with a different change:

    in_channel = x.shape[-1]

changed to

    in_channel = int(x.shape[-1])
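The TypeError reported in this thread comes from multiplying a TensorFlow 1.x Dimension object by a float. The sketch below does not import TensorFlow. FakeDimension is a hypothetical stand-in written only for this example, and it mimics just the one behaviour seen in the traceback (multiplication by 0.5 is rejected). It illustrates why converting to a plain int first avoids the error, and why the scaled value should be cast back to int before it is used as a filter count. The helper name transition_filters is also made up.

```python
class FakeDimension:
    """Hypothetical stand-in for a TF 1.x Dimension (not the real class)."""

    def __init__(self, value):
        self.value = int(value)

    def __int__(self):
        return self.value

    def __mul__(self, other):
        # Only integer factors are accepted here; for anything else we
        # return NotImplemented, so Python raises TypeError for `dim * 0.5`.
        if isinstance(other, int):
            return FakeDimension(self.value * other)
        return NotImplemented


def transition_filters(in_channel, theta=0.5):
    """Convert to int first, then scale and cast back to int."""
    return int(int(in_channel) * theta)


dim = FakeDimension(72)
try:
    dim * 0.5
except TypeError as err:
    print("TypeError:", err)

print(transition_filters(dim))  # 36
print(transition_filters(120))  # 60
```

In the real code, the equivalent would be to take in_channel as a plain integer (either of the two changes posted above) and pass an integer such as int(in_channel * 0.5) as the filter argument. Whether a float filter count is accepted depends on the conv_layer wrapper, so the explicit cast is the cautious choice.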

weilanShi commented 5 years ago

Hello, I also ran into a similar problem: the validation accuracy is not high. I am using the Adam optimizer. Could you send the training code to my mailbox? Wlshi111@aliyun.com Thank you!

weilanShi commented 5 years ago

@mbkotori I also used * 0.5, and the code didn't work. You can try filter = in_channel/2. The effect is the same, and the output of print(x) and print(in_channel) is the same.

weilanShi commented 5 years ago

@mbkotori As for why x and in_channel print two values: maybe it is because there are two transition layers, since nb_block = 2.
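One way to see why two different values are printed is to track the channel count through the network. The numbers below are assumptions chosen for illustration, not values stated in this thread: a growth rate of 12, 4 layers per dense block, 24 channels entering the first block, and two transition layers (nb_block = 2). With the in_channel*1 variant, which leaves the channel count unchanged in the transition layer, this bookkeeping happens to reproduce the printed 72 and 120. The function name transition_inputs is hypothetical.

```python
def transition_inputs(first_channels, growth_k, layers_per_block, nb_blocks, theta):
    """Return the channel count seen by each transition layer."""
    seen = []
    channels = first_channels
    for _ in range(nb_blocks):
        # Each layer in a dense block concatenates growth_k new feature maps.
        channels += growth_k * layers_per_block
        seen.append(channels)
        # The transition layer scales the channel count by theta.
        channels = int(channels * theta)
    return seen


print(transition_inputs(24, 12, 4, 2, theta=1))    # [72, 120]
print(transition_inputs(24, 12, 4, 2, theta=0.5))  # [72, 84]
```

Under these assumptions, each transition layer is built once while the graph is constructed, so each print statement runs once per transition layer and reports that layer's own input size.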

b654381241 commented 5 years ago

Sir, I ran this example and hit what may be a bug. The failing line and the error are:

    x = conv_layer(x, filter=in_channel*0.5, kernel=[1, 1], layer_name=scope + '_conv1')
    TypeError: unsupported operand type(s) for *: 'Dimension' and 'float'

I can't solve it, so I would appreciate any ideas.