GeorgeSeif / Semantic-Segmentation-Suite

Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!

DDSC Value Error using ResNet101 #152

Open MaxKrapp opened 5 years ago

MaxKrapp commented 5 years ago

Hello, I tried to include the model in a data pipeline to get the logits from it. It works smoothly with the other models, but when I try DDSC I get a value error in the concat_12 layer. The error tells me that the dimensions do not match:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 13 and 9. Shapes are [1,13,13] and [1,9,9]. for 'concat_12' (op: 'ConcatV2') with input shapes: [1,21,21,256], [1,17,17,256], [1,13,13,256], [1,9,9,256], [] and with computed input tensors: input[4] = <-1>.

Is there a known issue with this, e.g. a missing resize block? Or is my input image too large (800 x 800)? I think there is a logical error somewhere. Have you tested the pipeline before, or should I change my frontend?
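For reference, here is a minimal, hypothetical reproduction of the failing concat (the tensors are made up and only mirror the shapes from the error message, this is not the actual DDSC code):

```python
import tensorflow as tf

# Branches with mismatched spatial sizes, mirroring the shapes in the error.
a = tf.zeros([1, 21, 21, 256])
b = tf.zeros([1, 17, 17, 256])
c = tf.zeros([1, 13, 13, 256])
d = tf.zeros([1, 9, 9, 256])

# Fails at graph construction: "Dimension 1 in both shapes must be equal,
# but are 13 and 9" -- concatenating along the channel axis requires all
# branches to share the same height and width.
fused = tf.concat([a, b, c, d], axis=-1)
```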

MaxKrapp commented 5 years ago

So I think there might be an error in your upsampling method. You are using scales from 2 to 8, but in my opinion these scales are not correct. I always scaled by the largest tensor's dimensions and then upscaled the last tensor to my input size. The network is now training and I will give feedback tomorrow.
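Roughly, the workaround looks like this (a sketch of my change, not the repo's DDSC.py; the function and argument names are my own):

```python
import tensorflow as tf

def align_and_upsample(branches, inputs):
    """Resize every branch to the spatial size of the largest branch
    (assumed NHWC with static shapes), concatenate them, then upsample
    the result to the resolution of the network input."""
    target_h = max(b.get_shape().as_list()[1] for b in branches)
    target_w = max(b.get_shape().as_list()[2] for b in branches)
    aligned = [tf.image.resize_bilinear(b, [target_h, target_w]) for b in branches]
    fused = tf.concat(aligned, axis=-1)
    # Final bilinear upsample back to the input image size.
    return tf.image.resize_bilinear(fused, tf.shape(inputs)[1:3])
```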

MaxKrapp commented 5 years ago

OK, the method did run and learn something, and the loss values look fine, but the inference results look awful.

TaoZhong11 commented 5 years ago

I ran into the following error when running the fresh DDSC code:

F tensorflow/stream_executor/cuda/cuda_dnn.cc:430] could not convert BatchDescriptor {count: 1 feature_map_count: 1024 spatial: 0 0 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX} to cudnn tensor descriptor: CUDNN_STATUS_BAD_PARAM Aborted (core dumped)

Have you ever seen this?

MaxKrapp commented 5 years ago

No, since I have my own pipeline and am just using the logits. The problem behind this error is that somewhere in the pipeline you end up with activations of zero spatial size, and when they are handed to cuDNN it reports a bad parameter. You could try a larger input size, or skip the last max-pool layer of the ResNet. A list of all ResNet layers can be found in the init of the net, init = resNetWhatSoEver(). If you use the activations one layer earlier, you may get rid of this error. Another possibility is that you are using a different TF version; I am on Linux with TF 1.16 and Python 3.6. However, the model seems to be ill-defined, since I also ran into problems with the dimensions of the downsampling scales. Hope this helps.
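If it helps, this is roughly how you can list the available layers and pick the activations one block earlier (a sketch using the slim ResNet bundled with TF 1.x, not the repo's exact frontend builder; the input placeholder and settings are just examples):

```python
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import resnet_v2

# Example input; with very small inputs the final feature map can collapse
# to a 0x0 spatial size, which triggers the CUDNN_STATUS_BAD_PARAM abort.
inputs = tf.placeholder(tf.float32, [1, 800, 800, 3])

with slim.arg_scope(resnet_v2.resnet_arg_scope()):
    net, end_points = resnet_v2.resnet_v2_101(inputs,
                                               is_training=True,
                                               global_pool=False,
                                               output_stride=16)

# end_points maps layer names to tensors; print them to see all ResNet
# layers and their shapes, then feed DDSC from an earlier block instead of
# the last one if the final activations are too small.
for name, tensor in end_points.items():
    print(name, tensor.get_shape())
```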

MiZhangWhuer commented 5 years ago

> So I think there might be an error in your upsampling method. You are using scales from 2 to 8, but in my opinion these scales are not correct. I always scaled by the largest tensor's dimensions and then upscaled the last tensor to my input size. The network is now training and I will give feedback tomorrow.

Hi, can you share your modifications to DDSC.py? How did you handle the upsampling code?