Closed DmitryGeyzersky closed 5 years ago
Thank you! You are supposed to get [1,480,480,1] as the shape of the prediction P. Can you print it and see what values you are getting? It should be an array of 0's and 1's.
As you correctly mentioned, I get a [1,480,480,1] shape of floats. The values of P look like this:
[[[[6.03704466e-05] [1.19896795e-05] [5.36343805e-07] ... [1.02868466e-06] [8.89895182e-06] [1.73807945e-04]]
[[6.74800049e-06] [6.84128167e-07] [1.03380158e-08] ... [1.15038617e-07] [7.39649067e-07] [8.81992946e-06]]
[[3.98591027e-07] [2.09695408e-08] [3.93589716e-10] ... [4.13788115e-10] [4.96457169e-08] [8.03070520e-07]]
...
[[2.29515699e-05] [3.05382827e-07] [5.80548720e-09] ... [1.86769222e-09] [8.23730133e-08] [2.39246970e-06]]
[[5.72312892e-05] [4.32508978e-06] [6.39867039e-08] ... [9.59735544e-08] [3.48175922e-06] [8.07032338e-05]]
[[7.37155962e-04] [3.83935949e-05] [2.09957511e-06] ... [5.02240709e-06] [5.81189779e-05] [3.51740950e-04]]]]
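For reference, probabilities like these can be turned into the 0/1 array mentioned above by thresholding; a minimal sketch on a synthetic prediction (the 0.5 cutoff is an assumption, not something from the original code):

```python
import numpy as np

# Hypothetical prediction of shape [1, 480, 480, 1] with small probabilities
P = np.full((1, 480, 480, 1), 1e-5, dtype=np.float32)
P[0, 100:200, 100:200, 0] = 0.9  # pretend region where the contour fires

# Threshold at 0.5 (assumed cutoff) to obtain a binary 0/1 mask
mask = (P > 0.5).astype(np.uint8)
print(mask.shape)        # (1, 480, 480, 1)
print(np.unique(mask))   # [0 1]
```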
Try this for one image. Pass the image and get the result:
im = cv2.imread(XYZ)
P = prediction from network
# Select any one channel from the image
im[:,:,0] = im[:,:,0] * np.squeeze(P)
plt.imshow(im[:,:,0])
plt.show()
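As a concrete sketch of what that multiplication does (using synthetic arrays in place of the real image and prediction, so it runs standalone):

```python
import numpy as np

# Synthetic stand-ins for the real data (no image file needed):
# 'im' replaces cv2.imread(XYZ), 'P' replaces the network prediction.
im = np.full((480, 480, 3), 200, dtype=np.float32)
P = np.zeros((1, 480, 480, 1), dtype=np.float32)
P[0, 100:380, 100:380, 0] = 1.0  # pretend the network fired here

# Multiplying one channel by the squeezed mask blacks out everything
# the network did not mark, which makes the prediction easy to eyeball
im[:, :, 0] = im[:, :, 0] * np.squeeze(P)

print(im[0, 0, 0], im[200, 200, 0])  # 0.0 200.0
```

With a real image, im would come from cv2.imread and P from sess.run(pred, ...) as in eval.py, followed by plt.imshow(im[:,:,0]).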
create_labels.py has the following code:
GT = cv2.imread(path_to_labels+al,0)
GT[GT!=255]=0
cl = cv2.imwrite(path_to_new_labels+al,GT)
but the original contours in the VOC2012 dataset have the value of 220, so I used the cv2.threshold function instead. I guess this might be the cause of the confusion. Anyway, I rebuilt the labels to have only 0 and 1 values and I'm trying to retrain the model. After the training is done I'll re-evaluate the results and check your suggestion. Thank you for your prompt responses!
I'll keep you posted on the progress.
Yeah, that should solve the issue! Let me know how it works out.
I tried to train the model with labels created as following:
alls = os.listdir(path_to_labels)
for al in alls:
    GT = cv2.imread(path_to_labels+al,0)
    GT[GT != 220] = 0
    GT[GT == 220] = 255
    cl = cv2.imwrite(path_to_new_labels+al,GT)
So the label looks OK now, with values in (0,255):
When I run training with 100 epochs, it looks like the model doesn't converge, and the result is suboptimal:
If I change create_labels.py to my original version:
alls = os.listdir(path_to_labels)
for al in alls:
    GT = cv2.imread(path_to_labels+al,0)
    ret, thresh = cv2.threshold(GT, 127, 255, cv2.THRESH_BINARY)
    GT[thresh < 255] = 0
    cl = cv2.imwrite(path_to_new_labels+al,GT)
the model converges:
but the result is black. Any help will be greatly appreciated!
@getmeIns Can you visualize the label and send it here after this
alls = os.listdir(path_to_labels)
for al in alls:
    GT = cv2.imread(path_to_labels+al,0)
    ret, thresh = cv2.threshold(GT, 127, 255, cv2.THRESH_BINARY)
    GT[thresh < 255] = 0
    cl = cv2.imwrite(path_to_new_labels+al,GT)
Also, the model is not converging in the second case; it is one of the many cases of training in which the loss falls to 0 and the model fails to learn anything. In the case of proper learning, you will see a gradual loss profile. So my guess is the labels aren't as they are supposed to be.
Sure. Here you go: but there are also things like this:
I guess you are right and the problem has something to do with labels.
Your original code:
GT = cv2.imread(path_to_labels+al,0)
GT[GT!=255]=0
doesn't seem to do the job since the white pixels are 220. Did you do any preprocessing to the original SegmentationObject files?
Try this:
GT = cv2.imread(path_to_labels+al,0)
GT[GT!=220]=0
And print the max/min values of the labels:
np.max(labels)
np.min(labels)
max: 220 min: 0
You previously mentioned the output result should be in range of (0,1). Should I divide the values by 255 at some point?
I have already included that in my code
So now you have to do two things:
In create_labels.py, use GT[GT!=220]=0 instead of GT[GT!=255]=0
And in train.py, replace line 118 with
labs_person = labs_person/220.0
This should solve the issue!!
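Taken together, the two changes amount to this small pipeline; the sketch below uses a synthetic label in place of the real file, with 220 being the VOC contour value discussed above:

```python
import numpy as np

# Synthetic VOC-style label: background 0, contour pixels 220, one stray value
GT = np.zeros((8, 8), dtype=np.uint8)
GT[2:6, 2:6] = 220
GT[0, 0] = 37  # a non-contour value that must be zeroed out

# Change 1 (create_labels.py): keep only the contour value
GT[GT != 220] = 0

# Change 2 (train.py, line 118): scale the labels into {0, 1}
labs_person = GT / 220.0

print(np.unique(labs_person))  # [0. 1.]
```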
Thank you so much! I'll give it a try and post you after the training is complete.
After the change, unfortunately, the result is still the same. The values returned from the model are floats like these: [9.10041820e-08] [7.19515185e-08] [7.40636636e-08] [4.71445851e-08] [1.00155013e-07] [7.48428590e-08] [1.19182850e-07] [1.36910415e-07] [1.58649627e-07] [1.46840435e-07] and the result accordingly is black. The loss function this time gradually decreased as expected.
Following is a summary of the changes I've made to the original code for your review:
create_labels.py:
GT[GT != 220] = 0
utils.py
removed the default value of ignore_label and passed a value of 220
def random_crop_and_pad_image_and_labels(image, label, crop_h, crop_w, ignore_label)
removed the parameter ignore_label=255 which was not used in the function:
def random_crop_and_pad_image(image, crop_h, crop_w):
train.py
passed a value of 220 to the function instead of default 255:
line 73: image,label = random_crop_and_pad_image_and_labels(tf.squeeze(image_ph),tf.squeeze(label_ph,axis=0),size,size, 220)
used axis parameter instead of deprecated dim (should be equivalent)
line 75: norm_image = tf.expand_dims(norm_image,axis=0)
line 123: labs_person = labs_person/220
eval.py
added:
save_preds = FLAGS.save_preds
changed line 39 to: size = FLAGS.eval_crop_size
changed line 40 to: image = random_crop_and_pad_image(tf.squeeze(image_ph),size,size)
changed line 42 to: norm_image = tf.expand_dims(norm_image,axis=0)
changed lines 57-60 to:
input_image = cv2.imread(Image_directory + l + '.jpg')
input_image = cv2.resize(input_image, (size,size))
feed_dict = {image_ph:np.reshape(input_image, [1,size,size,3])}
P = sess.run(pred, feed_dict=feed_dict)
Any ideas?
Can you save this numpy array and attach it here ?
Please see attached the value of P:
Yeah, even I see a black result. Can you try passing a single image again and again and see what the result is like?
Also, send me that image and the corresponding label here so that I can try the same and figure out the issue.
Thanks for helping, much appreciated! I tried what you suggested but the result is still the same (the same min and max values). Please see attached the image I used (from VOC2012, the same one you used in the article) and the label I generated for it:
I would kindly ask you to compare the version of the code you published with what you used for training the model (just in case you commented some things out for debugging or by mistake).
Kindest regards!
Is there anything else I can provide to help you figure out the problem?
Hi @getmeIns, I will get back to you real soon; really sorry about the delay.
Hi, did you have chance to look at the problem?
Hi, could you please provide me with the trained model weights?
It looks like this model doesn't provide the expected result.
Hi @getmeIns, Sorry, I am not able to access my GPU to look into the problem. But I am sure that the model converges and gives the expected results. I suggest you start the project from scratch and train it for one image. Also, as you told me the loss was converging, I mostly feel there is an issue with the way the prediction is made in your case. If you feel there is something wrong, you can fork the project, make the required changes, and send me a pull request. Meanwhile, if I get access, I will make sure to solve your issue. But as I said, it may be a minor inconsistency in the prediction. Sorry for the inconvenience caused. You can mail me at rjusnba@gmail.com and I will personally help you get this model working on your system.
@bashar-tom No. Due to the unavailability of a full-time GPU, I trained the model only on a subset of the data, where it gave me good results.
Raj, thank you so much for the help and willingness to solve the problem. It looks like I had a problem with the labels format. Keep up the great work!
Hi @DmitryGeyzersky , As you have closed the issue, I assume that you were able to successfully train the model. I will be glad if you can help me with the dataset and labels part. I also want to train the model on PASCAL VOC 2012 as mentioned in the paper. I have downloaded the dataset from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#data
Can you please share the steps/code for generating the labels from it ?
Thanks
Hi @guptasonam1602, Sorry for the late response. I was able to successfully train the model on PASCAL VOC 2012. The author excerpted the code from a Python notebook; therefore, it is not complete. The most important step is preparing the labels properly. If you have downloaded the PASCAL VOC 2012 dataset, then you should have a folder called VOC2012/SegmentationObject. I have used the following code to create the labels:
import os
import cv2
import numpy as np

path_to_labels = 'VOC2012/SegmentationObject/'
allowed_extensions = [".jpg", ".jpeg", ".png"]
labels = os.listdir(path_to_labels)
kernel = np.ones((1, 2), np.uint8)

for file in labels:
    if file.lower().endswith(tuple(allowed_extensions)):
        label = cv2.imread(path_to_labels + file, 0)
        ret, thresh = cv2.threshold(label, 0, 255, cv2.THRESH_BINARY)
        thresh = cv2.dilate(thresh, kernel, iterations=1)
        cv2.imwrite(path_to_labels + 'labels/' + file, thresh)
After creating the labels you may run the training. Please note the training process may take time depending on your hardware, particularly the GPU card. After several hours of training, you should see the model starting to converge. I hope this helps.
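Before committing to a long training run, it can be worth sanity-checking that the generated labels really are binary; a small helper sketch (is_binary_label is a hypothetical name, and the arrays below stand in for cv2.imread results):

```python
import numpy as np

def is_binary_label(label):
    """Return True if the label contains only the values 0 and 255."""
    return set(np.unique(label).tolist()).issubset({0, 255})

good = np.zeros((4, 4), dtype=np.uint8)
good[1:3, 1:3] = 255
bad = good.copy()
bad[0, 0] = 220  # a raw VOC contour value that slipped through

print(is_binary_label(good), is_binary_label(bad))  # True False
```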
Hello !
I'm a newbie to this and I'm trying to train the model on PASCAL VOC 2012 too. But I don't understand why no labels are written when I run your code to create them, @DmitryGeyzersky. There is no error, but it's not creating any new files.
I also tried to use create_labels.py, but my result is quite poor (cf. image).
I know it's late but maybe someone still has the answer, who knows :p
Hi @Skillozone , Please use the following code to generate correct labels from the original PASCAL VOC dataset:
import os
import cv2

path_to_labels = 'VOC2012/SegmentationObject'
path_to_new_labels = 'path_for_new_labels'
allowed_extensions = [".png"]
labels = os.listdir(path_to_labels)

for file in labels:
    if file.lower().endswith(tuple(allowed_extensions)):
        label = cv2.imread(os.path.join(path_to_labels, file), 0)
        label[label != 220] = 0
        label[label == 220] = 255
        cv2.imwrite(os.path.join(path_to_new_labels, file), label)

print('Labels created')
I hope this helps.
Hi @DmitryGeyzersky ! thanks a lot for the code for labels creation, it works well now !
But I am still unsatisfied with the final result after training for 100 epochs on 150 images of the PASCAL VOC dataset (I'm working on Google Colab, so I can't train for too long). Most of my results in eval.py look like this one; the original image:
I feel like everything is not bad, but it's still far away from nice contour detection. I hope someone can help me with this!
Cheers
Hi @Skillozone, It looks like everything is OK but you haven't trained enough. In my case, it took about two days for the loss function to settle and produce decent results (on a GTX 1080 Ti GPU card). I guess you just need some more training time and patience and everything will be OK.
Regards,
Hi @DmitryGeyzersky, I tried to train more in order to see if the result gets better (~10 hours on a Tesla P100-PCIE-16GB on Google Colab) but in the end I have a black image, as you had earlier, with these values for P: [[[[2.85032392e-03] [4.86105680e-04] [1.07944012e-04] ... [8.22544098e-05] [3.39776278e-04] [1.27315521e-03]]
[[6.93380833e-04] [6.85155392e-05] [5.81145287e-06] ... [7.15255737e-06] [3.35276127e-05] [3.83198261e-04]]
[[1.01059675e-04] [6.64591789e-06] [4.17232513e-07] ... [5.96046448e-07] [5.39422035e-06] [1.01357698e-04]]
...
[[8.13901424e-05] [4.85777855e-06] [2.98023224e-07] ... [3.87430191e-07] [3.21865082e-06] [7.26282597e-05]]
[[7.11768866e-04] [8.01682472e-05] [8.40425491e-06] ... [9.38773155e-06] [4.95910645e-05] [5.04612923e-04]]
Any idea what the issue could be? You mentioned a labels format problem in your case, but what does that mean?
Kind regards
I still think you might need more training, but I suggest that @Raj-08, who originally posted this code, weigh in and help you with troubleshooting, as I'm currently busy with other projects. @Raj-08, can you please review this thread and help @Skillozone with troubleshooting?
I'm adding some more information to my issue if someone has time to help me:
Here I trained for more than 70,000 steps and reached less than 0.6 total_loss, and I had a black image in eval; but when I first trained with fewer steps, at least I had a result with much higher values for P (see the pictures with the table above), even if it was pretty bad.
I don't understand how it can be less efficient when I choose to train more.
Sure @DmitryGeyzersky
@Skillozone Hi, First of all, it would be helpful to visualize the labels. You need to see what values are contained on the boundaries. You can visualize with:
%matplotlib notebook
label = cv2.imread('your label location', 0)
plt.imshow(label)
plt.show()
Then you need to hover your pointer over the contours to check the values. Kindly tell me what value that is.
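If hovering is not convenient (for instance in Colab), printing the distinct pixel values answers the same question; a sketch on a synthetic label standing in for cv2.imread:

```python
import numpy as np

# Synthetic label with a 220-valued contour, in place of cv2.imread(...)
label = np.zeros((480, 480), dtype=np.uint8)
label[100, 100:200] = 220

# The distinct values tell you exactly what sits on the boundaries
print(np.unique(label))  # [  0 220]
```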
Hi @Raj-08 ! Thanks a lot for helping me, really appreciate.
It seems like I can't use %matplotlib notebook in Google Colab, but at least I can show the labels and print the min and max values of the labels, as you asked Dmitry earlier in this thread.
(We get these colors, and not black and white, because we use matplotlib to show the image, I think.)
Hi ! A little update on what I've tried since my last post :
I managed to get this result by changing the learning rate from 0.0000001 to 0.000001 and training for ~10h. I'm not really sure I understand exactly what the learning rate changes in the execution, but here is the result:
It's very encouraging since the contours are now recognizable! I'm going to retrain one more time with a 10x learning rate once again to see what I get. After that, I will try to run it on my personal GPU instead of Colab in order to have a longer training time.
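For intuition about what the learning rate changes: it scales how far each gradient step moves the weights. A toy sketch, completely unrelated to the actual model, of gradient descent on f(x) = x**2, showing why a 10x larger rate makes faster progress toward the minimum:

```python
def descend(lr, steps=1000):
    # One-dimensional gradient descent on f(x) = x**2, so f'(x) = 2*x
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x   # x <- x - lr * f'(x)
    return x

# With a tiny rate the value barely moves toward the minimum at 0;
# multiplying the rate by 10 moves it roughly 10x further per step.
slow = descend(0.0000001)
fast = descend(0.000001)
print(slow > fast > 0.0)  # True
```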
Kudos! You are on the right path. You just need more training. In my case, it took me ~48 hours to get decent results.
First of all, thank you for sharing the code and the great work. After some minor syntax changes to train.py I was able to run the training (I'm running TensorFlow 1.10.0). As I understand, the input_image in eval.py should be reshaped to [1,?,?,3] prior to feeding it into the network. The result of the prediction (P) in eval.py is an array of shape [1,480,480,3]. As I understand, the resulting mask with the contour should be returned in P[0], but cv2.imshow('name', P[0]) shows a black image. I wonder how you interpret the results and correctly show the predicted contours. If you can share the code you used to show the final results, it would be greatly appreciated.
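Regarding the black cv2.imshow result: probabilities near zero render as black, so the prediction generally needs rescaling (or thresholding) before display. A sketch assuming the single-channel [1,480,480,1] output discussed earlier in the thread, with a synthetic P in place of the network output:

```python
import numpy as np

# Hypothetical prediction with probabilities in [0, 1]
P = np.zeros((1, 480, 480, 1), dtype=np.float32)
P[0, 240, 100:380, 0] = 0.9  # a fake contour line

# Rescale to uint8 [0, 255] so faint probabilities become visible pixels
vis = (np.squeeze(P) * 255).astype(np.uint8)
print(vis.shape, vis.max())  # (480, 480) 229
```

vis can then be passed to cv2.imshow('name', vis) or saved with cv2.imwrite; alternatively, threshold P at 0.5 first for a hard 0/255 mask.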