Open ngbrenda opened 7 years ago
Yes, the function _make_data_gen returns an image and a ground_truth tensor. The image tensor is supposed to have shape [width, height, 3], representing the RGB values of the image. Your ground_truth tensor needs shape [width, height, N], where gt_image[i,j,k] == 1 if and only if pixel i,j corresponds to class 'k', and gt_image[i,j,k] == 0 otherwise.
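For illustration, here is a minimal NumPy sketch of how such a one-hot ground_truth tensor could be built from a color-coded label image. The helper name colors_to_one_hot and the 3-class palette are assumptions made up for this example; they are not part of KittiSeg.

```python
import numpy as np


def colors_to_one_hot(label_rgb, class_colors):
    """Turn an RGB-coded label image into a one-hot ground-truth tensor.

    label_rgb:    uint8 array of shape [width, height, 3]
    class_colors: list of N RGB triples; index k is the color of class k
    returns:      float32 array of shape [width, height, N]
    """
    num_classes = len(class_colors)
    gt = np.zeros(label_rgb.shape[:2] + (num_classes,), dtype=np.float32)
    for k, color in enumerate(class_colors):
        # True where all three channels match the color of class k
        mask = np.all(label_rgb == np.array(color), axis=2)
        gt[mask, k] = 1.0
    return gt


# Hypothetical palette: class 0 = black, class 1 = red, class 2 = green
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0)]
label = np.zeros((2, 2, 3), dtype=np.uint8)
label[0, 1] = (255, 0, 0)
label[1, 0] = (0, 255, 0)
gt_image = colors_to_one_hot(label, palette)
```

Pixels whose color is not in the palette end up with all zeros in every channel, so it is worth checking that gt_image.sum(axis=2) is 1 everywhere you expect a label.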
You will also need to write your own evaluation code. I would start with a simple per-pixel accuracy. This shows you whether the training is going well, and you can implement more complex metrics later on. Hope this helps!
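A per-pixel accuracy could look roughly like the sketch below. It assumes both the ground truth and the network output are available as NumPy arrays of shape [width, height, N]; the function name pixel_accuracy is made up for this example.

```python
import numpy as np


def pixel_accuracy(gt_one_hot, scores):
    """Fraction of pixels whose highest-scoring class matches the label."""
    gt_class = np.argmax(gt_one_hot, axis=2)
    pred_class = np.argmax(scores, axis=2)
    return float(np.mean(gt_class == pred_class))


# Toy 2x2 image with 2 classes
gt = np.array([[[1, 0], [0, 1]],
               [[0, 1], [1, 0]]], dtype=np.float32)
scores = np.array([[[0.9, 0.1], [0.2, 0.8]],
                   [[0.7, 0.3], [0.6, 0.4]]], dtype=np.float32)
acc = pixel_accuracy(gt, scores)  # 3 of 4 pixels are correct
```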
Thanks. Is it possible to modify your evaluation code to fit more classes, instead of writing our own from scratch?
Yes it is.
Btw, if you run this evaluation code with a multi-class problem, the code will work, and it will compute scores for the segmentation problem "Class 1 vs. all other classes". The reason is line 107. The call output[0][:, 1].reshape(shape[0], shape[1]) will only read the confidences of class 1 and discard all other classes. If you want to get the confidences for class k, you can call output[0][:, k].reshape(shape[0], shape[1]).
Maybe it is a good idea to start by training the model as is and playing around with output[0][:, k]. You could, for example, save those arrays as images and inspect them.
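As a rough sketch of that inspection step: the snippet below assumes output[0] is a NumPy array of shape [width*height, num_classes] holding confidences in [0, 1] (here called flat_output), and turns the confidences of class k into an 8-bit grayscale array. The helper name class_confidence_map is made up; writing the array to disk with an image library of your choice is left out.

```python
import numpy as np


def class_confidence_map(flat_output, shape, k):
    """Reshape the confidences of class k into an 8-bit grayscale image."""
    conf = flat_output[:, k].reshape(shape[0], shape[1])
    return np.round(conf * 255).astype(np.uint8)


# Toy output for a 2x2 image with 3 classes (one row per pixel)
flat_output = np.array([[0.0, 1.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [0.0, 0.0, 1.0],
                        [0.0, 1.0, 0.0]])
shape = (2, 2)
class1_map = class_confidence_map(flat_output, shape, 1)
```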
Thanks for the answer, I'll see what I can do with line 107.
Just to mention, there is a change that needs to be done in lines 26 to 32. From what I understood, if you change the output in line 107, you need to map it to the correct color in the 'eval_image' function?
It seems that I'm doing something wrong, maybe I messed up the input function. I get this error when I try to train the model:
  File "/home/workspace/KittiSeg/hypes/../decoder/kitti_multiloss.py", line 74, in loss
    cross_entropy_mean = _compute_cross_entropy_mean(hypes, labels,softmax)
  File "/home/workspace/KittiSeg/hypes/../decoder/kitti_multiloss.py", line 99, in _compute_cross_entropy_mean
    cross_entropy = -tf.reduce_sum(tf.multiply(labels * tf.log(softmax), head),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 267, in multiply
    return gen_math_ops._mul(x, y, name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 1625, in _mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2329, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1717, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1667, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 5 and 2 for 'Loss/loss/Mul' (op: 'Mul') with input shapes: [?,5], [2].
thanks for the help
Did you check that hypes['arch']['weight'] has 5 elements (weights)? Make sure your gt_images also have 5 channels.
Yes, to me it looks like either your labels or your softmax or head does not have 5 channels. I would recommend printing the shape of all three of those values. It is most likely the head, which is configured in hypes['arch']['weight'].
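To make the shape check concrete, here is a small NumPy sketch that mimics the mismatch from the error message above (labels and softmax with 5 channels, head with 2). The arrays are stand-ins for the real tensors and the helper check_channels is made up for this example.

```python
import numpy as np


def check_channels(labels, softmax, head):
    """Return all shapes and the names whose last dimension differs from labels."""
    shapes = {'labels': labels.shape,
              'softmax': softmax.shape,
              'head': head.shape}
    num_classes = labels.shape[-1]
    bad = [name for name, s in sorted(shapes.items()) if s[-1] != num_classes]
    return shapes, bad


labels = np.zeros((10, 5))
softmax = np.full((10, 5), 0.2)
head = np.array([1.0, 2.0])  # stand-in for a 2-element hypes['arch']['weight']

shapes, bad = check_channels(labels, softmax, head)

# NumPy refuses the same element-wise product that TensorFlow rejected
try:
    labels * np.log(softmax) * head
    raised = False
except ValueError:
    raised = True
```

With a 5-element head the product broadcasts fine, so the fix on the configuration side is to give the weight list one entry per class.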
Thanks, it worked out, I had the head wrong.
There is one other change that needs to be done: in the architecture script 'fcn8_vgg.py', num_classes is hard-coded to 2 too (line 38),
and the same in 'kitti_multiloss.py', lines 151 and 152.
About the evaluation script, I'll repeat my remark/question: there is a change that needs to be done in lines 26 to 32. From what I understood, if you change the output in line 107, you need to map it to the correct color in the 'eval_image' function? Thanks again :)
@bendidi yes, it looks like a change needs to be done there.
If someone has a multiclass example working, I wouldn't mind a pull request. This would help future visitors ;).
A pull request would be very much appreciated. Two datasets like COCO-Stuff and ADE20K could be the right targets to test a multi-class approach.
Hi @MarvinTeichmann, I tried your fabulous KittiSeg for multi-class segmentation on the Pascal VOC dataset, and it works like a charm. https://github.com/shivam-kotwalia/KittiSeg/ Please have a look. :) I would love it if anyone would like to evaluate and use it.
Looks good. Which files did you need to adapt? Could you try to merge the code base and open a pull request? I think the easiest way would be to create a separate hypes file PascalSeg.json and a second input producer pascal_voc_input.py (if required).
Did you also make many modifications to the eval code?
Definitely, I would open a pull request with a separate JSON file. Currently I have disabled the "eval" code as I am stuck there; I also want to plug in the Mean IoU and IoU evaluation.
Have you seen this comment on line 107? Maybe it is actually best to write a different eval code from scratch. The reason why this eval is so complicated is that I wanted to use the original Kitti evaluation code.
Yes, I totally agree that it was meant for Kitti Road data, and works best for it. Thanks for the help Marvin :) But FYI, the link you provided says Page not found.
But FYI, the link you provided says Page not found.
Thanks. I fixed it.
@shivam-kotwalia Thank you for your version of multi-class. If you disabled the evaluation code, then how do you get the segmentation result? Just use the demo.py and load the parameters trained for multi-class?
For the evaluation code, I tried something for multiclass evaluation. It might be crude, but that's what I got working:
I changed a function in this script
def eval_image(hypes, gt_image, cnn_image):
    """."""
    thresh = np.array(range(0, 256))/255.0
    FN,FP = np.zeros(thresh.shape),np.zeros(thresh.shape)
    posNum, negNum = 0,0

    c0 = np.array(hypes['data']['c0'])
    c1 = np.array(hypes['data']['c1'])
    c2 = np.array(hypes['data']['c2'])
    c3 = np.array(hypes['data']['c3'])

    gt_c0 = np.all(gt_image == c0, axis=2)
    gt_c1 = np.all(gt_image == c1, axis=2)
    gt_c2 = np.all(gt_image == c2, axis=2)
    gt_c3 = np.all(gt_image == c3, axis=2)

    valid_gt = gt_c0+gt_c1+gt_c2+gt_c3
    colors = [gt_c1,gt_c2,gt_c3]

    for i in range(len(colors)) :
        N, P, pos, neg = seg.evalExp(hypes,colors[i], cnn_image[:,:,i+1],
                                     thresh, validMap=None,
                                     validArea=valid_gt)
        FN = np.add(FN,N)
        FP = np.add(FP,P)
        posNum+=pos
        negNum+=neg

    return FN, FP, posNum, negNum
and this one in this script
def evaluation(hypes, images, labels, decoded_logits, losses, global_step):
    """Evaluate the quality of the logits at predicting the label.
    Args:
      logits: Logits tensor, float - [batch_size, NUM_CLASSES].
      labels: Labels tensor, int32 - [batch_size], with values in the
        range [0, NUM_CLASSES).
    Returns:
      A scalar int32 tensor with the number of examples (out of batch_size)
      that were predicted correctly.
    """
    # For a classifier model, we can use the in_top_k Op.
    # It returns a bool tensor with shape [batch_size] that is true for
    # the examples where the label was in the top k (here k=1)
    # of all logits for that example.
    eval_list = []
    num_classes = hypes['arch']['num_classes']
    logits = tf.reshape(decoded_logits['logits'], (-1, num_classes))
    labels = tf.reshape(labels, (-1, num_classes))

    pred = tf.argmax(logits, dimension=1)
    y = tf.argmax(labels, 1)

    Prec = []
    Rec = []
    f1 = []

    for i in range(num_classes):
        tp = tf.count_nonzero(tf.cast(tf.equal(pred,i),tf.int32) * tf.cast(tf.equal(y,i),tf.int32))
        tn = tf.count_nonzero(tf.cast(tf.not_equal(pred,i),tf.int32) * tf.cast(tf.not_equal(y,i),tf.int32))
        fp = tf.count_nonzero(tf.cast(tf.equal(pred,i),tf.int32) * tf.cast(tf.not_equal(y,i),tf.int32))
        # False negatives: predicted as some other class, but labelled i.
        # (Comparing pred with itself here would always count zero.)
        fn = tf.count_nonzero(tf.cast(tf.not_equal(pred,i),tf.int32) * tf.cast(tf.equal(y,i),tf.int32))
        Prec.append(tp / (tp + fp))
        Rec.append(tp / (tp + fn))
        f1.append((2 * Prec[-1] * Rec[-1]) / (Prec[-1] + Rec[-1]))

    accuracy = tf.reduce_mean(tf.cast(tf.equal(y, pred), tf.float32))
    tf.summary.scalar("Accuracy", accuracy)
    tf.summary.scalar("SoftIU", losses['SoftIU'])
    tf.summary.scalar("c1_Precision", Prec[1])
    tf.summary.scalar("c1_Recall", Rec[1])
    tf.summary.scalar("c1_F1_Score", f1[1])
    tf.summary.scalar("c2_Precision", Prec[2])
    tf.summary.scalar("c2_Recall", Rec[2])
    tf.summary.scalar("c2_F1_Score", f1[2])
    tf.summary.scalar("c3_Precision", Prec[3])
    tf.summary.scalar("c3_Recall", Rec[3])
    tf.summary.scalar("c3_F1_Score", f1[3])

    eval_list.append(('Acc. ', accuracy))
    eval_list.append(('xentropy', losses['xentropy']))
    eval_list.append(('SoftIU', losses['SoftIU']))
    eval_list.append(('weight_loss', losses['weight_loss']))

    Prec = tf.convert_to_tensor(Prec)
    Rec = tf.convert_to_tensor(Rec)
    f1 = tf.convert_to_tensor(f1)
    eval_list.append(('Overall Precision ', tf.reduce_mean(Prec)))
    eval_list.append(('Overall Recall', tf.reduce_mean(Rec)))
    eval_list.append(('Overall F1 score ', tf.reduce_mean(f1)))

    return eval_list
Hi Bendidi - for your script edit it seems to me that the combined training ground-truth (valid_gt) will be sent to the evaluator, and so the network will learn to identify c1 | c2 | c3 vs. c0? Have you had success in the training using your method?
@StuvX this is just a script for evaluation, so it cannot influence training. It is only used to check how the trained model performs on the validation set at different steps of the training. The first script that I changed will not help much with visualisation, as I just changed it the lazy way so that it doesn't crash. The second script is where I calculate the overall accuracy and the recall and precision of each class, which is pretty helpful for evaluation.
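For anyone who wants to sanity-check the per-class numbers outside of TensorFlow, here is a small NumPy sketch of the same idea. It assumes y and pred are flat integer arrays of class ids (i.e. already argmax-ed), and it returns 0.0 where a score is undefined, which is a choice made for this example. The function name per_class_scores is made up.

```python
import numpy as np


def per_class_scores(y, pred, num_classes):
    """Return a (precision, recall, f1) tuple for every class id."""
    scores = []
    for i in range(num_classes):
        tp = np.sum((pred == i) & (y == i))   # predicted i, labelled i
        fp = np.sum((pred == i) & (y != i))   # predicted i, labelled otherwise
        fn = np.sum((pred != i) & (y == i))   # labelled i, predicted otherwise
        precision = tp / float(tp + fp) if tp + fp else 0.0
        recall = tp / float(tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


y = np.array([0, 0, 1, 1])
pred = np.array([0, 1, 1, 1])
scores = per_class_scores(y, pred, num_classes=2)
```

In the toy data one pixel of class 0 is predicted as class 1, so class 0 loses recall and class 1 loses precision.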
Has anyone succeeded in running these custom multiclass KittiSeg versions? I get a bunch of errors. @shivam-kotwalia @bendidi
@psuff what kind of errors do you get ?
@bendidi could you explain how to make the evaluation script work properly? I have tried modifying kitti_eval.py and fcn.py from @shivam-kotwalia's custom KittiSeg following your suggestions, but I can't get the evaluation to run. Is there anything else to modify? Can you provide your configuration?
@psuff check my fork , I put my code in there , I didn't test it yet , I'll do it some time later :
@bendidi Sorry, it didn't work, many problems came up. I am trying to fix them one by one.
@bendidi, first, in Python everything is passed by reference, so you cannot modify the color dictionary directly; you need to create a new one first. Also, the output from sess is a list, not an array, so you cannot reshape it directly. I also don't quite understand your paint function, because it is quite troublesome for me to get each output of color and overlay image in the terminal. If you have time, please explain what your paint function is meant to do. Thanks!
@ywangeq The code I'm working with (and have tested) is heavily modified from KittiSeg, and I can't really share it. The fork that I made contains just the changes necessary for basic multi-class segmentation to work (kind of pseudo-code). I didn't test it and don't know if there are any mistakes (I'm planning to test and correct it when I have the time... and a free GPU). About the paint function: it uses a dictionary to map each class of the segmented image to an RGB color. In short: after doing an argmax on the output of the network, you get an image with class numbers instead of RGB pixel arrays. The paint function maps each of these classes to an RGB color defined in a Python dict like so:
{ 1 : np.array([255,0,0]), 0: np.array([0,0,0]), 2: np.array([0,255,0]) }
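A paint function along those lines could be sketched as follows. This is not the code from the fork, just a minimal NumPy version of the idea described above; the name paint and the toy class map are assumptions for this example.

```python
import numpy as np


def paint(class_map, color_map):
    """Map a 2-D array of class ids to an RGB image using a dict of colors."""
    painted = np.zeros(class_map.shape + (3,), dtype=np.uint8)
    for class_id, color in color_map.items():
        # Boolean mask selects every pixel of this class at once
        painted[class_map == class_id] = color
    return painted


color_map = {1: np.array([255, 0, 0]),
             0: np.array([0, 0, 0]),
             2: np.array([0, 255, 0])}

# Result of an argmax over the class axis for a toy 2x2 image
class_map = np.array([[0, 1],
                      [2, 1]])
rgb = paint(class_map, color_map)
```

Because a fresh output array is allocated, neither the class map nor the color dictionary is modified in place.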
@bendidi, to follow up on your last comment: I downloaded, trained and tested your fork, and it did not work.
Also, the evaluation script described in your comment was incorrect due to the dimensions of the cnn_image array:
N, P, pos, neg = seg.evalExp(hypes,colors[i], cnn_image[:,:,i+1],
IndexError: too many indices for array
The other problem was with the SoftIU measure. The lines tf.summary.scalar("SoftIU", losses['SoftIU']) and eval_list.append(('SoftIU', losses['SoftIU'])) have to be removed.
I'm still investigating what could possibly be wrong. Recently, I was looking in TensorBoard (on the Images tab) and noticed that only one class was being used. Now I'm trying to figure out whether there is something wrong with my input files.
Thank you!
@rodolfolotte, the scripts I've made are full of bugs, so it's normal that they don't work. I'll be fixing them shortly. If you want, you could share your dataset with 8 classes with me, so I can test on it and try to check why it only detects 3 classes.
Hi @bendidi!! Yes yes, sure! No worries! I'm sending you the dataset, would be great to have a feedback from another version of multi-class!
I've got the code working for the Cityscapes dataset with 19 classes, thanks to the modifications provided by bendidi. However, I'm facing some issues with eval: I need to move that to the mIoU metric. My question is: for training, will the cross-entropy loss be good enough?
Eval using mIoU or micro average-F-score should be OK. For training I set it to sparse cross entropy using the tf function instead of Marvin's implementation, it seemed to work.
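In case it helps, mIoU can be computed from a confusion matrix in a few lines of NumPy. The sketch below assumes gt and pred are flat integer arrays of class ids; the function names are made up, and classes that never appear in either array are skipped when averaging, which is one common convention rather than the only one.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = gt.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average of TP / (TP + FP + FN) over classes that occur at all."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    return float(np.mean(tp[present] / union[present]))


gt = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
conf = confusion_matrix(gt, pred, num_classes=3)
miou = mean_iou(conf)  # per-class IoU is 1/3, 2/3 and 1/2
```

Accumulating the confusion matrix over the whole validation set before calling mean_iou gives a dataset-level score instead of an average of per-image scores.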
@StuvX so you train each label individually using sparse_cross_entropy_with_logits? I wanted to do it all together using softmax_cross_entropy_with_logits , however it just explodes and becomes huge in 500-700 steps
I only have four labels, including background, so that helps.
In kitti_multiloss.py I changed the _compute_cross_entropy_mean def to use tf.nn.sparse_softmax_cross_entropy_with_logits
Honestly I can't recall if it's looping through the classes and formulating a loss for each, or if it's doing it all at once, I think it's doing it all at once, but I'll have to go back through and check the code.
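On the "all at once" question, a sparse softmax cross-entropy in general produces one loss value per pixel over all classes together, and it equals the dense (one-hot) version when the labels are exclusive. The NumPy sketch below illustrates that relationship only; it is not the TensorFlow implementation and not the code in kitti_multiloss.py, and the function names are made up.

```python
import numpy as np


def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def dense_xent(logits, one_hot):
    """Cross-entropy with one-hot labels of shape [num_pixels, num_classes]."""
    return -np.sum(one_hot * log_softmax(logits), axis=1)


def sparse_xent(logits, class_ids):
    """Cross-entropy with integer labels of shape [num_pixels]."""
    logp = log_softmax(logits)
    return -logp[np.arange(len(class_ids)), class_ids]


logits = np.array([[2.0, 0.5, -1.0, 0.0],
                   [0.0, 0.0, 3.0, 1.0],
                   [1000.0, 0.0, 0.0, 0.0]])  # a large logit stays finite
class_ids = np.array([0, 2, 1])
one_hot = np.eye(4)[class_ids]

dense = dense_xent(logits, one_hot)
sparse = sparse_xent(logits, class_ids)
```

Working on logits with a log-softmax, rather than taking the log of an already computed softmax, avoids log(0) when one class dominates.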
Thanks so much for making your code available. Anyway, I would like to modify your code so that, assuming there can only be N possible classes, the code would be able to detect and identify the K<=N classes that are in the scene and segment them (by different colors) accordingly.
I read through your doc on how to train your own data, and I'm afraid I need more guidance.
Can you please clarify the steps needed to achieve this? Thanks so much in advance.