MarvinTeichmann / KittiSeg

A Kitti Road Segmentation model implemented in tensorflow.
MIT License

About multiclass (non-binary) segmentation #15

Open ngbrenda opened 7 years ago

ngbrenda commented 7 years ago

Thanks so much for making your code available. Anyway, I would like to modify your code so that, assuming there can only be N possible classes, the code would be able to detect and identify the K<=N classes that are in the scene and segment them (by different colors) accordingly.

I read through your doc on how to train your own data, and I'm afraid I need more guidance.

Can you please clarify the steps needed to achieve this? Thanks so much in advance.

MarvinTeichmann commented 7 years ago

Yes, the function _make_data_gen returns an image and a ground_truth tensor. The image tensor is supposed to have shape [width, height, 3], representing the RGB values of the image. Your ground_truth tensor needs shape [width, height, N], where gt_image[i,j,k] == 1 if and only if pixel (i,j) belongs to class k, and gt_image[i,j,k] == 0 otherwise.
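The one-hot ground truth described above could be built roughly like this. This is only a NumPy sketch: the function name one_hot_ground_truth and the 3-class palette are made up for illustration, and it assumes the labels are stored as an RGB image with one color per class.

```python
import numpy as np


def one_hot_ground_truth(label_image, class_colors):
    """Turn an RGB label image into a one-hot ground-truth array.

    label_image:  uint8 array of shape [H, W, 3].
    class_colors: list of N RGB triples; index k is the color of class k.
    Returns a float32 array of shape [H, W, N].
    """
    channels = [np.all(label_image == np.array(color), axis=2)
                for color in class_colors]
    return np.stack(channels, axis=2).astype(np.float32)


# Hypothetical palette: class 0 = background, class 1 = road, class 2 = car.
palette = [(0, 0, 0), (255, 0, 255), (0, 0, 255)]
label = np.zeros((2, 2, 3), dtype=np.uint8)
label[0, 1] = (255, 0, 255)
label[1, 0] = (0, 0, 255)

gt_image = one_hot_ground_truth(label, palette)
print(gt_image.shape)
```

Pixels whose color is not in the palette end up with all zeros, which is worth checking for before training.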

You will also need to write your own evaluation code. I would start with a simple per-pixel accuracy. This shows you whether the training is going well, and you can implement more complex metrics later on. Hope this helps!
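A per-pixel accuracy can be as small as the following NumPy sketch. The function name pixel_accuracy is hypothetical, and both inputs are assumed to have shape [H, W, N].

```python
import numpy as np


def pixel_accuracy(gt_one_hot, probabilities):
    """Fraction of pixels whose most likely class matches the ground truth."""
    gt_class = np.argmax(gt_one_hot, axis=-1)
    pred_class = np.argmax(probabilities, axis=-1)
    return float(np.mean(gt_class == pred_class))


# Toy 2x2 image with 3 classes; the last pixel is misclassified.
gt = np.eye(3)[np.array([[0, 1], [2, 1]])]
probs = np.array([[[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
                  [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]])
accuracy = pixel_accuracy(gt, probs)
print(accuracy)
```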

MarvinTeichmann commented 7 years ago

As mentioned in #29, the loss also needs a minor change. The number of classes is hard-coded to two. For multiclass segmentation you need to change the 2 in lines 65, 66 and 69 to "num_classes".

obendidi commented 7 years ago

Thanks. Is it possible to modify your evaluation code to handle more classes, instead of writing our own from scratch?

MarvinTeichmann commented 7 years ago

Yes it is.

Btw, if you run this evaluation code on a multi-class problem, the code will work and it will compute scores for the segmentation problem "class 1 vs. all other classes". The reason is line 107: the call output[0][:, 1].reshape(shape[0], shape[1]) only reads the confidences of class 1 and discards all other classes. If you want the confidences for class k, you can call output[0][:, k].reshape(shape[0], shape[1]).

Maybe it is a good idea to start by just training the model as is and playing around with output[0][:, k]. You could, for example, save those arrays as images and inspect them.
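To make the inspection idea concrete, here is a small NumPy sketch. It assumes the network output has already been fetched as an array of shape [height * width, num_classes]; the name flat_output is hypothetical and stands in for output[0]. Writing the arrays to disk is left to whatever image library you already use.

```python
import numpy as np


def class_confidence_maps(flat_output, height, width):
    """Split a [height * width, num_classes] array into one 8-bit map per class."""
    maps = []
    for k in range(flat_output.shape[1]):
        confidence = flat_output[:, k].reshape(height, width)
        # Scale confidences from [0, 1] to [0, 255] so they can be saved as images.
        maps.append(np.round(confidence * 255).astype(np.uint8))
    return maps


# Fake softmax output for a 2x3 image with 3 classes.
rng = np.random.RandomState(0)
scores = rng.rand(6, 3)
flat_output = scores / scores.sum(axis=1, keepdims=True)

maps = class_confidence_maps(flat_output, 2, 3)
print(len(maps), maps[0].shape, maps[0].dtype)
```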

obendidi commented 7 years ago

Thanks for the answer, I'll see what I can do with line 107.

Just to mention, there is a change that needs to be made in lines 26 to 32. From what I understood, if you change the output in line 107, you need to map it to the correct color in the 'eval_image' function?

obendidi commented 7 years ago

It seems that I'm doing something wrong, maybe I messed up the input function. I get this error when I try to train the model:

  File "/home/workspace/KittiSeg/hypes/../decoder/kitti_multiloss.py", line 74, in loss
    cross_entropy_mean = _compute_cross_entropy_mean(hypes, labels,softmax)
  File "/home/workspace/KittiSeg/hypes/../decoder/kitti_multiloss.py", line 99, in _compute_cross_entropy_mean
    cross_entropy = -tf.reduce_sum(tf.multiply(labels * tf.log(softmax), head),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 267, in multiply
    return gen_math_ops._mul(x, y, name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 1625, in _mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2329, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1717, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1667, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 5 and 2 for 'Loss/loss/Mul' (op: 'Mul') with input shapes: [?,5], [2].

Thanks for the help.

faj2007 commented 7 years ago

Did you check that hypes['arch']['weight'] has 5 elements (weights)? Also make sure your gt_images have 5 channels.

MarvinTeichmann commented 7 years ago

Yes, to me it looks like either your labels, your softmax or your head does not have 5 channels. I would recommend printing the shape of all three of those values.

It is most likely the head which is configured in hypes['arch']['weight'].
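The shape mismatch can be reproduced outside TensorFlow. The following is only a NumPy sketch of a class-weighted cross entropy in the same spirit as the line in the traceback, not the actual KittiSeg loss. The function name weighted_cross_entropy and the eps term are assumptions added for illustration.

```python
import numpy as np


def weighted_cross_entropy(labels, softmax, head, eps=1e-9):
    """Class-weighted cross entropy, averaged over pixels.

    labels, softmax: arrays of shape [num_pixels, num_classes].
    head:            per-class weights of shape [num_classes].
    """
    head = np.asarray(head, dtype=np.float64)
    # Check the three shapes up front instead of failing inside the multiply.
    if labels.shape != softmax.shape or labels.shape[1] != head.shape[0]:
        raise ValueError("shape mismatch: labels %s, softmax %s, head %s"
                         % (labels.shape, softmax.shape, head.shape))
    per_pixel = -np.sum(labels * np.log(softmax + eps) * head, axis=1)
    return float(np.mean(per_pixel))


labels = np.eye(5)[[0, 3]]          # two pixels, five classes
softmax = np.full((2, 5), 0.2)      # uniform prediction
loss = weighted_cross_entropy(labels, softmax, np.ones(5))
print(loss)

try:
    # A two-element head with five classes reproduces the "5 and 2" mismatch.
    weighted_cross_entropy(labels, softmax, [1.0, 1.0])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```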

obendidi commented 7 years ago

Thanks, it worked out, I had the head wrong. There is one other change that needs to be made: in the architecture script 'fcn8_vgg.py', num_classes is hard-coded to 2 too (line 38),
and the same goes for 'kitti_multiloss.py', lines 151 and 152.

About the evaluation script, I'll repeat my remark/question: there is a change that needs to be made in lines 26 to 32. From what I understood, if you change the output in line 107, you need to map it to the correct color in the 'eval_image' function? Thanks again :)

MarvinTeichmann commented 7 years ago

@bendidi yes, it looks like a change needs to be made there.

If someone has a multiclass example working, I wouldn't mind a pull request. This would help future visitors ;).

bhack commented 7 years ago

A pull request would be very much appreciated. Two datasets like COCO-Stuff and ADE20K could be the right targets to test a multi-class approach.

shivam-kotwalia commented 7 years ago

Hi @MarvinTeichmann, I tried your fabulous KittiSeg for multi-class on the Pascal VOC dataset, and it works like a charm. https://github.com/shivam-kotwalia/KittiSeg/ Please have a look. :) I would love it if anyone would like to evaluate and use it.

MarvinTeichmann commented 7 years ago

Looks good. Which files did you need to adapt? Could you try to merge the code base and open a pull request? I think the easiest way would be to create a separate hypes file PascalSeg.json and a second input producer pascal_voc_input.py (if required).

Did you also make many modifications to the eval code?

shivam-kotwalia commented 7 years ago

Definitely, I will open a pull request with a separate JSON file. Currently I have disabled the "eval" code as I am stuck there. I also want to plug in the mean IoU and per-class IoU evaluation.

MarvinTeichmann commented 7 years ago

Have you seen this comment on line 107? Maybe it is actually best to write different eval code from scratch. The reason why this eval is so complicated is that I wanted to use the original Kitti evaluation code.

shivam-kotwalia commented 7 years ago

Yes, I totally agree that it was meant for Kitti Road data, and works best for it. Thanks for the help, Marvin :) But FYI, the link you provided says Page not found.

MarvinTeichmann commented 7 years ago

But FYI the link you provided say Page not found.

Thanks. I fixed it.

leica8244 commented 7 years ago

@shivam-kotwalia Thank you for your version of multi-class. If you disabled the evaluation code, then how do you get the segmentation result? Just use the demo.py and load the parameters trained for multi-class?

obendidi commented 7 years ago

For the evaluation code, I tried something for multiclass evaluation. It might be crude, but that's what I got working:

I changed a function in this script:

def eval_image(hypes, gt_image, cnn_image):
    """Accumulate FN/FP counts over classes c1..c3 (c0 is the background)."""
    # Needs `import numpy as np`; `seg` is the module this script already uses.
    thresh = np.array(range(0, 256)) / 255.0
    FN, FP = np.zeros(thresh.shape), np.zeros(thresh.shape)
    posNum, negNum = 0, 0
    # RGB color of each class, read from the hypes file.
    c0 = np.array(hypes['data']['c0'])
    c1 = np.array(hypes['data']['c1'])
    c2 = np.array(hypes['data']['c2'])
    c3 = np.array(hypes['data']['c3'])
    # Boolean mask per class: True where the ground-truth pixel has that color.
    gt_c0 = np.all(gt_image == c0, axis=2)
    gt_c1 = np.all(gt_image == c1, axis=2)
    gt_c2 = np.all(gt_image == c2, axis=2)
    gt_c3 = np.all(gt_image == c3, axis=2)
    valid_gt = gt_c0 + gt_c1 + gt_c2 + gt_c3
    colors = [gt_c1, gt_c2, gt_c3]
    # cnn_image must have shape [height, width, num_classes] for this indexing.
    for i in range(len(colors)):
        N, P, pos, neg = seg.evalExp(hypes, colors[i], cnn_image[:, :, i + 1],
                                     thresh, validMap=None,
                                     validArea=valid_gt)
        FN = np.add(FN, N)
        FP = np.add(FP, P)
        posNum += pos
        negNum += neg

    return FN, FP, posNum, negNum

and this one in this script:

def evaluation(hypes, images, labels, decoded_logits, losses, global_step):
    """Build accuracy and per-class precision, recall and F1 ops.

    Args:
      labels: one-hot labels tensor, reshaped to [num_pixels, num_classes].
      decoded_logits: dict whose 'logits' entry is reshaped the same way.
    Returns:
      A list of (name, tensor) pairs with the evaluation metrics.
    """
    eval_list = []
    num_classes = hypes['arch']['num_classes']
    logits = tf.reshape(decoded_logits['logits'], (-1, num_classes))
    labels = tf.reshape(labels, (-1, num_classes))
    pred = tf.argmax(logits, axis=1)
    y = tf.argmax(labels, axis=1)
    Prec = []
    Rec = []
    f1 = []
    for i in range(num_classes):
        # Float masks, so that the divisions below are not integer divisions.
        pred_i = tf.cast(tf.equal(pred, i), tf.float32)
        true_i = tf.cast(tf.equal(y, i), tf.float32)
        tp = tf.reduce_sum(pred_i * true_i)
        fp = tf.reduce_sum(pred_i * (1 - true_i))
        # False negatives: pixels of class i that were predicted as another class.
        fn = tf.reduce_sum((1 - pred_i) * true_i)
        # These are NaN for a class that is neither present nor predicted.
        Prec.append(tp / (tp + fp))
        Rec.append(tp / (tp + fn))
        f1.append((2 * Prec[-1] * Rec[-1]) / (Prec[-1] + Rec[-1]))

    accuracy = tf.reduce_mean(tf.cast(tf.equal(y, pred), tf.float32))

    tf.summary.scalar("Accuracy", accuracy)
    tf.summary.scalar("SoftIU", losses['SoftIU'])
    # One set of summaries per non-background class (c1, c2, ...).
    for i in range(1, num_classes):
        tf.summary.scalar("c%d_Precision" % i, Prec[i])
        tf.summary.scalar("c%d_Recall" % i, Rec[i])
        tf.summary.scalar("c%d_F1_Score" % i, f1[i])

    eval_list.append(('Acc. ', accuracy))
    eval_list.append(('xentropy', losses['xentropy']))
    eval_list.append(('SoftIU', losses['SoftIU']))
    eval_list.append(('weight_loss', losses['weight_loss']))
    Prec = tf.convert_to_tensor(Prec)
    Rec = tf.convert_to_tensor(Rec)
    f1 = tf.convert_to_tensor(f1)
    eval_list.append(('Overall Precision ', tf.reduce_mean(Prec)))
    eval_list.append(('Overall Recall', tf.reduce_mean(Rec)))
    eval_list.append(('Overall F1 score ', tf.reduce_mean(f1)))

    return eval_list

StuvX commented 7 years ago

Hi Bendidi, regarding your script edit: it seems to me that the combined ground truth (valid_gt) will be sent to the evaluator, so the network will learn to identify c1 | c2 | c3 vs. c0? Have you had success training with your method?

obendidi commented 7 years ago

@StuvX this is just a script for evaluation, so it cannot influence training. It is only used to check how the trained model performs on the validation set at different steps of the training. The first script that I changed will not help much with visualisation, as I just changed it the lazy way so that it doesn't crash. The second script is where I calculate the overall accuracy and the recall and precision of each class, which is pretty helpful for evaluation.

psuff commented 6 years ago

Has anyone succeeded in running these custom multiclass KittiSeg versions? I get a bunch of errors. @shivam-kotwalia @bendidi

obendidi commented 6 years ago

@psuff what kind of errors do you get?

psuff commented 6 years ago

@bendidi could you explain how to make the evaluation script work properly? I have tried modifying kitti_eval.py and fcn.py from @shivam-kotwalia's custom KittiSeg following your suggestions, but I can't get the evaluation to run. Is there anything else to modify? Can you provide your configuration?

obendidi commented 6 years ago

@psuff check my fork, I put my code in there. I didn't test it yet, I'll do it some time later:

https://github.com/bendidi/KittiSeg

ywangeq commented 6 years ago

@bendidi Sorry, it didn't work, many problems came up. I am trying to fix them one by one.

ywangeq commented 6 years ago

@bendidi, first, in Python assignment is by reference, so you cannot modify the color dictionary directly; you need to create a new one first. Also, the output from sess is a list, not an array, so you cannot reshape it directly. I also don't quite understand your paint function, because it is quite troublesome for me to get the color and overlay image outputs in the terminal. If you have time, please explain what your paint function is meant to do. Thanks!

obendidi commented 6 years ago

@ywangeq The code I'm working with (and have tested) is heavily modified from KittiSeg, and I can't really share it. The fork that I made contains just the changes necessary for basic multi-class segmentation to work (kind of pseudo-code). I didn't test it and don't know if there are any mistakes (I'm planning to test and correct it when I have the time... and a free GPU).

About the paint function: it uses a dictionary to map each class of the segmented image to an RGB color. In short, after doing an argmax on the output of the network you get an image with class numbers instead of RGB pixel arrays. The paint function maps each of these classes to an RGB color defined in a Python dict like so: { 1: np.array([255,0,0]), 0: np.array([0,0,0]), 2: np.array([0,255,0]) }
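For readers who want to see the idea in code, a paint function along these lines could look as follows. This is a sketch written from the description above, not the code from the fork. The name paint and the toy input are assumptions.

```python
import numpy as np


def paint(class_map, color_map):
    """Map an [H, W] array of class ids to an [H, W, 3] uint8 RGB image."""
    painted = np.zeros(class_map.shape + (3,), dtype=np.uint8)
    for class_id, color in color_map.items():
        # Boolean mask selects every pixel of this class at once.
        painted[class_map == class_id] = color
    return painted


color_map = {
    0: np.array([0, 0, 0]),
    1: np.array([255, 0, 0]),
    2: np.array([0, 255, 0]),
}

# Toy network output of shape [1, 3, 3]: pixel j is most confident in class j.
probabilities = np.eye(3).reshape(1, 3, 3)
class_map = np.argmax(probabilities, axis=2)
painted = paint(class_map, color_map)
print(painted[0])
```

Because paint writes into a fresh array and only reads from color_map, the dictionary itself is never modified.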

rodolfolotte commented 6 years ago

@bendidi, to complement your last comment: I downloaded, trained and tested your fork and it did not work.

Also, the evaluation script described in your comment was incorrect because of the dimensions of the cnn_image array:

 N, P, pos, neg = seg.evalExp(hypes,colors[i], cnn_image[:,:,i+1],
IndexError: too many indices for array

The other one was about the SoftIU measure. The lines tf.summary.scalar("SoftIU", losses['SoftIU']) and eval_list.append(('SoftIU', losses['SoftIU'])) have to be removed.

I'm still investigating what could possibly be wrong. Recently I was looking at TensorBoard (on the Images tab) and noticed that only one class was being used. Now I'm trying to figure out if there is something wrong with my input files.

Thank you!

obendidi commented 6 years ago

@rodolfolotte, the scripts I've made are full of bugs, so it's normal that they don't work. I'll be fixing them shortly. If you want, you could share your dataset with 8 classes with me, so I can test on it and try to check why it only detects 3 classes.

rodolfolotte commented 6 years ago

Hi @bendidi!! Yes yes, sure! No worries! I'm sending you the dataset, it would be great to have feedback from another multi-class version!

kshitijagrwl commented 6 years ago

I've got the code working for the Cityscapes dataset with 19 classes, thanks to the modifications provided by bendidi. However, I'm facing some issues with eval: I need to move that to the mIoU metric. My question is: will the cross-entropy loss be good enough for training?
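For the mIoU part, one common approach is to accumulate a confusion matrix and derive the per-class IoU from it. The sketch below is plain NumPy and independent of the KittiSeg eval code. The function names are hypothetical, and classes that appear in neither the ground truth nor the prediction are simply left out of the mean.

```python
import numpy as np


def confusion_matrix(gt_class, pred_class, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    index = gt_class.ravel() * num_classes + pred_class.ravel()
    counts = np.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Return (per-class IoU with NaN for absent classes, mean over present ones)."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    iou = np.full(conf.shape[0], np.nan)
    iou[present] = tp[present] / union[present]
    return iou, float(np.mean(iou[present]))


gt_class = np.array([[0, 0], [1, 2]])
pred_class = np.array([[0, 1], [1, 2]])
conf = confusion_matrix(gt_class, pred_class, 3)
iou, miou = mean_iou(conf)
print(iou, miou)
```

For a whole validation set, the confusion matrices of the individual images can be summed before calling mean_iou.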

StuvX commented 6 years ago

Eval using mIoU or micro-averaged F-score should be OK. For training I switched to sparse cross entropy using the tf function instead of Marvin's implementation, and it seemed to work.

kshitijagrwl commented 6 years ago

@StuvX so you train each label individually using sparse_softmax_cross_entropy_with_logits? I wanted to do it all together using softmax_cross_entropy_with_logits, however the loss just explodes and becomes huge within 500-700 steps.

StuvX commented 6 years ago

I only have four labels, including background, so that helps.

In kitti_multiloss.py I changed the _compute_cross_entropy_mean function to use tf.nn.sparse_softmax_cross_entropy_with_logits.

Honestly, I can't recall if it's looping through the classes and formulating a loss for each or doing it all at once. I think it's doing it all at once, but I'll have to go back through and check the code.
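As a reference point for the sparse vs. dense question, the following NumPy sketch mirrors what the two kinds of softmax cross entropy compute. It is not the modified kitti_multiloss.py, and the function names are made up for illustration. The sparse variant takes one integer class id per pixel and handles all classes in a single call, and it gives the same per-pixel loss as the dense variant with one-hot labels.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def dense_cross_entropy(logits, one_hot):
    """Per-pixel loss from one-hot labels of shape [num_pixels, num_classes]."""
    return -np.sum(one_hot * log_softmax(logits), axis=1)


def sparse_cross_entropy(logits, class_ids):
    """Per-pixel loss from integer class ids of shape [num_pixels]."""
    rows = np.arange(logits.shape[0])
    return -log_softmax(logits)[rows, class_ids]


logits = np.array([[2.0, 0.5, -1.0, 0.0],
                   [0.1, 0.2, 3.0, -0.5]])
class_ids = np.array([0, 2])
one_hot = np.eye(4)[class_ids]

dense = dense_cross_entropy(logits, one_hot)
sparse = sparse_cross_entropy(logits, class_ids)
print(dense, sparse)
```

Working from the logits, as both functions do here, also avoids taking the log of a softmax value that has underflowed to zero.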