alexgkendall / SegNet-Tutorial

Files for a tutorial to train SegNet for road scenes using the CamVid dataset
http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

Is the CamVid training data provided with the repo wrong? #51

Open 5argon opened 8 years ago

5argon commented 8 years ago

In test_segmentation_camvid.py

 Sky = [128,128,128]
 Building = [128,0,0]
 Pole = [192,192,128]
 Road_marking = [255,69,0]
 Road = [128,64,128]
 Pavement = [60,40,222]
 Tree = [128,128,0]
 SignSymbol = [192,128,128]
 Fence = [64,64,128]
 Car = [64,0,128]
 Pedestrian = [64,64,0]
 Bicyclist = [0,128,192]
 Unlabelled = [0,0,0]

 label_colours = np.array([Sky, Building, Pole, Road, Pavement, Tree, SignSymbol, Fence, Car, Pedestrian, Bicyclist, Unlabelled])

You did not use "Road_marking" in the benchmark, and I notice that Road_marking has been removed from all of the test data as well. Is there a reason for this? Does that mean a network trained on this data can never segment Road_marking?

If Road_marking has been removed from the training data, and you said the Driving Web Demo was trained with the same method, why can the Driving Web Demo segment Road_marking (the bright orange color)? https://github.com/alexgkendall/SegNet-Tutorial/blob/master/Example_Models/segnet_model_zoo.md

I also noticed Scripts/camvid12.png and Scripts/camvid11.png. What is the difference between them? Scripts/camvid11.png has no bright orange (Road_marking) and is used for the Bayesian version of SegNet in the tutorial. Could you clarify why?

I also noticed "Unlabelled" (black color) in the array. In the test data, the areas that should be unlabelled have the value 14. Why 14? Isn't the total number of outputs (according to segnet_train.prototxt) 11? Why does the label value exceed 11?

Also, the README.md says to use Scripts/camvid12.png with segnet_weights_driving_webdemo.caffemodel, which means segnet_weights_driving_webdemo.caffemodel can segment Road_marking. Does that mean that if I train from scratch following http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html I won't end up with the same capability, since I am lacking Road_marking?

One more doubt: in the data I saw areas with "Bike" labelled as 12 and areas with "Tree" labelled as 4, but from the array above, Bike should be 10 and Tree should be 5. Is the training data wrong?

alexgkendall commented 8 years ago

Hey @5argon ,

The research benchmark we used omitted the road_marking class, so we merged it with road for these 11-class models. For the web demo model, we included road_marking.

Scripts/camvid12.png is for models with road_marking and Scripts/camvid11.png is for those without.
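
For concreteness, here is a minimal sketch of that merge (not the repo's actual preprocessing script; the 12-class indices below, with Road_marking at 3 and Road at 4, are assumed from the camvid12.png legend order and should be checked against your data):

    import numpy as np

    NUM_LABELS_12 = 13                  # 12 classes + Unlabelled
    ROAD_MARKING_12, ROAD_12 = 3, 4     # assumed 12-class indices

    # Lookup table from 12-class ids to 11-class ids: ids above Road_marking
    # shift down by one, and Road_marking itself maps onto Road.
    lut = np.arange(NUM_LABELS_12)
    lut[ROAD_MARKING_12 + 1:] -= 1
    lut[ROAD_MARKING_12] = ROAD_12 - 1  # Road's index in the 11-class layout

    def merge_road_marking(label_12):
        # label_12: HxW array of 12-class label ids; returns 11-class ids.
        return lut[np.asarray(label_12, dtype=np.intp)]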

Cheers, Alex

Jinming-Su commented 7 years ago

I encountered this problem today, and I think I have solved it. The explanation is below:

    import numpy as np

    Sky = [128,128,128]
    Building = [128,0,0]
    Pole = [192,192,128]
    Road_marking = [255,69,0]
    Road = [128,64,128]
    Pavement = [60,40,222]
    Tree = [128,128,0]
    SignSymbol = [192,128,128]
    Fence = [64,64,128]
    Car = [64,0,128]
    Pedestrian = [64,64,0]
    Bicyclist = [0,128,192]
    Unlabelled = [0,0,0]

    # for the 11-class ground truth (Road_marking merged into Road)
    label_colours1 = np.array([Sky, Building, Pole, Road, Pavement,
                               Tree, SignSymbol, Fence, Car, Pedestrian,
                               Bicyclist, Unlabelled])

    # for the 12-class prediction (e.g. the web demo model, which keeps Road_marking)
    label_colours2 = np.array([Sky, Building, Pole, Road_marking, Road, Pavement,
                               Tree, SignSymbol, Fence, Car, Pedestrian,
                               Bicyclist, Unlabelled])

I think this should make the difference clear.
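
As a usage note, here is a hedged sketch of how such a palette is typically applied to turn a predicted label map into an RGB visualisation; the HxW class-id input is an assumption about the model output, not code taken from the repo's script:

    import numpy as np

    def colourise(label_map, palette):
        # label_map: HxW array of class ids (e.g. argmax over the softmax output);
        # palette: one of the label_colours arrays above, shape (num_classes, 3).
        label_map = np.asarray(label_map, dtype=np.intp)
        return palette[label_map].astype(np.uint8)   # HxWx3 RGB image

    # e.g. rgb = colourise(prediction, label_colours2)    # 12-class web demo model
    #      rgb = colourise(ground_truth, label_colours1)  # 11-class ground truth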