loicland / superpoint_graph

Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs
MIT License
745 stars · 214 forks

Perhaps a mistake in class index? Semantic8, sema3d_dataset.py #63

Closed leemengwei closed 5 years ago

leemengwei commented 5 years ago

Is there a mistake in the class index? The Semantic8 dataset consists of {1: man-made terrain, 2: natural terrain, 3: high vegetation, 4: low vegetation, 5: buildings, 6: hard scape, 7: scanning artefacts, 8: cars}, plus an additional label {0: unlabeled points}, as documented on its website. Yet in learning/sema3d_dataset.py, line 66, the class index turns out to be {0: 'terrain_man', 1: 'terrain_nature', 2: 'veget_hi', 3: 'veget_low', 4: 'building', 5: 'scape', 6: 'artefact', 7: 'cars'}, all shifted forward by one. Is it supposed to be like that? Thank you!

leemengwei commented 5 years ago

The latter refers to the class index, so it is one less. Problem solved for my sparse automotive lidar data, so I'm closing this issue. Further details, in case anyone is interested: my dataset uses classid = {"DontCare": 0, "cyclist": 1, "tricycle": 2, "smallMot": 3, "bigMot": 4, "pedestrian": 5, "crowds": 6, "unknown": 7}, yet the 'DontCare' class takes up about 95% of the whole dataset. At first I wrongly treated 'DontCare' as the background class in this SPG project; almost all points were then predicted as 'smallMot' (the second most dominant class) and the loss dropped to zero, giving me a meaningless model (because SPG does not compute a loss for the background class). I then made 'DontCare' an additional class, say class 8, which forces the network to predict 'DontCare' as an actual class whose loss is counted, and it works fine.
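For anyone with a similar dataset, here is a minimal sketch of the remapping described above. Nothing here is from the repository: the `OLD_TO_NEW` dictionary and `remap_labels` helper are hypothetical names, and the mapping assumes you want 'DontCare' moved from index 0 to a real class (8) so it contributes to the loss, with 0 kept free for genuinely unlabelled points.

```python
import numpy as np

# Sketch only: move 'DontCare' (originally 0) to a real class index (8)
# so it is supervised, and leave 0 reserved for unlabelled points.
OLD_TO_NEW = {0: 8, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7}

def remap_labels(labels):
    """labels: (N,) integer array of the original class ids."""
    remapped = np.empty_like(labels)
    for old_id, new_id in OLD_TO_NEW.items():
        remapped[labels == old_id] = new_id
    return remapped
```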

luoxiaoliaolan commented 5 years ago

@leemengwei Have you solved the mistake in the class index? I ran into it as well, and as a result the 0th and 1st categories were inseparable (see screenshot). The colors I set for the classes are ("ground": 0, "building": 1, "tree": 2, "pole": 3), but the result came out with the wrong colors (see screenshot). Do you know where I should make changes to solve this problem? Feel free to email me at lybluoxiao@qq.com.

loicland commented 5 years ago

@leemengwei sorry I seem to have missed your issue earlier.

There is indeed a discrepancy between how the classes are indexed in the training data and how they are predicted; I can see how that would be confusing.

When reading the data, class 0 is reserved for unlabelled points. At inference, every superpoint is assigned a semantic class, so none ends up with the unlabelled class. We therefore shift the classes by one, so that the first real class (1 in the data) is associated with 0.

This 0 class is not meant as a background class, but for data in which only small regions are annotated, or in which the annotation is on a subsampled version of the full graph. If you put all your background in class 0, then at inference time the network will see all those superpoints of unknown shape (unlabelled data are excluded from the supervision) and will try to classify them as the closest real class. Using an actual class for the background is indeed the way to go.
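To make the convention above concrete, here is a tiny illustration (my own helper names, assuming the convention described here: 0 = unlabelled in the data, real classes 1..8, prediction indices 0..7):

```python
def data_label_to_train_index(label):
    # 0 (unlabelled) carries no supervision; real class k in the data
    # becomes index k - 1 for training and prediction
    return None if label == 0 else label - 1

def prediction_to_data_label(pred_index):
    # shift back so a predicted index lines up with the original labelling
    return pred_index + 1
```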

loicland commented 5 years ago

@luoxiaoliaolan

hi,

your problem is that you set the class ground to 0 (i.e. unannotated), so it is ignored when supervising. Furthermore, this shifts all the classes by one, meaning that the subsequent classes are incorrectly labelled (off by 1).

Instead, the function you use to read your data should associate the following labels:

"ground": 1, "building": 2, "tree": 3, "pole":4

and it should work! Let me know if you still encounter problems.
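As a sketch of what that reader could do (the `NAME_TO_LABEL` dictionary and helper below are illustrative, not part of the repository), the key point is simply that 0 stays reserved for unannotated points:

```python
import numpy as np

# Illustrative name -> label mapping; 0 is kept for unannotated points.
NAME_TO_LABEL = {"ground": 1, "building": 2, "tree": 3, "pole": 4}

def labels_from_names(class_names):
    """Convert a list of per-point class names to integer labels."""
    return np.array([NAME_TO_LABEL[name] for name in class_names], dtype=np.int64)
```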

luoxiaoliaolan commented 5 years ago

@loicland Thank you very much! I would like to know whether the modification you mentioned means changing the category labels in the original files (XXX.labels) or changing them when the program reads the data. My dataset looks like this (screenshots of the point files):

Labels file (screenshot):

loicland commented 5 years ago

You can either change your .labels files so that your first real class is labelled 1, or add 1 to the labels in the read_my_dataset function in provider.py if you coded your own (provided you don't have unlabelled data).
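A hedged sketch of the second option (the function below is illustrative; the one-integer-per-point .labels format is an assumption, and the shift only applies if none of your points are unlabelled):

```python
import numpy as np

def read_labels_shifted(label_file):
    # assumed format: one integer label per point, starting at 0
    raw = np.loadtxt(label_file, dtype=np.int64)
    return raw + 1  # shift so that 0 remains available for unlabelled points
```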

leemengwei commented 5 years ago

There is no mistake in the author's code; the indexing is just a little misleading, but it's fine. My mistake came from my misunderstanding of the unlabeled class: in my dataset, the background points are labeled as a 'background' class rather than left unlabeled, which is what caused all points to be classified as class one later on. Based on your description, I don't think you made the same mistake as I did. It seems to be just a lack of training. Perhaps you could try training on only one clip of data until it overfits (loss -> 0), and then test on that same clip to see whether everything comes out the way you expect (close to ground truth).


luoxiaoliaolan commented 5 years ago

@leemengwei Thanks for the suggestion. Could we exchange contact details and discuss this together? My email: lybluoxiao@qq.com