ifnspaml / SGDepth

[ECCV 2020] Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
MIT License

Semantic Labels for The Training #25

Open csBob123 opened 2 years ago

csBob123 commented 2 years ago

Could you tell me whether you used the KITTI semantic labels for training? I guess you only used the semantic labels from Cityscapes.

Thank you for your work.

klingner commented 2 years ago

Hi, I did not use the KITTI semantic labels for training but only the ones from Cityscapes.

csBob123 commented 2 years ago

Hi, thank you for your reply. Cityscapes has 19 classes, but I found that the final output of 'segdecoder' has 20 classes (20 channels).

Could you explain the reason for this and its impact?

Thank you for your attention.

klingner commented 2 years ago

Hi,

the 20th class is the background class. It is effectively not trained, because its weight in the weighted cross-entropy loss is set to 0. For evaluation, only the 19 classes of the Cityscapes dataset are considered. In theory, it should not really matter whether you train 19 classes or 19 classes + 1 background class, as long as the weight of the background class in the loss is set to 0.
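To illustrate why a zero weight means the background class is not trained, here is a minimal sketch of a per-pixel weighted cross-entropy in plain Python. It is not the SGDepth loss code: the names `CLASS_WEIGHTS`, `BACKGROUND_INDEX`, and `weighted_cross_entropy` are hypothetical, and placing background at index 19 is an assumption based on "the 20th class" above.

```python
import math

NUM_CLASSES = 20        # 19 Cityscapes training classes + 1 background channel
BACKGROUND_INDEX = 19   # assumption: background is the last (20th) channel
CLASS_WEIGHTS = [1.0] * 19 + [0.0]  # background weight is 0


def log_softmax(logits):
    """Numerically stable log-softmax over one pixel's class logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def weighted_cross_entropy(pixel_logits, pixel_labels, weights):
    """Weighted cross-entropy, normalized by the summed weights of the targets."""
    total, norm = 0.0, 0.0
    for logits, label in zip(pixel_logits, pixel_labels):
        w = weights[label]
        total += -w * log_softmax(logits)[label]
        norm += w
    return total / norm if norm > 0 else 0.0


# Two example pixels: one labeled "car" (train ID 13), one labeled background.
car_logits = [0.1 * i for i in range(NUM_CLASSES)]
bg_logits = [0.5] * NUM_CLASSES

loss_with_bg = weighted_cross_entropy(
    [car_logits, bg_logits], [13, BACKGROUND_INDEX], CLASS_WEIGHTS
)
loss_without_bg = weighted_cross_entropy([car_logits], [13], CLASS_WEIGHTS)
```

Here `loss_with_bg` and `loss_without_bg` are the same value, so pixels labeled as background add nothing to the loss. Note that the background logit still appears in the softmax denominator for the labeled pixels, so the 20th channel is not completely disconnected from training, only never used as a target.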

bansilol commented 2 years ago

Hi @klingner, I have a small question about which part of the Cityscapes dataset to download. As far as I understand, we have to use gtFine_trainvaltest.zip. But it has 30 classes, and @csBob123 said the model is trained on 20. Is that okay, or do I need to change anything in the code in order to do the segmentation training?

Thanks in advance :)

klingner commented 2 years ago

Hi, the Cityscapes dataset has 30 classes, but only 19 of them are used for training; the other classes are set to background during training. Effectively, I always use the 19 classes as defined by Cityscapes.
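As a rough sketch of what "set to background" could look like, the snippet below maps raw Cityscapes label IDs to the 19 training IDs and sends everything else to a background index. The ID table follows the standard `labels.py` from cityscapesScripts; the function name `to_train_ids` and the choice of 19 as the background index are assumptions for illustration, not the actual SGDepth data loader (cityscapesScripts itself uses 255 for ignored classes).

```python
# Raw Cityscapes label ID -> training ID for the 19 evaluated classes.
CITYSCAPES_ID_TO_TRAIN_ID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    12: 3,   # wall
    13: 4,   # fence
    17: 5,   # pole
    19: 6,   # traffic light
    20: 7,   # traffic sign
    21: 8,   # vegetation
    22: 9,   # terrain
    23: 10,  # sky
    24: 11,  # person
    25: 12,  # rider
    26: 13,  # car
    27: 14,  # truck
    28: 15,  # bus
    31: 16,  # train
    32: 17,  # motorcycle
    33: 18,  # bicycle
}

BACKGROUND_TRAIN_ID = 19  # assumption: all remaining classes share one background index


def to_train_ids(label_ids):
    """Map a sequence of raw label IDs to training IDs, defaulting to background."""
    return [CITYSCAPES_ID_TO_TRAIN_ID.get(i, BACKGROUND_TRAIN_ID) for i in label_ids]


# Example row of a gtFine label image: road, car, unlabeled (0), static (4).
example = to_train_ids([7, 26, 0, 4])
```

With a mapping like this, the labels from gtFine_trainvaltest.zip can be used as downloaded; the reduction to 19 classes plus background happens when the labels are loaded.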

bansilol commented 2 years ago

Okay. Thanks for clearing up the query. :)