ifnspaml / SGDepth

[ECCV 2020] Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
MIT License

New segmentation classes #8

Closed: Ale0311 closed this issue 1 year ago

Ale0311 commented 3 years ago

Hello!

I was wondering if there is any way I could change the classes used for segmentation. I would like to use your model for some indoor depth prediction, where there are no bikes, cars, signs, etc.

Thank you! Looking forward to your response!

klingner commented 3 years ago

Hello!

For this purpose, you should take a look at the file loaders/segmentation/train.py, which defines the available loaders for semantic segmentation (currently only Cityscapes, although we have also tried BDD, Mapillary, GTA and more). Basically, you need to define your own loader function as an alternative to cityscapes_train(resize_height, resize_width, crop_height, crop_width, batch_size, num_workers), taking the same arguments. I think you have two options:

  1. You can take a look at our dataloader repository and find out how to integrate your dataset into our dataloader: https://github.com/ifnspaml/IFN_Dataloader (maybe not the fastest way, but very efficient later on if you intend to use many different datasets such as BDD, Mapillary, GTA, SYNTHIA and more).
  2. Another option (maybe a little easier) would be to define your own custom loader function and make sure it returns a loader that yields data in the same format as our dataloader (mainly the color image and the segmentation mask in trainid format).
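As a rough illustration of option 2, here is a minimal, framework-free sketch of a loader function with the same argument list as cityscapes_train. Everything in it is an assumption made for illustration: the name indoor_train, the placeholder load_samples, the raw label ids, and the batch keys "color", "segmentation" and "num_classes". The actual key names and data types expected by the training code should be checked against the existing Cityscapes loader. The value 255 for ignored pixels follows the Cityscapes trainid convention.

```python
from typing import Dict, Iterator, List

Mask = List[List[int]]

# Hypothetical raw label ids of an indoor dataset mapped to contiguous train ids.
RAW_TO_TRAINID = {3: 0, 7: 1, 12: 2}  # e.g. wall, floor, chair (made-up ids)
IGNORE_TRAINID = 255  # Cityscapes convention for pixels excluded from training
NUM_CLASSES = len(RAW_TO_TRAINID)


def to_trainid(mask: Mask) -> Mask:
    """Replace every raw label id by its train id; unknown ids become 255."""
    return [[RAW_TO_TRAINID.get(v, IGNORE_TRAINID) for v in row] for row in mask]


def load_samples() -> List[Dict[str, object]]:
    """Placeholder for reading image/mask pairs from disk (synthetic data here)."""
    return [
        {"color": [[(0, 0, 0), (255, 255, 255)]], "segmentation": [[3, 7]]},
        {"color": [[(10, 20, 30), (40, 50, 60)]], "segmentation": [[12, 99]]},
        {"color": [[(1, 2, 3), (4, 5, 6)]], "segmentation": [[7, 7]]},
    ]


def indoor_train(resize_height, resize_width, crop_height, crop_width,
                 batch_size, num_workers) -> Iterator[Dict[str, object]]:
    """Same argument list as cityscapes_train; yields batches as dicts.

    Resizing, cropping and parallel loading are omitted in this sketch, so
    the size and worker arguments are accepted but not used.
    """
    samples = load_samples()
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        yield {
            "color": [s["color"] for s in chunk],
            "segmentation": [to_trainid(s["segmentation"]) for s in chunk],
            "num_classes": NUM_CLASSES,
        }
```

In the real code base, the returned object would more likely be a PyTorch DataLoader built on a dataset class; the sketch only shows the function signature and the remapping of masks to train ids.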

The information about the number of classes is also added inside the loader function in line 30 with tf.AddKeyValue('num_classes', num_classes). However, this information is not yet used to initialize the model; instead, the number of classes is hard-coded in models/networks/multi_res_output.py in line 53. The fastest way is probably to just adapt that line to your number of classes. We did not really think about changing the number of classes until now, but it would be a nice addition to the code.
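One way to make the class count flexible is to pass it into the model constructor instead of writing it as a literal. The following is only a framework-free sketch of that idea, not the repository's code: SegmentationHead, build_head and the loader_info dict are hypothetical names, and the head is a plain per-pixel linear classifier (which is what a 1x1 convolution computes). In the actual PyTorch model, the same value would presumably set the number of output channels of the segmentation output layer.

```python
import random
from typing import List


class SegmentationHead:
    """Per-pixel linear classifier (what a 1x1 convolution computes)."""

    def __init__(self, in_channels: int, num_classes: int, seed: int = 0):
        rng = random.Random(seed)
        # One weight row per class: the class count is a parameter, not a literal.
        self.weights = [[rng.uniform(-1, 1) for _ in range(in_channels)]
                        for _ in range(num_classes)]

    def __call__(self, pixel: List[float]) -> List[float]:
        """Return one logit per class for a single feature vector."""
        return [sum(w * x for w, x in zip(row, pixel)) for row in self.weights]


def build_head(loader_info: dict, in_channels: int) -> SegmentationHead:
    """Size the head from the class count that the loader already provides."""
    return SegmentationHead(in_channels, loader_info["num_classes"])
```

With this structure, switching datasets only changes the value the loader reports, for example build_head({"num_classes": 13}, in_channels=16) for a hypothetical 13-class indoor dataset.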

I hope this gives a rough sketch of how you could proceed! If you make the number of classes flexible, I would be happy to consider adding this to the code as a useful extension.

Ale0311 commented 3 years ago

Hello!

Thank you very much for your response and explanation. I'll try using your advice and if/when I have a working version I'll be more than happy to let you know. Thanks again! 😊

Ale0311 commented 3 years ago

Hello!

Can you please tell me what basic_files.json is and how I can generate it for the PASCAL dataset? I believe this is used for training the segmentation model, right?

Thank you!

klingner commented 3 years ago

Hi, basic_files.json stores the information about which color image belongs to which segmentation mask. Information on how to generate this file is available in this repository: https://github.com/ifnspaml/IFN_Dataloader/tree/master/dataloader/file_io.

The segmentation model, however, is not trained on the PASCAL dataset but on the Cityscapes dataset. The necessary basic_files.json for Cityscapes can be downloaded directly from the repository. If you would like to use the PASCAL dataset, you can generate the file by following the description in the link above. If you send me an email, I can also send you my pre-created file for this dataset. I would, however, expect some issues when using this dataset for segmentation training: the PASCAL dataset does not consist solely of street scenes, so there might be a domain shift relative to the street scenes of the KITTI dataset.