VITA-Group / FasterSeg

[ICLR 2020] "FasterSeg: Searching for Faster Real-time Semantic Segmentation" by Wuyang Chen, Xinyu Gong, Xianming Liu, Qian Zhang, Yuan Li, Zhangyang Wang
MIT License

Training with custom data with different resolution to cityscapes dataset #50

Closed DanielEliasib closed 3 years ago

DanielEliasib commented 3 years ago

Hi, I'm interested in training this with a custom dataset I created from scratch. So far I have tried following the steps here, but I can't get past the pretrain step.

The output I'm getting is:

[image: screenshot of the error output]

The error doesn't tell me much, just that it occurs inside model_search.py at the moment the logits are calculated, which makes me believe the problem might be with the input. I'm wondering whether the resolution of the images matters at this point. My images are 576x640, but I have noticed that the crop size in the config always keeps the aspect ratio of the Cityscapes images. Could that be the problem? I also noticed that in train_search.py, when the model is created, the input has the Cityscapes resolution hardcoded. Should I change it to my resolution?

One extra question: for testing purposes my dataset only has one object. Should the number of classes be 1, or does the background count as another class?

Thank you for your work.

chenwydj commented 3 years ago

Hi @DanielEliasib,

Thank you for your interest in our work!

  1. I agree that these error messages do not tell much, and I also suspect this issue is related to the size of the feature map. I would encourage you to set a breakpoint at search/operations.py:127 and check the feature map size and the type of conv operator that raises this error. The most common cause is a training patch size (height/width) that is not divisible by 64, but if you use a size of 576x640 during training, this should not be a problem (576 = 9 × 64 and 640 = 10 × 64). Also, please remember to update your patch size and image size in the config file.
  2. You do not have to keep the aspect ratio during training.
  3. For your binary segmentation, yes, I suppose you can do that.
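
As a rough illustration of the divisibility check mentioned in point 1, here is a minimal Python sketch. The helper names `check_patch_size` and `round_down_to_multiple` are hypothetical and not part of the FasterSeg code; the only thing taken from the discussion above is the divisor of 64.

```python
def check_patch_size(height, width, divisor=64):
    """Return the names of the dimensions that are NOT divisible by `divisor`.

    Hypothetical helper for illustration; not part of FasterSeg.
    """
    problems = []
    for name, value in (("height", height), ("width", width)):
        if value % divisor != 0:
            problems.append(name)
    return problems


def round_down_to_multiple(value, divisor=64):
    """Largest multiple of `divisor` that is less than or equal to `value`."""
    return (value // divisor) * divisor


# 576 = 9 * 64 and 640 = 10 * 64, so nothing is reported.
print(check_patch_size(576, 640))    # []

# 1000 is not a multiple of 64, so the height is flagged.
print(check_patch_size(1000, 640))   # ['height']

# A candidate patch height that would pass the check.
print(round_down_to_multiple(1000))  # 960
```

If a dimension is flagged, one option is to pick a crop size that is a multiple of 64 (for example, the value from `round_down_to_multiple`) and use it as the patch size in the config file.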

Hope these answers could help!