VITA-Group / FasterSeg

[ICLR 2020] "FasterSeg: Searching for Faster Real-time Semantic Segmentation" by Wuyang Chen, Xinyu Gong, Xianming Liu, Qian Zhang, Yuan Li, Zhangyang Wang
MIT License

How to train FasterSeg with customized labels? #46

Closed khaep closed 3 years ago

khaep commented 3 years ago

Hello, it's very interesting to train and use FasterSeg with my own custom data. To get information about that, I read the comments in this issue's description.

Based on the description linked above, I did the following steps:

  1. prepare my own labels in labels.py
  2. create my own annotated pictures with createTrainIdLabelImgs.py, making sure that height and width are divisible by 32
  3. prepare file name lists as "dataset files" for data loading (see examples here)
  4. adjust the config file with the number of classes
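
For step 3, a minimal sketch of generating such a file-name list might look like the following. The paths and the tab separator are assumptions on my side; compare with the example lists shipped in the repo for the exact format.

```python
# Hedged sketch: write a "dataset file" listing image/label path pairs,
# one pair per line. Paths and the tab separator are assumptions --
# check the example dataset files in the FasterSeg repo.
def write_file_list(pairs, out_path):
    with open(out_path, "w") as f:
        for img_path, label_path in pairs:
            f.write(f"{img_path}\t{label_path}\n")

pairs = [
    ("images/train/0001.png", "labels/train/0001_trainIds.png"),
    ("images/train/0002.png", "labels/train/0002_trainIds.png"),
]
write_file_list(pairs, "train.txt")
```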

Are there any other points in the FasterSeg repository which I need to adjust for using customized labels?

For example: cityscapes.py, camvid.py, and bdd.py contain methods like get_class_colors() and get_class_names() that return the colors or class names of the Cityscapes data.

Is it necessary to add the customized labels to these methods? What purpose do these methods serve?

It would be great if you could give me some hints to answer these questions, so I can run the training process with customized labels.

Best regards

chenwydj commented 3 years ago

Hi @KHeap25 !

Thank you very much for your interest in our work!

First, I have to admit that I have never run this repo on a brand-new dataset from scratch. What I can do is try to help you with any bugs or problems.

Some comments to your steps:

  1. As I recall, the ground truth files in cityscapes/bdd/camvid are single-channel image files of integers, where each integer represents the class index of a pixel. That means as long as you prepare your ground truth files in the same way, you can run our code without writing your own labels.py and createTrainIdLabelImgs.py (unless you want to make your dataset as standard as Cityscapes).
  2. During training, please make the height and width of the cropped patches divisible by 64. Sorry, I made mistakes in my responses to previous issues (I have already corrected them).
  3. It is recommended to implement your get_class_names() function, as it is used to print the results to the terminal (see here for usage). The function get_class_colors() is only used when you want to generate predictions in color (something like the teaser images at the top of our Readme) (see here for usage).
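
As a sketch, these two dataset hooks for a hypothetical 3-class custom dataset could look like this. The class names and colors below are invented examples, not FasterSeg defaults.

```python
class CustomDataset:
    """Hedged sketch of the two dataset hooks discussed above.
    Class names and colors are invented for illustration."""

    @classmethod
    def get_class_names(cls):
        # Used when printing per-class results to the terminal.
        return ["background", "road", "obstacle"]

    @classmethod
    def get_class_colors(cls):
        # Used only when rendering colored prediction images.
        return [[0, 0, 0], [128, 64, 128], [220, 20, 60]]
```

The length of both lists should match the number of classes set in the config file.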

Hope that helps.

khaep commented 3 years ago

Thanks @chenwydj for your reply!

In relation to your recommendation above, I implemented new versions of the get_class_colors() and get_class_names() methods in cityscapes.py, camvid.py, and bdd.py based on the customized labels. Furthermore, there are some lines in test.py that need to be adjusted to use the custom labels.

After completing the "pretrain the supernet" step, I got the following error during the "search the architecture" step.

[screenshot of the error]

It looks like something went wrong with the TensorRT latency test, maybe in darts_utils.py. Do you have an idea how this error can be fixed?

If I remove TensorRT from the system, I am able to run all the training and evaluation steps; PyTorch is then used for the latency tests.

[screenshot of the output]

Are there any important differences between TensorRT and PyTorch for the latency test?

I look forward to hearing from you.

Kind regards

chenwydj commented 3 years ago

I would guess that TensorRT is running into some problem, even though you have installed it.

Here is the place where the function using TensorRT is imported; you may want to comment out this part: https://github.com/VITA-Group/FasterSeg/blob/master/search/operations.py#L25
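
One way to make the fallback automatic is a guarded import, sketched below. The function and backend names here are illustrative, not the repo's actual identifiers.

```python
def _latency_pytorch(model, inputs):
    """Placeholder for a PyTorch-timed latency measurement (illustrative)."""
    return 0.0

def pick_latency_backend():
    # Fall back to PyTorch timing when TensorRT cannot be imported.
    try:
        import tensorrt  # noqa: F401
        return "tensorrt"
    except ImportError:
        return "pytorch"

print(pick_latency_backend())
```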

khaep commented 3 years ago

Hi @chenwydj,

first let me summarize the points I adjusted to use my own customized labels.

After completing these steps, I was able to run a training process with customized labels (using PyTorch for the latency test).
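
For reference, a PyTorch-style latency timing loop in minimal, CPU-only form might look like the sketch below; the repo's actual measurement runs on CUDA with proper synchronization and may differ.

```python
import time

def measure_latency_ms(fn, *args, iterations=50, warmup=5):
    """Average wall-clock time per call, in milliseconds (minimal sketch)."""
    for _ in range(warmup):          # warm-up runs are discarded
        fn(*args)
    start = time.perf_counter()
    for _ in range(iterations):
        fn(*args)
    return (time.perf_counter() - start) * 1000.0 / iterations

latency = measure_latency_ms(sorted, list(range(1000)))
```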

Based on that, I have some questions about the logger for TensorBoard monitoring.

It would be great if you can give me some advice, which would help me to understand the results of the training steps better.

Kind regards

chenwydj commented 3 years ago

Hi @KHeap25,

The "objective" indicates Eq. 5 in our Appendix B. It is a combined target of accuracy and latency, adopted from Tan et al., 2019.

The other part is FPS, and the code is here. Arch0 indicates the teacher net, arch1 the student. FPS0 indicates the architecture that aggregates outputs from the [1/8, 1/32] branches, and FPS1 the aggregation from [1/16, 1/32].
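
For intuition, the Tan et al., 2019 style combined objective has the shape accuracy · (latency / target)^w with w < 0; a toy sketch follows. The constants are invented for illustration, not the paper's values; see Eq. 5 in Appendix B for the exact form.

```python
def objective(accuracy, latency_ms, target_ms=100.0, w=-0.07):
    # Toy sketch: rewards accuracy, penalizes latency above the target budget.
    # target_ms and w are invented illustration values, not the paper's.
    return accuracy * (latency_ms / target_ms) ** w

# Faster than the target raises the objective above raw accuracy;
# slower than the target lowers it below raw accuracy.
fast = objective(0.70, 50.0)
slow = objective(0.70, 200.0)
```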

Hope that helps!

i-am-nut commented 2 years ago

@KHeap25 can you share your code so we can see how your modifications turned out?

Gaussianer commented 2 years ago

Hey @emersonjr,

@KHeap25 was in the project with me. You can find our code here: https://github.com/Gaussianer/FasterSeg