openseg-group / openseg.pytorch

The official Pytorch implementation of OCNet, OCRNet, and SegFix.
MIT License

details about the dt_offset_generator.py #62

Closed: cnnAndBn closed this issue 3 years ago

cnnAndBn commented 3 years ago

Hi @hsfzxjy @PkuRainBow, I am reading your code for SegFix and have a few questions.

  1. In https://github.com/openseg-group/openseg.pytorch/blob/2c459f3b42deee26194f1802f353887d945e14c4/lib/datasets/preprocess/cityscapes/dt_offset_generator.py#L91, deg_reduce = 2 and the computed direction angle is divided by deg_reduce. Can you explain this? In the paper you say: "Considering that there might be some "fake" interior pixels when the boundary is thick, we propose two different schemes as following: (i) re-scaling all the offsets by a factor, e.g., 2." But the 2 in the paper does not seem to be related to deg_reduce here. Can the re-scaling in the paper be interpreted as a stride or step?
  2. In https://github.com/openseg-group/openseg.pytorch/blob/2c459f3b42deee26194f1802f353887d945e14c4/lib/datasets/preprocess/cityscapes/dt_offset_generator.py#L92, 180 is added to the computed angle. Why?
  3. Lastly, after visualizing sobel_x, sobel_y, and the distance map (i.e. the result of the distance transform), I think the direction should look like the image below. In this example, the direction along x goes from the outside to the interior, but the direction along y is the opposite, right? I think it should also go from the outside to the interior along y. Can you explain this? Thanks.

[Attached image: WeChat screenshot, 微信图片_20210511115333]

hsfzxjy commented 3 years ago

@dadada101 Thanks for your interest in our work!

The snippets you mentioned in Questions 1 and 2 are for storage optimization. Before persisting the offset maps, we shift the angle values into the interval [0, 360] and then divide them by 2. This ensures that the value at each location fits into a single byte, so the maps consume less disk space. In the dataloader, we perform the inverse operation (see here). So overall, our implementation is consistent with our paper.
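
To make the round trip concrete, here is a minimal sketch of the encoding and decoding described above. It is not the repository's code: the function names encode_angle and decode_angle are hypothetical, and the input range of [-180, 180] degrees (as returned by an arctan2-style computation) is an assumption.

```python
import numpy as np

DEG_REDUCE = 2  # mirrors the deg_reduce value mentioned in the question


def encode_angle(angle_deg):
    """Hypothetical helper: store angles in [-180, 180] degrees as one byte each."""
    shifted = np.asarray(angle_deg, dtype=np.float32) + 180.0  # now in [0, 360]
    reduced = shifted / DEG_REDUCE                              # now in [0, 180]
    return np.round(reduced).astype(np.uint8)                   # fits in uint8


def decode_angle(stored):
    """Hypothetical helper: invert encode_angle (up to quantization error)."""
    return stored.astype(np.float32) * DEG_REDUCE - 180.0


angles = np.array([-180.0, -90.0, 0.0, 90.0, 180.0])
stored = encode_angle(angles)
restored = decode_angle(stored)
```

Because the stored value is rounded to an integer after dividing by 2, the restored angle can differ from the original by up to 1 degree. Without the division, the shifted range [0, 360] would not fit into a single unsigned byte.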

The vectors from the Sobel operator should always point from pixels with lower values to pixels with higher values. You may need to invert the operation mentioned above before doing the visualization.
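
As a quick sanity check of that statement, the sketch below applies a hand-written 3x3 Sobel filter to a toy distance map whose values grow toward the centre. The helper sobel_xy and the toy map are assumptions made for illustration and are not taken from dt_offset_generator.py; the filter uses the common correlation form with image coordinates, where y increases downward.

```python
import numpy as np


def sobel_xy(img):
    """Hypothetical helper: 3x3 Sobel derivatives on interior pixels (borders stay 0)."""
    img = img.astype(np.float32)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # x derivative: right column minus left column, weighted 1-2-1 over rows
    gx[1:-1, 1:-1] = (
        (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:])
        - (img[:-2, :-2] + 2 * img[1:-1, :-2] + img[2:, :-2])
    )
    # y derivative: row below minus row above, weighted 1-2-1 over columns
    gy[1:-1, 1:-1] = (
        (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:])
        - (img[:-2, :-2] + 2 * img[:-2, 1:-1] + img[:-2, 2:])
    )
    return gx, gy


# Toy distance map: 0 on the border, 3 at the centre pixel (3, 3)
yy, xx = np.mgrid[0:7, 0:7]
dist = 3 - np.maximum(np.abs(yy - 3), np.abs(xx - 3))
gx, gy = sobel_xy(dist)

left_of_centre = gx[3, 1]    # positive: points right, toward higher values
right_of_centre = gx[3, 5]   # negative: points left, toward higher values
above_centre = gy[1, 3]      # positive: points down (y grows downward)
below_centre = gy[5, 3]      # negative: points up
```

In this toy case the vectors point toward the centre along both axes. Note that a positive y component means "downward" in image coordinates, which is easy to misread when plotting with an upward y axis.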