YanchaoYang / FDA

Fourier Domain Adaptation for Semantic Segmentation

Question about the image transformation process #10

Closed liuquande closed 3 years ago

liuquande commented 3 years ago

Hi, @YanchaoYang , thanks for the nice work.

I have a question about the image transformation process.

In FDA_demo.py, the pixel values of the source and target images are in the range 0-255. But after transforming the source image to the target style by swapping the low-frequency amplitude spectrum, the generated image is no longer in this range (for example, the pixel values fall in the range -119 to 181 when setting alpha to 0.001).
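
For reference, the amplitude-swap step under discussion can be sketched in a few lines of numpy (a minimal sketch only; the function name, argument layout, and block-size computation here are assumptions, and the repo's FDA_source_to_target may differ in details):

    import numpy as np

    def fda_source_to_target_np(src, trg, L=0.01):
        """src, trg: float arrays of shape (C, H, W) with values in 0-255."""
        # Per-channel 2D FFT of both images
        fft_src = np.fft.fft2(src, axes=(-2, -1))
        fft_trg = np.fft.fft2(trg, axes=(-2, -1))

        amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
        amp_trg = np.abs(fft_trg)

        # Center the spectra so the low frequencies sit in the middle
        amp_src = np.fft.fftshift(amp_src, axes=(-2, -1))
        amp_trg = np.fft.fftshift(amp_trg, axes=(-2, -1))

        # Replace a small low-frequency block of the source amplitude
        # with the target's; L controls the block size
        _, h, w = src.shape
        b = max(int(np.floor(min(h, w) * L)), 1)
        ch, cw = h // 2, w // 2
        amp_src[:, ch-b:ch+b, cw-b:cw+b] = amp_trg[:, ch-b:ch+b, cw-b:cw+b]
        amp_src = np.fft.ifftshift(amp_src, axes=(-2, -1))

        # Recombine with the source phase and invert; the result is real-valued
        # but NOT guaranteed to stay in 0-255, which is the issue raised here
        fft_mix = amp_src * np.exp(1j * pha_src)
        return np.real(np.fft.ifft2(fft_mix, axes=(-2, -1)))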

Would this cause any trouble for model training since the generated images and target images used for training are not in the same pixel value range?

Many thanks.

YanchaoYang commented 3 years ago

Could you check the results generated by the pytorch script? The images used for training are not generated by the demo file, so there might be a difference.

Also, in the demo file, after the "to image" operation, the generated images align with those from the pytorch train file.
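
A "to image" step like that is usually just a clip-and-cast into the displayable range, along these lines (a guess at the shape of the operation, not the demo's exact code):

    import numpy as np

    # Clip to the displayable range and cast; anything the FFT round-trip
    # pushed outside 0-255 is truncated here, so saved images look normal
    def to_image(arr):
        return np.clip(arr, 0, 255).astype(np.uint8)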

liuquande commented 3 years ago

Hi, @YanchaoYang, thanks for your reply.

I see that in the pytorch training script, you transform the image as below:

1. source to target, target to target

    src_in_trg = FDA_source_to_target( src_img, trg_img, L=args.LB )            # src_lbl
    trg_in_trg = trg_img

The pytorch implementation in "FDA_source_to_target" follows the same (or a very similar) procedure as the numpy implementation, so there should not be much difference in the resulting images.

Have you ever checked the exact pixel value range of your generated image "src_in_trg", even though the visualization looks good? Are the values in the range 0~255?
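
A quick check along these lines would show the exact range (illustrative; assumes src_in_trg is the tensor produced by the training script):

    # Print the exact min/max of the transformed image
    print(src_in_trg.min().item(), src_in_trg.max().item())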

Btw, what does this mean: “after the "to image" operation, the generated images align with those from the pytorch train file.”

Many thanks and looking forward to your further reply.

YanchaoYang commented 3 years ago

In FDA_demo.py, there is a to_image operation at the end to get the right range for display.

I have not checked the ranges, since the range will not be preserved in general: say, if the input image is 0-255, after the transformation it could be 0-198, and the negative numbers might be some sparse out-of-range artifacts. The reason I asked you to check the pytorch implementation is to see whether it generates fewer of these sparse negative artifacts.

If those negative values are just sparse artifacts, then it should be okay to ignore them.

liuquande commented 3 years ago

Thanks for the reply.

I see that in the demo file, the image is automatically clipped to the range 0-255 when saved. But in the training script, after transforming the image, there is no such operation to clip the generated image to this range.

I find that after transforming the source image to the target distribution, the regions of the generated image that look dark usually have negative pixel values, whereas for the raw images in the target distribution, even the dark regions have pixel values of at least 0.

So I'm not sure whether this inconsistent range will affect model training, or whether it can safely be ignored.

Thanks.

YanchaoYang commented 3 years ago

Can you check if this is sparse?

If so, ignoring it may not be a problem.

liuquande commented 3 years ago

[Screenshot from 2020-10-19 21-30-12: the transformed image alongside a binary mask marking pixels where all three channels are negative]

I printed out the transformed image and a binary mask that marks where all three channels have negative values. Maybe these negative values could have some effect on model training, but ignoring them should be okay.
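
Such a mask and its density can be computed in a couple of lines (illustrative numpy sketch; assumes src_in_trg is a (C, H, W) array):

    import numpy as np

    # Pixels where all three channels are negative; a mean near zero
    # means the negative artifacts are sparse
    neg_mask = (src_in_trg < 0).all(axis=0)
    print("fraction of all-negative pixels:", neg_mask.mean())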

YanchaoYang commented 3 years ago

I see, this should be okay since it is quite sparse and the original values are already close to zero.

Also, the transformation is performed with different target images, so hopefully some of them will not produce negative values. Moreover, there are batchnorms in the network, so the effect should be further absorbed.
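
As a self-contained illustration of the batchnorm point (a sketch, not the repo's code): in training mode a BatchNorm layer normalizes with per-batch statistics, so a constant offset in the input is removed entirely; the sparse negative artifacts discussed above are not exactly a constant offset, but they perturb the batch statistics only slightly.

    import torch
    import torch.nn as nn

    bn = nn.BatchNorm2d(3, affine=False)
    x = torch.rand(4, 3, 64, 64) * 255
    # The same batch with a constant negative shift normalizes identically
    print(torch.allclose(bn(x), bn(x - 30), atol=1e-4))  # True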

liuquande commented 3 years ago

Thanks for the clarification. Will close the issue.