Yanfeng-Zhou / XNet

[ICCV2023] XNet: Wavelet-Based Low and High Frequency Merging Networks for Semi- and Supervised Semantic Segmentation of Biomedical Images
MIT License
184 stars 8 forks

About the final prediction #3

Closed by xuanli98 11 months ago

xuanli98 commented 11 months ago

Thanks for your great work, "XNet: Wavelet-Based Low and High Frequency Fusion Networks for Fully- and Semi-Supervised Semantic Segmentation of Biomedical Images". I have some questions about XNet's final prediction and about the additive noise.

  1. XNet has two outputs, P_L and P_H. I read your code and found that the final prediction is determined by "args.result". Could you tell me which output is the final prediction and how you choose "args.result"? Thank you very much.


  2. Could you explain why the LF and HF encoders do not encode the additive noise?


Yanfeng-Zhou commented 11 months ago
  1. Your question refers to lines 141-148 of text_xnet.py. Lines 141-144 introduce the previously studied CCT strategy into XNet; however, we have not released the code that adds CCT to XNet, so if_cct should be set to False. As for lines 145-148, as stated in our paper, the model automatically selects the branch that performed better during training as the final prediction and reports which branch that is at the end of training (that is, it prints whether the best branch is result1 or result2).
  2. We cannot theoretically guarantee that the encoders will not encode noise. However, accurate segmentation requires the encoders to extract the information in the image that is useful for segmentation while avoiding interference from noise. This is the ideal behavior we expect from the XNet encoders, and judging from their performance in experiments, they have indeed achieved it.
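
The branch selection described in point 1 can be illustrated with a minimal sketch. This is not the repository's code: the function name `select_best_branch`, the score lists, and the tie-break rule are all hypothetical, and it assumes "performed better" means the higher peak validation score over the training epochs.

```python
# Illustrative sketch only; the names here are hypothetical and are not
# taken from the XNet repository.

def select_best_branch(val_scores_result1, val_scores_result2):
    """Return (branch_name, best_score) for the branch with the higher peak score.

    Each argument is a list of per-epoch validation scores (for example
    Dice or Jaccard) for one of the two output branches.
    """
    best1 = max(val_scores_result1)
    best2 = max(val_scores_result2)
    # Ties go to result1; this tie-break is an arbitrary choice for the sketch.
    if best1 >= best2:
        return "result1", best1
    return "result2", best2


# Made-up per-epoch validation scores for the two branches.
scores_result1 = [0.71, 0.78, 0.80]
scores_result2 = [0.69, 0.79, 0.83]

branch, score = select_best_branch(scores_result1, scores_result2)
print(f"the best branch is {branch} (score {score:.2f})")
```

Presumably the branch name printed at the end of training is the value one would then supply as "args.result" at test time, although the thread does not state this explicitly.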

xuanli98 commented 11 months ago

Thank you very much!