Yanfeng-Zhou / XNet

[ICCV2023] XNet: Wavelet-Based Low and High Frequency Merging Networks for Semi- and Supervised Semantic Segmentation of Biomedical Images

About model training #1

Open liaochuanlin opened 1 year ago

liaochuanlin commented 1 year ago

Have you tested multi-class segmentation? When I train on the BraTS2018 brain dataset, the output Dice is 0, and I cannot find the reason. Can you help me? Thank you very much.

Yanfeng-Zhou commented 1 year ago

Can you give a specific example? Are you using XNet or another network, and semi-supervised or fully supervised segmentation? I suggest you start with a fully supervised UNet. If UNet cannot be trained, the problem may lie in the hyperparameters or the dataset.

liaochuanlin commented 1 year ago

Dear author, I am using the brain MRI dataset (BraTS2018). Is it possible that the data should not be randomly cropped during loading? At first I thought the problem was the order of data loading: the loaded low- and high-frequency data did not match the loaded labels, because I used two dataloaders that loaded them separately, and even after loading them sequentially the Dice was still 0. I am using the fully supervised network (XNet).

Yanfeng-Zhou commented 1 year ago

3D images are trained on patches. If two dataloaders are used to load the low-frequency and high-frequency images separately, there is no guarantee that the generated patches will be the same. You should use a single dataloader that loads them together, so that the patches generated for L, H, and the mask are identical (see dataload/dataset_3d.py, lines 163-172). In addition, the data augmentation applied during loading must be exactly the same; for example, random cropping must crop at the same position, which cannot be guaranteed with two dataloaders. A sketch of this idea follows below.
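To illustrate the shared-crop idea, here is a minimal sketch of a dataset that loads all three arrays together and samples one crop origin per item. The class name, file format, and patch size are assumptions for illustration, not the repo's actual dataset_3d.py implementation:

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class PairedPatchDataset(Dataset):
    """Load the low-frequency volume, high-frequency volume, and mask
    in a single dataset so that one random crop is shared by all three.
    (Names and storage format are illustrative assumptions.)"""

    def __init__(self, low_paths, high_paths, mask_paths, patch_size=(96, 96, 96)):
        self.low_paths, self.high_paths, self.mask_paths = low_paths, high_paths, mask_paths
        self.patch_size = patch_size

    def __len__(self):
        return len(self.low_paths)

    def __getitem__(self, idx):
        low = np.load(self.low_paths[idx])    # (D, H, W) volume
        high = np.load(self.high_paths[idx])  # (D, H, W) volume
        mask = np.load(self.mask_paths[idx])  # (D, H, W) labels

        # Sample ONE crop origin and reuse it for L, H, and the mask,
        # so the three patches stay spatially aligned.
        pd, ph, pw = self.patch_size
        d0 = np.random.randint(0, low.shape[0] - pd + 1)
        h0 = np.random.randint(0, low.shape[1] - ph + 1)
        w0 = np.random.randint(0, low.shape[2] - pw + 1)
        crop = (slice(d0, d0 + pd), slice(h0, h0 + ph), slice(w0, w0 + pw))

        return (torch.from_numpy(low[crop].copy()).unsqueeze(0).float(),
                torch.from_numpy(high[crop].copy()).unsqueeze(0).float(),
                torch.from_numpy(mask[crop].copy()).long())
```

Any other random augmentation (flips, rotations, etc.) should likewise sample its parameters once per item and apply them identically to all three arrays.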

liaochuanlin commented 1 year ago

Thank you for your suggestion. I will implement it as you described.

liaochuanlin commented 1 year ago

Dear author, I made the changes as you suggested, only to find that in the output wt_dice trains normally while tc_dice and et_dice are nearly always zero. What could have caused this result? A log excerpt:

Batch:23/40 et_dice: 0.0000 (0.0000) et_loss: 2.0346 (2.2852) loss: 5.3594 (6.2019) tc_dice: 0.0000 (0.0000) tc_loss: 2.0476 (2.5506) wt_dice: 0.7888 (0.7607) wt_loss: 1.2772 (1.3661) Time: 0.213
2023-10-11 22:37 Epoch: 20, Batch:24/40 et_dice: 0.0000 (0.0000) et_loss: 2.2843 (2.2851) loss: 5.9994 (6.1938) tc_dice: 0.0000 (0.0000) tc_loss: 2.2675 (2.5393) wt_dice: 0.6497 (0.7563) wt_loss: 1.4476 (1.3694) Time: 1.936
2023-10-11 22:37 Epoch: 20, Batch:25/40 et_dice: 0.0000 (0.0000) et_loss: 2.1667 (2.2806) loss: 5.9239 (6.1834) tc_dice: 0.0000 (0.0000) tc_loss: 2.4776 (2.5369) wt_dice: 0.8184 (0.7587) wt_loss: 1.2796 (1.3659) Time: 0.195
2023-10-11 22:37 Epoch: 20, Batch:26/40 et_dice: 1.0000 (0.0370) et_loss: 1.0002 (2.2332) loss: 5.1437 (6.1449) tc_dice: 0.0000 (0.0000) tc_loss: 2.9297 (2.5515) wt_dice: 0.8613 (0.7625) wt_loss: 1.2138 (1.3603) Time: 1.316
2023-10-11 22:37 Epoch: 20, Batch:27/40 et_dice: 0.0000 (0.0357) et_loss: 2.1196 (2.2291) loss: 5.8126 (6.1331) tc_dice: 0.0000 (0.0000) tc_loss: 2.1705 (2.5379) wt_dice: 0.5883 (0.7563) wt_loss: 1.5225 (1.3661) Time: 0.209
2023-10-11 22:37 Epoch: 20, Batch:28/40 et_dice: 0.0000 (0.0345) et_loss: 2.1827 (2.2275) loss: 5.9236 (6.1258) tc_dice: 0.0000 (0.0000) tc_loss: 2.1989 (2.5262) wt_dice: 0.6671 (0.7532) wt_loss: 1.5420 (1.3722) Time: 0.206
2023-10-11 22:37 Epoch: 20, Batch:29/40 et_dice: 0.0000 (0.0333) et_loss: 2.2075 (2.2268) loss: 5.9188 (6.1189) tc_dice: 0.0000 (0.0000) tc_loss: 2.2257 (2.5162) wt_dice: 0.7339 (0.7526) wt_loss: 1.4856 (1.3759) Time: 0.209
2023-10-11 22:37 Epoch: 20, Batch:30/40 et_dice: 0.0000 (0.0323) et_loss: 2.7045 (2.2423) loss: 7.0751 (6.1498) tc_dice: 0.0000 (0.0000) tc_loss: 2.8100 (2.5256) wt_dice: 0.7901 (0.7538) wt_loss: 1.5605 (1.3819) Time: 0.215
2023-10-11 22:37 Epoch: 20, Batch:31/40 et_dice: 0.0000 (0.0313) et_loss: 2.1602 (2.2397) loss: 6.4074 (6.1578) tc_dice: 0.0000 (0.0000) tc_loss: 2.4582 (2.5235) wt_dice: 0.5469 (0.7473) wt_loss: 1.7889 (1.3946) Time: 0.210

Yanfeng-Zhou commented 1 year ago

Your training process produces normal losses and a normal evaluation metric for wt, which suggests the overall training pipeline is correct. The error most likely lies in how tc_dice and et_dice are computed.
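For reference, the three BraTS evaluation regions are nested combinations of the raw labels (1 = necrotic/non-enhancing tumor core, 2 = peritumoral edema, 4 = enhancing tumor): WT = 1 ∪ 2 ∪ 4, TC = 1 ∪ 4, ET = 4. Below is a minimal sketch of region-wise Dice under this standard label convention; the function names are illustrative, not from this repo:

```python
import numpy as np

# BraTS label convention: 0 background, 1 necrotic/non-enhancing
# tumor core, 2 peritumoral edema, 4 enhancing tumor.
def brats_region_masks(label):
    """Build the three nested evaluation regions from raw labels."""
    wt = np.isin(label, (1, 2, 4))  # whole tumor  = 1 U 2 U 4
    tc = np.isin(label, (1, 4))     # tumor core   = 1 U 4
    et = label == 4                 # enhancing tumor
    return wt, tc, et

def dice(pred_mask, gt_mask, eps=1e-5):
    """Binary Dice between two boolean masks."""
    inter = np.logical_and(pred_mask, gt_mask).sum()
    return (2.0 * inter + eps) / (pred_mask.sum() + gt_mask.sum() + eps)
```

One common cause of zero tc/et Dice alongside a normal wt Dice is a mismatch between the label convention used in preprocessing (e.g. remapping label 4 to 3) and the one assumed by the metric; the same remapping must be applied on both sides before computing tc_dice and et_dice.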