Yanfeng-Zhou / XNet

[ICCV2023] XNet: Wavelet-Based Low and High Frequency Merging Networks for Semi- and Supervised Semantic Segmentation of Biomedical Images
MIT License

About model training #1

Open liaochuanlin opened 12 months ago

liaochuanlin commented 12 months ago

Have you tested segmentation with a larger number of categories? When I train on the BraTS2018 brain dataset, the output Dice is 0, and I have not been able to find the reason. Can you help me? Thank you very much.

Yanfeng-Zhou commented 12 months ago

Can you give a specific example? Are you using XNet or another network, and semi-supervised or fully supervised segmentation? I suggest starting with a fully supervised UNet. If UNet cannot be trained, the problem may lie in the hyperparameters or the dataset.

liaochuanlin commented 12 months ago

Dear author, I use the brain dataset (BraTS2018) with the fully supervised network (XNet). Is it possible that the data should not be randomly cropped during loading? At first I thought the problem was the order of data loading: the loaded high- and low-frequency data did not match the loaded original labels, because I used two dataloaders that load them separately (and the Dice was still 0 even after loading them sequentially).

Yanfeng-Zhou commented 12 months ago

3D images are trained on patches. If two dataloaders are used to load the low-frequency and high-frequency images separately, there is no guarantee that the generated patches will match. You should use a single dataloader to load them together, so that the patches generated for L, H, and the mask are identical (see dataload/dataset_3d.py, lines 163-172). In addition, any data augmentation applied during loading must be exactly the same for all three; for random cropping, the crop must be at the same position, which cannot be guaranteed with two dataloaders.
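The idea above can be sketched with a minimal NumPy helper (a hypothetical illustration, not the repo's actual code in dataload/dataset_3d.py): the crop origin is drawn once and applied to all three volumes, so L, H, and the mask always stay aligned.

```python
import numpy as np

def shared_random_crop(low, high, mask, patch_size, rng=None):
    """Crop L, H, and mask at the SAME random position.

    Two independent dataloaders draw two independent crop origins,
    so the L/H patches no longer align with the mask. Drawing the
    origin once and slicing all three arrays with it fixes that.
    """
    rng = rng or np.random.default_rng()
    d, h, w = low.shape
    pd, ph, pw = patch_size
    # One random origin, shared by all three volumes.
    z = rng.integers(0, d - pd + 1)
    y = rng.integers(0, h - ph + 1)
    x = rng.integers(0, w - pw + 1)
    sl = (slice(z, z + pd), slice(y, y + ph), slice(x, x + pw))
    return low[sl], high[sl], mask[sl]
```

In a PyTorch Dataset this would live in a single `__getitem__` that returns the L patch, H patch, and mask patch together, rather than in two separate datasets.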

liaochuanlin commented 12 months ago

Thank you for your suggestion. I will follow your suggestion to implement it.

liaochuanlin commented 12 months ago

Dear author, I made the changes you suggested, only to find that in the output wt_dice trains normally while tc_dice and et_dice are completely zero. What could cause this result?

Batch: 23/40 et_dice: 0.0000 (0.0000) et_loss: 2.0346 (2.2852) loss: 5.3594 (6.2019) tc_dice: 0.0000 (0.0000) tc_loss: 2.0476 (2.5506) wt_dice: 0.7888 (0.7607) wt_loss: 1.2772 (1.3661) Time: 0.213
2023-10-11 22:37 Epoch: 20, Batch: 24/40 et_dice: 0.0000 (0.0000) et_loss: 2.2843 (2.2851) loss: 5.9994 (6.1938) tc_dice: 0.0000 (0.0000) tc_loss: 2.2675 (2.5393) wt_dice: 0.6497 (0.7563) wt_loss: 1.4476 (1.3694) Time: 1.936
2023-10-11 22:37 Epoch: 20, Batch: 25/40 et_dice: 0.0000 (0.0000) et_loss: 2.1667 (2.2806) loss: 5.9239 (6.1834) tc_dice: 0.0000 (0.0000) tc_loss: 2.4776 (2.5369) wt_dice: 0.8184 (0.7587) wt_loss: 1.2796 (1.3659) Time: 0.195
2023-10-11 22:37 Epoch: 20, Batch: 26/40 et_dice: 1.0000 (0.0370) et_loss: 1.0002 (2.2332) loss: 5.1437 (6.1449) tc_dice: 0.0000 (0.0000) tc_loss: 2.9297 (2.5515) wt_dice: 0.8613 (0.7625) wt_loss: 1.2138 (1.3603) Time: 1.316
2023-10-11 22:37 Epoch: 20, Batch: 27/40 et_dice: 0.0000 (0.0357) et_loss: 2.1196 (2.2291) loss: 5.8126 (6.1331) tc_dice: 0.0000 (0.0000) tc_loss: 2.1705 (2.5379) wt_dice: 0.5883 (0.7563) wt_loss: 1.5225 (1.3661) Time: 0.209
2023-10-11 22:37 Epoch: 20, Batch: 28/40 et_dice: 0.0000 (0.0345) et_loss: 2.1827 (2.2275) loss: 5.9236 (6.1258) tc_dice: 0.0000 (0.0000) tc_loss: 2.1989 (2.5262) wt_dice: 0.6671 (0.7532) wt_loss: 1.5420 (1.3722) Time: 0.206
2023-10-11 22:37 Epoch: 20, Batch: 29/40 et_dice: 0.0000 (0.0333) et_loss: 2.2075 (2.2268) loss: 5.9188 (6.1189) tc_dice: 0.0000 (0.0000) tc_loss: 2.2257 (2.5162) wt_dice: 0.7339 (0.7526) wt_loss: 1.4856 (1.3759) Time: 0.209
2023-10-11 22:37 Epoch: 20, Batch: 30/40 et_dice: 0.0000 (0.0323) et_loss: 2.7045 (2.2423) loss: 7.0751 (6.1498) tc_dice: 0.0000 (0.0000) tc_loss: 2.8100 (2.5256) wt_dice: 0.7901 (0.7538) wt_loss: 1.5605 (1.3819) Time: 0.215
2023-10-11 22:37 Epoch: 20, Batch: 31/40 et_dice: 0.0000 (0.0313) et_loss: 2.1602 (2.2397) loss: 6.4074 (6.1578) tc_dice: 0.0000 (0.0000) tc_loss: 2.4582 (2.5235) wt_dice: 0.5469 (0.7473) wt_loss: 1.7889 (1.3946) Time: 0.210

Yanfeng-Zhou commented 11 months ago

Your training produces normal losses and evaluation metrics for wt, which suggests the overall training pipeline is correct. I think the error lies specifically in how tc_dice and et_dice are computed.
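One pattern worth checking when wt_dice is normal but tc_dice and et_dice are exactly zero is a mismatch between the network's class indices and the BraTS label convention: the raw BraTS2018 labels are {0, 1, 2, 4}, and many pipelines remap 4 to 3, so a metric that still looks for label 4 (or regions built from the wrong labels) will read zero. The sketch below is a hypothetical sanity check, not the repo's metric code, assuming the standard region definitions (WT = 1, 2, 4; TC = 1, 4; ET = 4):

```python
import numpy as np

def dice(pred_bin, gt_bin, eps=1e-5):
    """Dice coefficient between two boolean masks."""
    inter = np.logical_and(pred_bin, gt_bin).sum()
    return (2.0 * inter + eps) / (pred_bin.sum() + gt_bin.sum() + eps)

# Standard BraTS region definitions from the raw labels {0, 1, 2, 4}.
REGIONS = {
    "wt": (1, 2, 4),  # whole tumor
    "tc": (1, 4),     # tumor core
    "et": (4,),       # enhancing tumor
}

def region_dice(pred, gt):
    """Per-region Dice from label maps; zeros here point at a label mismatch."""
    return {name: dice(np.isin(pred, labs), np.isin(gt, labs))
            for name, labs in REGIONS.items()}
```

Running this on a ground-truth volume against itself should give Dice close to 1 for all three regions; if it does not, the label mapping between the dataset and the metric is the likely culprit.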