Amshaker / unetr_plus_plus

[IEEE TMI-2024] UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation
Apache License 2.0
340 stars, 32 forks

Why is the validation loss NaN on the ACDC dataset? #54

Open liaochuanlin opened 12 months ago

liaochuanlin commented 12 months ago

2023-09-13 19:55:07.305611: epoch: 0
2023-09-13 20:09:55.016225: train loss : -0.0980
2023-09-13 20:10:50.444458: validation loss: nan
2023-09-13 20:10:50.445074: Average global foreground Dice: [0.0, 0.0, 0.0]
2023-09-13 20:10:50.445152: (interpret this as an estimate for the Dice of the different classes. This is not exact.)
2023-09-13 20:10:50.723239: lr: 0.009991
2023-09-13 20:10:50.723401: current best_val_eval_criterion_MA is 0.00000
2023-09-13 20:10:50.723435: current val_eval_criterion_MA is 0.0000
2023-09-13 20:10:50.723503: This epoch took 943.417670 s

2023-09-13 20:10:50.723532: epoch: 1
2023-09-13 20:25:20.248155: train loss : -0.4486
2023-09-13 20:26:16.659445: validation loss: nan
2023-09-13 20:26:16.659962: Average global foreground Dice: [0.0, 0.0, 0.0]
2023-09-13 20:26:16.660027: (interpret this as an estimate for the Dice of the different classes. This is not exact.)
2023-09-13 20:26:16.998715: lr: 0.009982
2023-09-13 20:26:16.998862: current best_val_eval_criterion_MA is 0.00000
2023-09-13 20:26:16.998900: current val_eval_criterion_MA is 0.0000
2023-09-13 20:26:16.998949: This epoch took 926.275386 s

Why is the validation loss NaN on the ACDC dataset?

LimxRabbit commented 11 months ago

Hello, do you know how to solve it?
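For anyone triaging a longer run, here is a minimal sketch that scans a training log in the format quoted above and lists the epochs whose validation loss is reported as nan. It does not diagnose the cause. The function name `epochs_with_nan_validation` and the regular expression are hypothetical helpers written for this log layout, not part of the UNETR++ or nnU-Net code.

```python
import math
import re

# Matches entries such as "epoch: 0", "train loss : -0.0980" and
# "validation loss: nan". The layout is assumed from the log quoted in
# this issue; other versions of the trainer may print something different.
ENTRY = re.compile(
    r"(epoch|train loss|validation loss)\s*:\s*(-?\d+(?:\.\d+)?|nan)",
    re.IGNORECASE,
)


def epochs_with_nan_validation(log_text):
    """Return the epoch numbers whose validation loss is NaN."""
    bad_epochs = []
    current_epoch = None
    for key, value in ENTRY.findall(log_text):
        key = key.lower()
        if key == "epoch":
            # Remember which epoch the following loss entries belong to.
            current_epoch = int(value)
        elif key == "validation loss" and math.isnan(float(value)):
            bad_epochs.append(current_epoch)
    return bad_epochs


if __name__ == "__main__":
    sample = (
        "2023-09-13 19:55:07.305611: epoch: 0 "
        "2023-09-13 20:09:55.016225: train loss : -0.0980 "
        "2023-09-13 20:10:50.444458: validation loss: nan "
        "2023-09-13 20:10:50.723503: This epoch took 943.417670 s"
    )
    print(epochs_with_nan_validation(sample))
```

If every epoch is flagged from epoch 0 onward, as in the log above, the problem is present from the very first validation pass rather than appearing as training diverges, which may help narrow down where to look.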
