cubicimage commented 3 years ago

The only parameter related to training image size that I could find is "patch_size": 256 in the file metadata.json. To train on 512x512 images, should I change it to "patch_size": 512? Is that enough?

After making this change, the training results show cd_precisions: 1.0, cd_recalls: 0.0, f1_scores: 0.1. What is the problem?

Thanks in advance.

likyoo commented 3 years ago

Could you provide more settings? I have tried 512*512, and it works.

cubicimage commented 3 years ago

Thanks for your reply.

The settings file metadata.json is as follows:

{
  "patch_size": 512,
  "augmentation": true,
  "num_gpus": 1,
  "num_workers": 2,
  "num_channel": 3,
  "EF": false,
  "epochs": 100,
  "batch_size": 2,
  "learning_rate": 1e-3,
  "loss_function": "hybrid",
  "dataset_dir": "/media/root/Train_M2/Siamese/ChangeDetectionDataset/ChangeDetectionDataset/Model_packed/",
  "weight_dir": "./weights/",
  "log_dir": "./log/"
}

The training and validation log is as follows:

INFO:root:SET model mode to train!
epoch 1 info 15998 - 16000: 100%|██████████| 8000/8000 [1:02:09<00:00, 2.14it/s]
INFO:root:EPOCH 1 TRAIN METRICS{'cd_losses': 0.7842114759609103, 'cd_corrects': 92.41285269260406, 'cd_precisions': 0.6882918446376274, 'cd_recalls': 0.0003181783352481297, 'cd_f1scores': 0.00011667645251179926, 'learning_rate': 0.0010000000000000005}
INFO:root:EPOCH 1 VALIDATION METRICS{'cd_losses': 0.7814765771329403, 'cd_corrects': 92.29060640335084, 'cd_precisions': 0.989, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
INFO:root:updata the model
An epoch finished.
INFO:root:SET model mode to train!
epoch 2 info 15998 - 16000: 100%|██████████| 8000/8000 [1:01:05<00:00, 2.18it/s]
INFO:root:EPOCH 2 TRAIN METRICS{'cd_losses': 0.775984840631485, 'cd_corrects': 92.43268837928773, 'cd_precisions': 0.9304801363001134, 'cd_recalls': 0.00017023063037293365, 'cd_f1scores': 0.00011852969875493522, 'learning_rate': 0.0010000000000000005}
INFO:root:EPOCH 2 VALIDATION METRICS{'cd_losses': 0.7935021406710148, 'cd_corrects': 92.2906102180481, 'cd_precisions': 0.998, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
INFO:root:updata the model
An epoch finished.
INFO:root:SET model mode to train!
epoch 3 info 15998 - 16000: 100%|██████████| 8000/8000 [1:01:00<00:00, 2.19it/s]
INFO:root:EPOCH 3 TRAIN METRICS{'cd_losses': 0.7728258773759007, 'cd_corrects': 92.4376312494278, 'cd_precisions': 0.9523574384180934, 'cd_recalls': 2.170985611196378e-05, 'cd_f1scores': 1.3498407529712986e-05, 'learning_rate': 0.0010000000000000005}
INFO:root:EPOCH 3 VALIDATION METRICS{'cd_losses': 0.7749933919608593, 'cd_corrects': 92.29061374664306, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
INFO:root:updata the model
An epoch finished.
INFO:root:SET model mode to train!
epoch 4 info 15998 - 16000: 100%|██████████| 8000/8000 [1:00:57<00:00, 2.19it/s]
INFO:root:EPOCH 4 TRAIN METRICS{'cd_losses': 0.7831544731110335, 'cd_corrects': 92.44011895656585, 'cd_precisions': 0.9682736536006232, 'cd_recalls': 0.0001438787379047262, 'cd_f1scores': 0.00010836227906231605, 'learning_rate': 0.0010000000000000005}
INFO:root:EPOCH 4 VALIDATION METRICS{'cd_losses': 0.7880342887043953, 'cd_corrects': 92.29061355590821, 'cd_precisions': 0.999, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
An epoch finished.
INFO:root:SET model mode to train!
epoch 5 info 15998 - 16000: 100%|██████████| 8000/8000 [1:01:05<00:00, 2.18it/s]
INFO:root:EPOCH 5 TRAIN METRICS{'cd_losses': 0.7846483985334635, 'cd_corrects': 92.44571371078491, 'cd_precisions': 0.9996240412341407, 'cd_recalls': 2.2639462303240822e-05, 'cd_f1scores': 3.8317570176098656e-05, 'learning_rate': 0.0010000000000000005}
INFO:root:EPOCH 5 VALIDATION METRICS{'cd_losses': 0.7875143143236637, 'cd_corrects': 92.29061374664306, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
An epoch finished.
INFO:root:SET model mode to train!
epoch 6 info 15998 - 16000: 100%|██████████| 8000/8000 [1:01:05<00:00, 2.18it/s]
INFO:root:EPOCH 6 TRAIN METRICS{'cd_losses': 0.7843611823022365, 'cd_corrects': 92.44578473567962, 'cd_precisions': 0.999875, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
INFO:root:EPOCH 6 VALIDATION METRICS{'cd_losses': 0.7883539892733097, 'cd_corrects': 92.29061365127563, 'cd_precisions': 0.9995, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
An epoch finished.
INFO:root:SET model mode to train!
epoch 7 info 15998 - 16000: 100%|██████████| 8000/8000 [1:01:05<00:00, 2.18it/s]
INFO:root:EPOCH 7 TRAIN METRICS{'cd_losses': 0.787343169555068, 'cd_corrects': 92.44400928020477, 'cd_precisions': 0.998625, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0010000000000000005}
INFO:root:EPOCH 7 VALIDATION METRICS{'cd_losses': 0.7875373746454716, 'cd_corrects': 92.29061374664306, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
An epoch finished.
INFO:root:SET model mode to train!
epoch 8 info 15998 - 16000: 100%|██████████| 8000/8000 [1:01:08<00:00, 2.18it/s]
INFO:root:EPOCH 8 TRAIN METRICS{'cd_losses': 0.7840627553984523, 'cd_corrects': 92.44578485488891, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
INFO:root:EPOCH 8 VALIDATION METRICS{'cd_losses': 0.7886561053693295, 'cd_corrects': 92.29061374664306, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
An epoch finished.
INFO:root:SET model mode to train!
epoch 9 info 15998 - 16000: 100%|██████████| 8000/8000 [1:01:04<00:00, 2.18it/s]
INFO:root:EPOCH 9 TRAIN METRICS{'cd_losses': 0.783982153609395, 'cd_corrects': 92.44578485488891, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
INFO:root:EPOCH 9 VALIDATION METRICS{'cd_losses': 0.7889383373260498, 'cd_corrects': 92.29061374664306, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
An epoch finished.
INFO:root:SET model mode to train!
epoch 10 info 15998 - 16000: 100%|██████████| 8000/8000 [1:17:32<00:00, 1.72it/s]
INFO:root:EPOCH 10 TRAIN METRICS{'cd_losses': 0.7840615466982126, 'cd_corrects': 92.44578485488891, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
INFO:root:EPOCH 10 VALIDATION METRICS{'cd_losses': 0.7889520925283432, 'cd_corrects': 92.29061374664306, 'cd_precisions': 1.0, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
An epoch finished.
INFO:root:SET model mode to train!
epoch 11 info 15998 - 16000: 100%|██████████| 8000/8000 [1:03:35<00:00, 2.10it/s]
INFO:root:EPOCH 11 TRAIN METRICS{'cd_losses': 0.7840217581689358, 'cd_corrects': 92.44573352336883, 'cd_precisions': 0.993875, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
INFO:root:EPOCH 11 VALIDATION METRICS{'cd_losses': 0.7875434209108353, 'cd_corrects': 92.29061355590821, 'cd_precisions': 0.9995, 'cd_recalls': 0.0, 'cd_f1scores': 0.0, 'learning_rate': 0.0005000000000000002}
An epoch finished.
INFO:root:SET model mode to train!
epoch 12 info 10226 - 10228: 64%|██████▎   | 5113/8000 40:34<23:59, 2.01it/s

root@cubic-System-Product-Name:/media/root/Train_M2/Siamese/Siam-NestedUNet# cd /media/root/Train_M2/Siamese/Siam-NestedUNet ; /usr/bin/env /media/root/Train_M2/Siamese/Siam-NestedUNet/NUNet_env/bin/python /root/.vscode/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/launcher 40019 -- /media/root/Train_M2/Siamese/Siam-NestedUNet/eval.py
INFO:root:STARTING Dataset Creation
INFO:root:STARTING Dataloading
(NUNet_env) root@cubic-System-Product-Name:/media/root/Train_M2/Siamese/Siam-NestedUNet# cd /media/root/Train_M2/Siamese/Siam-NestedUNet ; /usr/bin/env /media/root/Train_M2/Siamese/Siam-NestedUNet/NUNet_env/bin/python /root/.vscode/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/launcher 42867 -- /media/root/Train_M2/Siamese/Siam-NestedUNet/eval.py
INFO:root:STARTING Dataset Creation
INFO:root:STARTING Dataloading
0%| | 0/2000 [00:00<?, ?it/s]
/media/root/Train_M2/Siamese/Siam-NestedUNet/NUNet_env/lib/python3.6/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
100%|██████████| 2000/2000 [13:37<00:00, 2.45it/s]
/media/root/Train_M2/Siamese/Siam-NestedUNet/eval.py:46: RuntimeWarning: invalid value encountered in long_scalars
  P = tp / (tp + fp)
Precision: nan
Recall: 0.0
F1-Score: nan
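
A note on the `Precision: nan` at the end of the log: the warning points at `P = tp / (tp + fp)`, and that expression only produces NaN when `tp + fp` is zero, i.e. when the model did not predict a single changed pixel. The sketch below shows one way to guard the division. The function name `safe_metrics` is hypothetical and is not part of eval.py, and returning 0.0 for an undefined metric is a convention chosen here, not something the repository prescribes.

```python
def safe_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from pixel counts.

    Hypothetical helper: any metric whose denominator is zero is
    reported as 0.0 instead of NaN (an assumed convention).
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# No pixel predicted as "changed": tp = 0 and fp = 0, so the
# unguarded tp / (tp + fp) would be 0 / 0.
print(safe_metrics(tp=0, fp=0, fn=5))
```

Guarding the division only cleans up the report. A recall of 0.0 still means that no changed pixels are being detected.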


likyoo commented 3 years ago

I think there are two possible reasons:

The batch size is too small.
There may be a serious class imbalance in the data.
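
To check the second possibility, it may help to measure how many label pixels are actually marked as changed. In the log above, `cd_corrects` stays near 92.3% while recall is 0.0, which would be consistent with a model that predicts "no change" everywhere on data where under 8% of the pixels are changed. The following is only a minimal sketch: `changed_fraction` is a hypothetical helper, it assumes each label mask is a 2-D grid in which any nonzero value means "changed", and loading the masks from `dataset_dir` is left out because the dataset layout is not shown in this thread.

```python
def changed_fraction(masks):
    """Fraction of nonzero ("changed") pixels over an iterable of 2-D masks.

    Assumption: nonzero means "changed". Returns 0.0 for empty input.
    """
    changed = 0
    total = 0
    for mask in masks:
        for row in mask:
            for pixel in row:
                total += 1
                if pixel != 0:
                    changed += 1
    return changed / total if total else 0.0


# Tiny synthetic masks for illustration only (not real data).
masks = [
    [[0, 0, 0, 0], [0, 255, 255, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0] * 4 for _ in range(4)],
]
frac = changed_fraction(masks)
print(f"changed pixels: {frac:.2%}")
```

For real 512x512 masks held as NumPy arrays, `np.count_nonzero(mask)` and `mask.size` give the same counts much faster. A very small fraction would support the imbalance explanation.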

cubicimage commented 3 years ago

Thank you!

Due to the GPU memory limit, the batch size is set to 2, which should be fine. I suspect the second reason, a problem with the data, and I will check it. Could you send me a link to a 512x512 dataset, or the data you tested with?

likyoo / Siam-NestedUNet

how to train 512x512 images? #15
