Closed liddk closed 1 year ago
Thanks for the feedback. We have logged this issue on our side and will reply as soon as it is fixed.
@andyjpaddle The same problem also occurs with 4-GPU AMP training.
python3 -u -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml -o Global.use_visualdl=True Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True Global.print_batch_step=1
[2022/08/22 05:53:21] ppocr INFO: epoch: [15/1200], global_step: 225, lr: 0.001000, loss: 5.017886, loss_shrink_maps: 3.560206, loss_threshold_maps: 0.923485, loss_binary_maps: 0.532439, avg_reader_cost: 4.36707 s, avg_batch_cost: 8.26320 s, avg_samples: 16.0, ips: 1.93630 samples/s, eta: 5:40:50 [2022/08/22 05:53:24] ppocr INFO: epoch: [15/1200], global_step: 226, lr: 0.001000, loss: 5.050126, loss_shrink_maps: 3.596053, loss_threshold_maps: 0.923485, loss_binary_maps: 0.532439, avg_reader_cost: 0.03259 s, avg_batch_cost: 2.84988 s, avg_samples: 16.0, ips: 5.61428 samples/s, eta: 5:43:18 Found inf or nan, current scale is: 4.1359030627651384e-25, decrease to: 4.1359030627651384e-250.5 [2022/08/22 05:53:26] ppocr INFO: epoch: [15/1200], global_step: 227, lr: 0.001000, loss: 5.017886, loss_shrink_maps: 3.560206, loss_threshold_maps: 0.923485, loss_binary_maps: 0.519505, avg_reader_cost: 1.81280 s, avg_batch_cost: 2.22363 s, avg_samples: 16.0, ips: 7.19544 samples/s, eta: 5:44:52 [2022/08/22 05:53:26] ppocr INFO: epoch: [15/1200], global_step: 228, lr: 0.001000, loss: 4.976384, loss_shrink_maps: 3.525608, loss_threshold_maps: 0.934624, loss_binary_maps: 0.505154, avg_reader_cost: 0.00012 s, avg_batch_cost: 0.39192 s, avg_samples: 16.0, ips: 40.82510 samples/s, eta: 5:43:53 Found inf or nan, current scale is: 2.0679515313825692e-25, decrease to: 2.0679515313825692e-250.5 [2022/08/22 05:53:27] ppocr INFO: epoch: [15/1200], global_step: 229, lr: 0.001000, loss: 4.976384, loss_shrink_maps: 3.525608, loss_threshold_maps: 0.923485, loss_binary_maps: 0.500928, avg_reader_cost: 0.00016 s, avg_batch_cost: 0.67728 s, avg_samples: 16.0, ips: 23.62399 samples/s, eta: 5:43:18 [2022/08/22 05:53:28] ppocr INFO: epoch: [15/1200], global_step: 230, lr: 0.001000, loss: 4.992502, loss_shrink_maps: 3.560206, loss_threshold_maps: 0.934624, loss_binary_maps: 0.505364, avg_reader_cost: 0.01024 s, avg_batch_cost: 0.31827 s, avg_samples: 16.0, ips: 50.27103 samples/s, eta: 5:42:13 Found inf 
or nan, current scale is: 1.0339757656912846e-25, decrease to: 1.0339757656912846e-250.5 [2022/08/22 05:53:28] ppocr INFO: epoch: [15/1200], global_step: 231, lr: 0.001000, loss: 5.019096, loss_shrink_maps: 3.588006, loss_threshold_maps: 0.934624, loss_binary_maps: 0.519505, avg_reader_cost: 0.00018 s, avg_batch_cost: 0.41058 s, avg_samples: 16.0, ips: 38.96882 samples/s, eta: 5:41:17 [2022/08/22 05:53:28] ppocr INFO: epoch: [15/1200], global_step: 232, lr: 0.001000, loss: 4.992502, loss_shrink_maps: 3.560206, loss_threshold_maps: 0.934624, loss_binary_maps: 0.519505, avg_reader_cost: 0.00028 s, avg_batch_cost: 0.18392 s, avg_samples: 16.0, ips: 86.99213 samples/s, eta: 5:40:03 Found inf or nan, current scale is: 5.169878828456423e-26, decrease to: 5.169878828456423e-260.5 [2022/08/22 05:53:29] ppocr INFO: epoch: [15/1200], global_step: 233, lr: 0.001000, loss: 5.019096, loss_shrink_maps: 3.588006, loss_threshold_maps: 0.938214, loss_binary_maps: 0.519505, avg_reader_cost: 0.00019 s, avg_batch_cost: 0.80283 s, avg_samples: 16.0, ips: 19.92946 samples/s, eta: 5:39:39 [2022/08/22 05:53:30] ppocr INFO: epoch: [15/1200], global_step: 234, lr: 0.001000, loss: 4.995037, loss_shrink_maps: 3.574678, loss_threshold_maps: 0.938214, loss_binary_maps: 0.511027, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.77931 s, avg_samples: 16.0, ips: 20.53093 samples/s, eta: 5:39:14 Found inf or nan, current scale is: 2.5849394142282115e-26, decrease to: 2.5849394142282115e-260.5 [2022/08/22 05:53:31] ppocr INFO: epoch: [15/1200], global_step: 235, lr: 0.001000, loss: 5.026068, loss_shrink_maps: 3.602726, loss_threshold_maps: 0.938214, loss_binary_maps: 0.519245, avg_reader_cost: 0.09581 s, avg_batch_cost: 0.50636 s, avg_samples: 16.0, ips: 31.59786 samples/s, eta: 5:38:28 [2022/08/22 05:53:31] ppocr INFO: epoch: [15/1200], global_step: 236, lr: 0.001000, loss: 5.026068, loss_shrink_maps: 3.574678, loss_threshold_maps: 0.938214, loss_binary_maps: 0.519245, avg_reader_cost: 0.00014 s, 
avg_batch_cost: 0.17820 s, avg_samples: 16.0, ips: 89.78771 samples/s, eta: 5:37:15 Found inf or nan, current scale is: 1.2924697071141057e-26, decrease to: 1.2924697071141057e-260.5 [2022/08/22 05:53:31] ppocr INFO: epoch: [15/1200], global_step: 237, lr: 0.001000, loss: 5.055849, loss_shrink_maps: 3.602726, loss_threshold_maps: 0.938214, loss_binary_maps: 0.519245, avg_reader_cost: 0.00011 s, avg_batch_cost: 0.19194 s, avg_samples: 16.0, ips: 83.35977 samples/s, eta: 5:36:04 [2022/08/22 05:53:31] ppocr INFO: epoch: [15/1200], global_step: 238, lr: 0.001000, loss: 5.064687, loss_shrink_maps: 3.629196, loss_threshold_maps: 0.938214, loss_binary_maps: 0.527722, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.18781 s, avg_samples: 16.0, ips: 85.19034 samples/s, eta: 5:34:53 Found inf or nan, current scale is: 6.462348535570529e-27, decrease to: 6.462348535570529e-270.5 [2022/08/22 05:53:31] ppocr INFO: epoch: [15/1200], global_step: 239, lr: 0.001000, loss: 5.055849, loss_shrink_maps: 3.602726, loss_threshold_maps: 0.932965, loss_binary_maps: 0.519245, avg_reader_cost: 0.00010 s, avg_batch_cost: 0.18466 s, avg_samples: 16.0, ips: 86.64496 samples/s, eta: 5:33:42 [2022/08/22 05:53:32] ppocr INFO: epoch: [15/1200], global_step: 240, lr: 0.001000, loss: 5.044233, loss_shrink_maps: 3.589090, loss_threshold_maps: 0.930141, loss_binary_maps: 0.519245, avg_reader_cost: 0.00009 s, avg_batch_cost: 0.13010 s, avg_samples: 10.0, ips: 76.86211 samples/s, eta: 5:32:28 [2022/08/22 05:53:32] ppocr INFO: save model in ./output/db_mv3/latest Found inf or nan, current scale is: 3.2311742677852644e-27, decrease to: 3.2311742677852644e-270.5 [2022/08/22 05:53:40] ppocr INFO: epoch: [16/1200], global_step: 241, lr: 0.001000, loss: 5.036156, loss_shrink_maps: 3.589090, loss_threshold_maps: 0.930141, loss_binary_maps: 0.511027, avg_reader_cost: 7.09716 s, avg_batch_cost: 8.44124 s, avg_samples: 16.0, ips: 1.89546 samples/s, eta: 5:42:09 [2022/08/22 05:53:41] ppocr INFO: epoch: [16/1200], 
global_step: 242, lr: 0.001000, loss: 5.014528, loss_shrink_maps: 3.569089, loss_threshold_maps: 0.924251, loss_binary_maps: 0.508085, avg_reader_cost: 0.00038 s, avg_batch_cost: 0.59627 s, avg_samples: 16.0, ips: 26.83352 samples/s, eta: 5:41:29 Found inf or nan, current scale is: 1.6155871338926322e-27, decrease to: 1.6155871338926322e-270.5 [2022/08/22 05:53:43] ppocr INFO: epoch: [16/1200], global_step: 243, lr: 0.001000, loss: 4.995037, loss_shrink_maps: 3.589090, loss_threshold_maps: 0.916466, loss_binary_maps: 0.503859, avg_reader_cost: 0.03652 s, avg_batch_cost: 2.02414 s, avg_samples: 16.0, ips: 7.90457 samples/s, eta: 5:42:42 [2022/08/22 05:53:44] ppocr INFO: epoch: [16/1200], global_step: 244, lr: 0.001000, loss: 5.013463, loss_shrink_maps: 3.599968, loss_threshold_maps: 0.920357, loss_binary_maps: 0.503859, avg_reader_cost: 0.00017 s, avg_batch_cost: 0.45545 s, avg_samples: 16.0, ips: 35.13011 samples/s, eta: 5:41:52 Found inf or nan, current scale is: 8.077935669463161e-28, decrease to: 8.077935669463161e-280.5 [2022/08/22 05:53:44] ppocr INFO: epoch: [16/1200], global_step: 245, lr: 0.001000, loss: 5.013463, loss_shrink_maps: 3.599968, loss_threshold_maps: 0.924393, loss_binary_maps: 0.503859, avg_reader_cost: 0.00017 s, avg_batch_cost: 0.48138 s, avg_samples: 16.0, ips: 33.23804 samples/s, eta: 5:41:04 [2022/08/22 05:53:44] ppocr INFO: epoch: [16/1200], global_step: 246, lr: 0.001000, loss: 5.013463, loss_shrink_maps: 3.599968, loss_threshold_maps: 0.928286, loss_binary_maps: 0.508085, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.17853 s, avg_samples: 16.0, ips: 89.62308 samples/s, eta: 5:39:54 Found inf or nan, current scale is: 4.0389678347315804e-28, decrease to: 4.0389678347315804e-280.5 [2022/08/22 05:53:45] ppocr INFO: epoch: [16/1200], global_step: 247, lr: 0.001000, loss: 5.013463, loss_shrink_maps: 3.599968, loss_threshold_maps: 0.924393, loss_binary_maps: 0.503859, avg_reader_cost: 0.00017 s, avg_batch_cost: 0.38103 s, avg_samples: 16.0, 
ips: 41.99167 samples/s, eta: 5:38:59 [2022/08/22 05:53:45] ppocr INFO: epoch: [16/1200], global_step: 248, lr: 0.001000, loss: 5.031628, loss_shrink_maps: 3.610502, loss_threshold_maps: 0.924393, loss_binary_maps: 0.509523, avg_reader_cost: 0.00018 s, avg_batch_cost: 0.27947 s, avg_samples: 16.0, ips: 57.25109 samples/s, eta: 5:37:58 Found inf or nan, current scale is: 2.0194839173657902e-28, decrease to: 2.0194839173657902e-280.5 [2022/08/22 05:53:46] ppocr INFO: epoch: [16/1200], global_step: 249, lr: 0.001000, loss: 5.031628, loss_shrink_maps: 3.610502, loss_threshold_maps: 0.924393, loss_binary_maps: 0.509523, avg_reader_cost: 0.04839 s, avg_batch_cost: 0.59589 s, avg_samples: 16.0, ips: 26.85039 samples/s, eta: 5:37:21 [2022/08/22 05:53:46] ppocr INFO: epoch: [16/1200], global_step: 250, lr: 0.001000, loss: 5.031628, loss_shrink_maps: 3.610502, loss_threshold_maps: 0.924393, loss_binary_maps: 0.509523, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.31812 s, avg_samples: 16.0, ips: 50.29549 samples/s, eta: 5:36:23 Found inf or nan, current scale is: 1.0097419586828951e-28, decrease to: 1.0097419586828951e-280.5 [2022/08/22 05:53:47] ppocr INFO: epoch: [16/1200], global_step: 251, lr: 0.001000, loss: 5.013463, loss_shrink_maps: 3.594380, loss_threshold_maps: 0.917541, loss_binary_maps: 0.499983, avg_reader_cost: 0.04486 s, avg_batch_cost: 0.73142 s, avg_samples: 16.0, ips: 21.87528 samples/s, eta: 5:35:56 [2022/08/22 05:53:47] ppocr INFO: epoch: [16/1200], global_step: 252, lr: 0.001000, loss: 5.014532, loss_shrink_maps: 3.594380, loss_threshold_maps: 0.917541, loss_binary_maps: 0.491992, avg_reader_cost: 0.00012 s, avg_batch_cost: 0.18831 s, avg_samples: 16.0, ips: 84.96449 samples/s, eta: 5:34:49 Found inf or nan, current scale is: 5.048709793414476e-29, decrease to: 5.048709793414476e-290.5 [2022/08/22 05:53:48] ppocr INFO: epoch: [16/1200], global_step: 253, lr: 0.001000, loss: 4.997432, loss_shrink_maps: 3.569089, loss_threshold_maps: 0.917541, 
loss_binary_maps: 0.491992, avg_reader_cost: 0.00012 s, avg_batch_cost: 0.18704 s, avg_samples: 16.0, ips: 85.54381 samples/s, eta: 5:33:43 [2022/08/22 05:53:48] ppocr INFO: epoch: [16/1200], global_step: 254, lr: 0.001000, loss: 5.014532, loss_shrink_maps: 3.594380, loss_threshold_maps: 0.911827, loss_binary_maps: 0.491992, avg_reader_cost: 0.00010 s, avg_batch_cost: 0.18920 s, avg_samples: 16.0, ips: 84.56503 samples/s, eta: 5:32:37 Found inf or nan, current scale is: 2.524354896707238e-29, decrease to: 2.524354896707238e-290.5 [2022/08/22 05:53:48] ppocr INFO: epoch: [16/1200], global_step: 255, lr: 0.001000, loss: 4.968390, loss_shrink_maps: 3.555958, loss_threshold_maps: 0.911827, loss_binary_maps: 0.490204, avg_reader_cost: 0.00010 s, avg_batch_cost: 0.20171 s, avg_samples: 16.0, ips: 79.31993 samples/s, eta: 5:31:33 [2022/08/22 05:53:48] ppocr INFO: epoch: [16/1200], global_step: 256, lr: 0.001000, loss: 4.968390, loss_shrink_maps: 3.555958, loss_threshold_maps: 0.908511, loss_binary_maps: 0.490204, avg_reader_cost: 0.00008 s, avg_batch_cost: 0.13288 s, avg_samples: 10.0, ips: 75.25305 samples/s, eta: 5:30:24 [2022/08/22 05:53:48] ppocr INFO: save model in ./output/db_mv3/latest Found inf or nan, current scale is: 1.262177448353619e-29, decrease to: 1.262177448353619e-290.5 [2022/08/22 05:53:58] ppocr INFO: epoch: [17/1200], global_step: 257, lr: 0.001000, loss: 4.919101, loss_shrink_maps: 3.525966, loss_threshold_maps: 0.908511, loss_binary_maps: 0.484178, avg_reader_cost: 7.46706 s, avg_batch_cost: 9.93894 s, avg_samples: 16.0, ips: 1.60983 samples/s, eta: 5:41:18 [2022/08/22 05:54:00] ppocr INFO: epoch: [17/1200], global_step: 258, lr: 0.001000, loss: 4.919101, loss_shrink_maps: 3.513160, loss_threshold_maps: 0.911827, loss_binary_maps: 0.484178, avg_reader_cost: 0.76209 s, avg_batch_cost: 1.93431 s, avg_samples: 16.0, ips: 8.27168 samples/s, eta: 5:42:20 Found inf or nan, current scale is: 6.310887241768095e-30, decrease to: 6.310887241768095e-300.5 
[2022/08/22 05:54:01] ppocr INFO: epoch: [17/1200], global_step: 259, lr: 0.001000, loss: 4.958037, loss_shrink_maps: 3.525966, loss_threshold_maps: 0.917541, loss_binary_maps: 0.490204, avg_reader_cost: 0.00026 s, avg_batch_cost: 0.90969 s, avg_samples: 16.0, ips: 17.58848 samples/s, eta: 5:42:06 [2022/08/22 05:54:02] ppocr INFO: epoch: [17/1200], global_step: 260, lr: 0.001000, loss: 4.958037, loss_shrink_maps: 3.525966, loss_threshold_maps: 0.924393, loss_binary_maps: 0.490204, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.52780 s, avg_samples: 16.0, ips: 30.31429 samples/s, eta: 5:41:25 Found inf or nan, current scale is: 3.1554436208840472e-30, decrease to: 3.1554436208840472e-300.5 [2022/08/22 05:54:02] ppocr INFO: epoch: [17/1200], global_step: 261, lr: 0.001000, loss: 4.958037, loss_shrink_maps: 3.525966, loss_threshold_maps: 0.929245, loss_binary_maps: 0.491992, avg_reader_cost: 0.00016 s, avg_batch_cost: 0.38202 s, avg_samples: 16.0, ips: 41.88283 samples/s, eta: 5:40:33 [2022/08/22 05:54:02] ppocr INFO: epoch: [17/1200], global_step: 262, lr: 0.001000, loss: 4.958037, loss_shrink_maps: 3.513160, loss_threshold_maps: 0.929245, loss_binary_maps: 0.491992, avg_reader_cost: 0.00017 s, avg_batch_cost: 0.28636 s, avg_samples: 16.0, ips: 55.87395 samples/s, eta: 5:39:34 Found inf or nan, current scale is: 1.5777218104420236e-30, decrease to: 1.5777218104420236e-300.5 [2022/08/22 05:54:03] ppocr INFO: epoch: [17/1200], global_step: 263, lr: 0.001000, loss: 4.988148, loss_shrink_maps: 3.513160, loss_threshold_maps: 0.931860, loss_binary_maps: 0.497072, avg_reader_cost: 0.00023 s, avg_batch_cost: 0.48528 s, avg_samples: 16.0, ips: 32.97056 samples/s, eta: 5:38:51 [2022/08/22 05:54:03] ppocr INFO: epoch: [17/1200], global_step: 264, lr: 0.001000, loss: 4.988148, loss_shrink_maps: 3.513160, loss_threshold_maps: 0.935539, loss_binary_maps: 0.502491, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.17870 s, avg_samples: 16.0, ips: 89.53567 samples/s, eta: 5:37:45 Found 
inf or nan, current scale is: 7.888609052210118e-31, decrease to: 7.888609052210118e-310.5 [2022/08/22 05:54:04] ppocr INFO: epoch: [17/1200], global_step: 265, lr: 0.001000, loss: 4.988148, loss_shrink_maps: 3.507436, loss_threshold_maps: 0.935539, loss_binary_maps: 0.502491, avg_reader_cost: 0.00017 s, avg_batch_cost: 0.93466 s, avg_samples: 16.0, ips: 17.11847 samples/s, eta: 5:37:35
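For anyone reading the log above, here is a minimal sketch for pulling the reported scale values out of such a log and counting how many halvings they imply. It assumes the run started from `Global.scale_loss=1024.0` (as in the command) and that every "Found inf or nan" message halves the scale; the trailing `0.5` in each message is read here as that halving factor, which is an interpretation of the log and not something confirmed by the source. The helper names `extract_scales` and `halvings_since` are made up for this illustration.

```python
import math
import re

# Matches the scale value in the scaler message as it appears in the log.
SCALE_RE = re.compile(r"current scale is: ([0-9.]+(?:e[+-]?\d+)?)")


def extract_scales(log_text):
    """Return every reported loss scale, in order of appearance."""
    return [float(value) for value in SCALE_RE.findall(log_text)]


def halvings_since(initial_scale, current_scale):
    """Number of times initial_scale must be halved to reach current_scale."""
    return round(math.log2(initial_scale / current_scale))


# First three scaler messages copied from the log above
# (the INFO lines between them are omitted).
excerpt = (
    "Found inf or nan, current scale is: 4.1359030627651384e-25, "
    "decrease to: 4.1359030627651384e-250.5 "
    "Found inf or nan, current scale is: 2.0679515313825692e-25, "
    "decrease to: 2.0679515313825692e-250.5 "
    "Found inf or nan, current scale is: 1.0339757656912846e-25, "
    "decrease to: 1.0339757656912846e-250.5"
)

scales = extract_scales(excerpt)
for scale in scales:
    print(scale, "halvings from 1024.0:", halvings_since(1024.0, scale))
```

Under those assumptions the first value in the excerpt corresponds to roughly 91 halvings from 1024.0. A scale that small no longer does anything useful, and the overflow messages keep coming anyway, which suggests the non-finite values are not caused by the scale being too large.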
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
Please provide the following information to quickly locate the problem
System Environment: X64 Nvidia V100s
Version: Paddle: 2.3.1, PaddleOCR: release 2.5 2909454fb121a002290172525e414f3017923def
Related components: Text detection
Command Code:
python3 -u tools/train.py -c configs/det/det_mv3_db.yml \
    -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained \
    Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.5/doc/doc_en/detection_en.md#24-mixed-precision-training
Complete Error Message: The loss scale keeps shrinking.
[2022/07/11 09:16:39] ppocr INFO: train with paddle 2.3.1 and device Place(gpu:0) [2022/07/11 09:16:39] ppocr INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/train_icdar2015_label.txt'] [2022/07/11 09:16:39] ppocr INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/test_icdar2015_label.txt'] W0711 09:16:39.307499 14970 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.6, Runtime API Version: 10.2 W0711 09:16:39.310671 14970 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6. [2022/07/11 09:16:40] ppocr INFO: load pretrain successful from ./pretrain_models/MobileNetV3_large_x0_5_pretrained [2022/07/11 09:16:40] ppocr INFO: train dataloader has 63 iters [2022/07/11 09:16:40] ppocr INFO: valid dataloader has 500 iters [2022/07/11 09:16:40] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations [2022/07/11 09:16:48] ppocr INFO: epoch: [1/1200], global_step: 10, lr: 0.001000, loss: 8.221910, loss_shrink_maps: 4.867956, loss_threshold_maps: 2.438209, loss_binary_maps: 0.969655, avg_reader_cost: 0.29622 s, avg_batch_cost: 0.82348 s, avg_samples: 16.0, ips: 19.42965 samples/s, eta: 17:17:27 [2022/07/11 09:16:52] ppocr INFO: epoch: [1/1200], global_step: 20, lr: 0.001000, loss: 7.021266, loss_shrink_maps: 4.803976, loss_threshold_maps: 1.252342, loss_binary_maps: 0.948038, avg_reader_cost: 0.00016 s, avg_batch_cost: 0.44537 s, avg_samples: 16.0, ips: 35.92532 samples/s, eta: 13:19:09 [2022/07/11 09:16:56] ppocr INFO: epoch: [1/1200], global_step: 30, lr: 0.001000, loss: 6.735630, loss_shrink_maps: 4.705022, loss_threshold_maps: 1.105840, loss_binary_maps: 0.914087, avg_reader_cost: 0.00018 s, avg_batch_cost: 0.41072 s, avg_samples: 16.0, ips: 38.95570 samples/s, eta: 11:45:08 [2022/07/11 09:17:01] ppocr INFO: epoch: [1/1200], global_step: 40, lr: 0.001000, loss: 6.225046,
loss_shrink_maps: 4.398505, loss_threshold_maps: 1.049841, loss_binary_maps: 0.783037, avg_reader_cost: 0.00016 s, avg_batch_cost: 0.42521 s, avg_samples: 16.0, ips: 37.62831 samples/s, eta: 11:02:39 Found inf or nan, current scale is: 1024.0, decrease to: 1024.00.5 [2022/07/11 09:17:05] ppocr INFO: epoch: [1/1200], global_step: 50, lr: 0.001000, loss: 5.660814, loss_shrink_maps: 3.937335, loss_threshold_maps: 1.011920, loss_binary_maps: 0.684919, avg_reader_cost: 0.03221 s, avg_batch_cost: 0.43740 s, avg_samples: 16.0, ips: 36.57950 samples/s, eta: 10:40:12 Found inf or nan, current scale is: 512.0, decrease to: 512.00.5 Found inf or nan, current scale is: 256.0, decrease to: 256.00.5 Found inf or nan, current scale is: 128.0, decrease to: 128.00.5 Found inf or nan, current scale is: 64.0, decrease to: 64.00.5 [2022/07/11 09:17:09] ppocr INFO: epoch: [1/1200], global_step: 60, lr: 0.001000, loss: 5.263339, loss_shrink_maps: 3.674418, loss_threshold_maps: 1.004394, loss_binary_maps: 0.554835, avg_reader_cost: 0.04229 s, avg_batch_cost: 0.38422 s, avg_samples: 16.0, ips: 41.64236 samples/s, eta: 10:14:03 Found inf or nan, current scale is: 32.0, decrease to: 32.00.5 [2022/07/11 09:17:09] ppocr INFO: epoch: [1/1200], global_step: 63, lr: 0.001000, loss: 5.082424, loss_shrink_maps: 3.563838, loss_threshold_maps: 1.001389, loss_binary_maps: 0.540972, avg_reader_cost: 0.00002 s, avg_batch_cost: 0.04596 s, avg_samples: 4.0, ips: 87.03618 samples/s, eta: 9:53:58 [2022/07/11 09:17:09] ppocr INFO: save model in ./output/db_mv3/latest Found inf or nan, current scale is: 16.0, decrease to: 16.00.5 Found inf or nan, current scale is: 8.0, decrease to: 8.00.5 Found inf or nan, current scale is: 4.0, decrease to: 4.00.5 [2022/07/11 09:17:18] ppocr INFO: epoch: [2/1200], global_step: 70, lr: 0.001000, loss: 5.126446, loss_shrink_maps: 3.616309, loss_threshold_maps: 0.996725, loss_binary_maps: 0.540972, avg_reader_cost: 0.52638 s, avg_batch_cost: 0.84896 s, avg_samples: 11.2, ips: 
13.19261 samples/s, eta: 11:27:12 Found inf or nan, current scale is: 2.0, decrease to: 2.00.5 Found inf or nan, current scale is: 1.0, decrease to: 1.00.5 Found inf or nan, current scale is: 0.5, decrease to: 0.50.5 Found inf or nan, current scale is: 0.25, decrease to: 0.250.5 [2022/07/11 09:17:22] ppocr INFO: epoch: [2/1200], global_step: 80, lr: 0.001000, loss: 5.208924, loss_shrink_maps: 3.684618, loss_threshold_maps: 0.984772, loss_binary_maps: 0.542171, avg_reader_cost: 0.02399 s, avg_batch_cost: 0.45255 s, avg_samples: 16.0, ips: 35.35532 samples/s, eta: 11:12:25 Found inf or nan, current scale is: 0.125, decrease to: 0.1250.5 Found inf or nan, current scale is: 0.0625, decrease to: 0.06250.5 Found inf or nan, current scale is: 0.03125, decrease to: 0.031250.5 Found inf or nan, current scale is: 0.015625, decrease to: 0.0156250.5 Found inf or nan, current scale is: 0.0078125, decrease to: 0.00781250.5 [2022/07/11 09:17:27] ppocr INFO: epoch: [2/1200], global_step: 90, lr: 0.001000, loss: 5.296472, loss_shrink_maps: 3.766433, loss_threshold_maps: 0.991639, loss_binary_maps: 0.546239, avg_reader_cost: 0.00649 s, avg_batch_cost: 0.39952 s, avg_samples: 16.0, ips: 40.04770 samples/s, eta: 10:53:29 Found inf or nan, current scale is: 0.00390625, decrease to: 0.003906250.5 Found inf or nan, current scale is: 0.001953125, decrease to: 0.0019531250.5 Found inf or nan, current scale is: 0.0009765625, decrease to: 0.00097656250.5 Found inf or nan, current scale is: 0.00048828125, decrease to: 0.000488281250.5 [2022/07/11 09:17:31] ppocr INFO: epoch: [2/1200], global_step: 100, lr: 0.001000, loss: 5.235338, loss_shrink_maps: 3.709980, loss_threshold_maps: 0.966067, loss_binary_maps: 0.554604, avg_reader_cost: 0.00449 s, avg_batch_cost: 0.41196 s, avg_samples: 16.0, ips: 38.83882 samples/s, eta: 10:39:54 Found inf or nan, current scale is: 0.000244140625, decrease to: 0.0002441406250.5 Found inf or nan, current scale is: 0.0001220703125, decrease to: 0.00012207031250.5 
Found inf or nan, current scale is: 6.103515625e-05, decrease to: 6.103515625e-050.5 Found inf or nan, current scale is: 3.0517578125e-05, decrease to: 3.0517578125e-050.5 Found inf or nan, current scale is: 1.52587890625e-05, decrease to: 1.52587890625e-050.5 [2022/07/11 09:17:35] ppocr INFO: epoch: [2/1200], global_step: 110, lr: 0.001000, loss: 5.082040, loss_shrink_maps: 3.560674, loss_threshold_maps: 0.971970, loss_binary_maps: 0.557947, avg_reader_cost: 0.02313 s, avg_batch_cost: 0.42525 s, avg_samples: 16.0, ips: 37.62496 samples/s, eta: 10:30:17 Found inf or nan, current scale is: 7.62939453125e-06, decrease to: 7.62939453125e-060.5 Found inf or nan, current scale is: 3.814697265625e-06, decrease to: 3.814697265625e-060.5 Found inf or nan, current scale is: 1.9073486328125e-06, decrease to: 1.9073486328125e-060.5 Found inf or nan, current scale is: 9.5367431640625e-07, decrease to: 9.5367431640625e-070.5 Found inf or nan, current scale is: 4.76837158203125e-07, decrease to: 4.76837158203125e-070.5 [2022/07/11 09:17:39] ppocr INFO: epoch: [2/1200], global_step: 120, lr: 0.001000, loss: 5.189270, loss_shrink_maps: 3.632653, loss_threshold_maps: 0.977407, loss_binary_maps: 0.567796, avg_reader_cost: 0.02587 s, avg_batch_cost: 0.42274 s, avg_samples: 16.0, ips: 37.84829 samples/s, eta: 10:22:00 Found inf or nan, current scale is: 2.384185791015625e-07, decrease to: 2.384185791015625e-070.5 Found inf or nan, current scale is: 1.1920928955078125e-07, decrease to: 1.1920928955078125e-070.5 Found inf or nan, current scale is: 5.960464477539063e-08, decrease to: 5.960464477539063e-080.5 [2022/07/11 09:17:40] ppocr INFO: epoch: [2/1200], global_step: 126, lr: 0.001000, loss: 5.260098, loss_shrink_maps: 3.705512, loss_threshold_maps: 0.967512, loss_binary_maps: 0.563709, avg_reader_cost: 0.00007 s, avg_batch_cost: 0.09690 s, avg_samples: 8.8, ips: 90.81209 samples/s, eta: 10:02:01 [2022/07/11 09:17:41] ppocr INFO: save model in ./output/db_mv3/latest Found inf or nan, 
current scale is: 2.9802322387695312e-08, decrease to: 2.9802322387695312e-080.5 Found inf or nan, current scale is: 1.4901161193847656e-08, decrease to: 1.4901161193847656e-080.5 [2022/07/11 09:17:48] ppocr INFO: epoch: [3/1200], global_step: 130, lr: 0.001000, loss: 5.305344, loss_shrink_maps: 3.738985, loss_threshold_maps: 0.976945, loss_binary_maps: 0.581226, avg_reader_cost: 0.52584 s, avg_batch_cost: 0.77257 s, avg_samples: 6.4, ips: 8.28406 samples/s, eta: 10:58:13 Found inf or nan, current scale is: 7.450580596923828e-09, decrease to: 7.450580596923828e-090.5 Found inf or nan, current scale is: 3.725290298461914e-09, decrease to: 3.725290298461914e-090.5 Found inf or nan, current scale is: 1.862645149230957e-09, decrease to: 1.862645149230957e-090.5 Found inf or nan, current scale is: 9.313225746154785e-10, decrease to: 9.313225746154785e-100.5 Found inf or nan, current scale is: 4.656612873077393e-10, decrease to: 4.656612873077393e-100.5 [2022/07/11 09:17:53] ppocr INFO: epoch: [3/1200], global_step: 140, lr: 0.001000, loss: 5.249604, loss_shrink_maps: 3.718364, loss_threshold_maps: 0.986788, loss_binary_maps: 0.567631, avg_reader_cost: 0.12347 s, avg_batch_cost: 0.43702 s, avg_samples: 16.0, ips: 36.61140 samples/s, eta: 10:50:22 Found inf or nan, current scale is: 2.3283064365386963e-10, decrease to: 2.3283064365386963e-100.5 Found inf or nan, current scale is: 1.1641532182693481e-10, decrease to: 1.1641532182693481e-100.5 Found inf or nan, current scale is: 5.820766091346741e-11, decrease to: 5.820766091346741e-110.5 Found inf or nan, current scale is: 2.9103830456733704e-11, decrease to: 2.9103830456733704e-110.5 [2022/07/11 09:17:57] ppocr INFO: epoch: [3/1200], global_step: 150, lr: 0.001000, loss: 5.256428, loss_shrink_maps: 3.741612, loss_threshold_maps: 0.964204, loss_binary_maps: 0.586466, avg_reader_cost: 0.03230 s, avg_batch_cost: 0.45468 s, avg_samples: 16.0, ips: 35.18988 samples/s, eta: 10:45:03 Found inf or nan, current scale is: 
1.4551915228366852e-11, decrease to: 1.4551915228366852e-110.5 Found inf or nan, current scale is: 7.275957614183426e-12, decrease to: 7.275957614183426e-120.5 Found inf or nan, current scale is: 3.637978807091713e-12, decrease to: 3.637978807091713e-120.5 Found inf or nan, current scale is: 1.8189894035458565e-12, decrease to: 1.8189894035458565e-120.5 Found inf or nan, current scale is: 9.094947017729282e-13, decrease to: 9.094947017729282e-130.5 [2022/07/11 09:18:01] ppocr INFO: epoch: [3/1200], global_step: 160, lr: 0.001000, loss: 5.478164, loss_shrink_maps: 3.850368, loss_threshold_maps: 0.975804, loss_binary_maps: 0.618165, avg_reader_cost: 0.01584 s, avg_batch_cost: 0.40239 s, avg_samples: 16.0, ips: 39.76208 samples/s, eta: 10:36:16 Found inf or nan, current scale is: 4.547473508864641e-13, decrease to: 4.547473508864641e-130.5 Found inf or nan, current scale is: 2.2737367544323206e-13, decrease to: 2.2737367544323206e-130.5 Found inf or nan, current scale is: 1.1368683772161603e-13, decrease to: 1.1368683772161603e-130.5 Found inf or nan, current scale is: 5.684341886080802e-14, decrease to: 5.684341886080802e-140.5 [2022/07/11 09:18:05] ppocr INFO: epoch: [3/1200], global_step: 170, lr: 0.001000, loss: 5.584779, loss_shrink_maps: 3.880050, loss_threshold_maps: 1.011591, loss_binary_maps: 0.651857, avg_reader_cost: 0.00963 s, avg_batch_cost: 0.38676 s, avg_samples: 16.0, ips: 41.36880 samples/s, eta: 10:27:22 Found inf or nan, current scale is: 2.842170943040401e-14, decrease to: 2.842170943040401e-140.5 Found inf or nan, current scale is: 1.4210854715202004e-14, decrease to: 1.4210854715202004e-140.5 Found inf or nan, current scale is: 7.105427357601002e-15, decrease to: 7.105427357601002e-150.5 [2022/07/11 09:18:09] ppocr INFO: epoch: [3/1200], global_step: 180, lr: 0.001000, loss: 5.951186, loss_shrink_maps: 4.058426, loss_threshold_maps: 1.086083, loss_binary_maps: 0.722890, avg_reader_cost: 0.01041 s, avg_batch_cost: 0.38568 s, avg_samples: 16.0, 
ips: 41.48559 samples/s, eta: 10:19:22 Found inf or nan, current scale is: 3.552713678800501e-15, decrease to: 3.552713678800501e-150.5 Found inf or nan, current scale is: 1.7763568394002505e-15, decrease to: 1.7763568394002505e-150.5 Found inf or nan, current scale is: 8.881784197001252e-16, decrease to: 8.881784197001252e-160.5 [2022/07/11 09:18:11] ppocr INFO: epoch: [3/1200], global_step: 189, lr: 0.001000, loss: 6.092818, loss_shrink_maps: 4.125790, loss_threshold_maps: 1.245787, loss_binary_maps: 0.762770, avg_reader_cost: 0.00253 s, avg_batch_cost: 0.17904 s, avg_samples: 13.6, ips: 75.96184 samples/s, eta: 10:01:43 [2022/07/11 09:18:11] ppocr INFO: save model in ./output/db_mv3/latest [2022/07/11 09:18:15] ppocr INFO: epoch: [4/1200], global_step: 190, lr: 0.001000, loss: 6.170770, loss_shrink_maps: 4.141250, loss_threshold_maps: 1.248021, loss_binary_maps: 0.767380, avg_reader_cost: 0.39041 s, avg_batch_cost: 0.41004 s, avg_samples: 1.6, ips: 3.90205 samples/s, eta: 10:25:40 Found inf or nan, current scale is: 4.440892098500626e-16, decrease to: 4.440892098500626e-160.5 Found inf or nan, current scale is: 2.220446049250313e-16, decrease to: 2.220446049250313e-160.5 Found inf or nan, current scale is: 1.1102230246251565e-16, decrease to: 1.1102230246251565e-160.5 Found inf or nan, current scale is: 5.551115123125783e-17, decrease to: 5.551115123125783e-170.5 [2022/07/11 09:18:21] ppocr INFO: epoch: [4/1200], global_step: 200, lr: 0.001000, loss: 6.144610, loss_shrink_maps: 4.135242, loss_threshold_maps: 1.254726, loss_binary_maps: 0.760699, avg_reader_cost: 0.15335 s, avg_batch_cost: 0.60302 s, avg_samples: 16.0, ips: 26.53325 samples/s, eta: 10:32:11 Found inf or nan, current scale is: 2.7755575615628914e-17, decrease to: 2.7755575615628914e-170.5 Found inf or nan, current scale is: 1.3877787807814457e-17, decrease to: 1.3877787807814457e-170.5 [2022/07/11 09:18:26] ppocr INFO: epoch: [4/1200], global_step: 210, lr: 0.001000, loss: 6.179070, 
loss_shrink_maps: 4.134474, loss_threshold_maps: 1.278543, loss_binary_maps: 0.763370, avg_reader_cost: 0.05691 s, avg_batch_cost: 0.47968 s, avg_samples: 16.0, ips: 33.35536 samples/s, eta: 10:30:42 [2022/07/11 09:18:31] ppocr INFO: epoch: [4/1200], global_step: 220, lr: 0.001000, loss: 6.515526, loss_shrink_maps: 4.268965, loss_threshold_maps: 1.428962, loss_binary_maps: 0.807194, avg_reader_cost: 0.03390 s, avg_batch_cost: 0.46141 s, avg_samples: 16.0, ips: 34.67609 samples/s, eta: 10:28:18 [2022/07/11 09:18:35] ppocr INFO: epoch: [4/1200], global_step: 230, lr: 0.001000, loss: 6.687632, loss_shrink_maps: 4.272030, loss_threshold_maps: 1.611205, loss_binary_maps: 0.818349, avg_reader_cost: 0.00737 s, avg_batch_cost: 0.40000 s, avg_samples: 16.0, ips: 40.00025 samples/s, eta: 10:22:45 Found inf or nan, current scale is: 6.938893903907228e-18, decrease to: 6.938893903907228e-180.5 [2022/07/11 09:18:40] ppocr INFO: epoch: [4/1200], global_step: 240, lr: 0.001000, loss: 6.629662, loss_shrink_maps: 4.176627, loss_threshold_maps: 1.611205, loss_binary_maps: 0.803029, avg_reader_cost: 0.05979 s, avg_batch_cost: 0.48557 s, avg_samples: 16.0, ips: 32.95097 samples/s, eta: 10:22:08 Found inf or nan, current scale is: 3.469446951953614e-18, decrease to: 3.469446951953614e-180.5 Found inf or nan, current scale is: 1.734723475976807e-18, decrease to: 1.734723475976807e-180.5 [2022/07/11 09:18:43] ppocr INFO: epoch: [4/1200], global_step: 250, lr: 0.001000, loss: 6.461374, loss_shrink_maps: 4.112732, loss_threshold_maps: 1.573650, loss_binary_maps: 0.755640, avg_reader_cost: 0.00745 s, avg_batch_cost: 0.29589 s, avg_samples: 16.0, ips: 54.07473 samples/s, eta: 10:12:02 [2022/07/11 09:18:43] ppocr INFO: epoch: [4/1200], global_step: 252, lr: 0.001000, loss: 6.377224, loss_shrink_maps: 4.042478, loss_threshold_maps: 1.526444, loss_binary_maps: 0.753372, avg_reader_cost: 0.00002 s, avg_batch_cost: 0.03050 s, avg_samples: 2.4, ips: 78.67676 samples/s, eta: 10:08:41 [2022/07/11 
09:18:43] ppocr INFO: save model in ./output/db_mv3/latest [2022/07/11 09:18:52] ppocr INFO: epoch: [5/1200], global_step: 260, lr: 0.001000, loss: 6.459626, loss_shrink_maps: 4.055850, loss_threshold_maps: 1.647522, loss_binary_maps: 0.762871, avg_reader_cost: 0.46459 s, avg_batch_cost: 0.88483 s, avg_samples: 12.8, ips: 14.46609 samples/s, eta: 10:32:37 Found inf or nan, current scale is: 8.673617379884035e-19, decrease to: 8.673617379884035e-190.5 [2022/07/11 09:18:56] ppocr INFO: epoch: [5/1200], global_step: 270, lr: 0.001000, loss: 6.694904, loss_shrink_maps: 4.036129, loss_threshold_maps: 1.869650, loss_binary_maps: 0.773863, avg_reader_cost: 0.06750 s, avg_batch_cost: 0.39193 s, avg_samples: 16.0, ips: 40.82393 samples/s, eta: 10:27:20 Found inf or nan, current scale is: 4.336808689942018e-19, decrease to: 4.336808689942018e-190.5 Found inf or nan, current scale is: 2.168404344971009e-19, decrease to: 2.168404344971009e-190.5 [2022/07/11 09:19:00] ppocr INFO: epoch: [5/1200], global_step: 280, lr: 0.001000, loss: 6.529672, loss_shrink_maps: 4.011022, loss_threshold_maps: 1.789790, loss_binary_maps: 0.754615, avg_reader_cost: 0.02522 s, avg_batch_cost: 0.44044 s, avg_samples: 16.0, ips: 36.32769 samples/s, eta: 10:24:36 Found inf or nan, current scale is: 1.0842021724855044e-19, decrease to: 1.0842021724855044e-190.5 [2022/07/11 09:19:05] ppocr INFO: epoch: [5/1200], global_step: 290, lr: 0.001000, loss: 6.269986, loss_shrink_maps: 3.874740, loss_threshold_maps: 1.691643, loss_binary_maps: 0.717595, avg_reader_cost: 0.01197 s, avg_batch_cost: 0.41866 s, avg_samples: 16.0, ips: 38.21691 samples/s, eta: 10:21:06 [2022/07/11 09:19:09] ppocr INFO: epoch: [5/1200], global_step: 300, lr: 0.001000, loss: 6.355210, loss_shrink_maps: 3.766056, loss_threshold_maps: 1.702717, loss_binary_maps: 0.717541, avg_reader_cost: 0.00944 s, avg_batch_cost: 0.42088 s, avg_samples: 16.0, ips: 38.01559 samples/s, eta: 10:17:55 Found inf or nan, current scale is: 
5.421010862427522e-20, decrease to: 5.421010862427522e-200.5 [2022/07/11 09:19:13] ppocr INFO: epoch: [5/1200], global_step: 310, lr: 0.001000, loss: 6.426630, loss_shrink_maps: 3.916010, loss_threshold_maps: 1.695006, loss_binary_maps: 0.743468, avg_reader_cost: 0.00975 s, avg_batch_cost: 0.39660 s, avg_samples: 16.0, ips: 40.34242 samples/s, eta: 10:13:58 [2022/07/11 09:19:14] ppocr INFO: epoch: [5/1200], global_step: 315, lr: 0.001000, loss: 6.435572, loss_shrink_maps: 3.916010, loss_threshold_maps: 1.665030, loss_binary_maps: 0.743468, avg_reader_cost: 0.00004 s, avg_batch_cost: 0.08695 s, avg_samples: 7.2, ips: 82.80697 samples/s, eta: 10:07:38 [2022/07/11 09:19:14] ppocr INFO: save model in ./output/db_mv3/latest [2022/07/11 09:19:21] ppocr INFO: epoch: [6/1200], global_step: 320, lr: 0.001000, loss: 6.414828, loss_shrink_maps: 3.916010, loss_threshold_maps: 1.665030, loss_binary_maps: 0.743468, avg_reader_cost: 0.46272 s, avg_batch_cost: 0.74409 s, avg_samples: 8.0, ips: 10.75140 samples/s, eta: 10:27:17 Found inf or nan, current scale is: 2.710505431213761e-20, decrease to: 2.710505431213761e-200.5 Found inf or nan, current scale is: 1.3552527156068805e-20, decrease to: 1.3552527156068805e-200.5 Found inf or nan, current scale is: 6.776263578034403e-21, decrease to: 6.776263578034403e-210.5
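The behaviour in the log can be reproduced with a small, framework-free simulation of dynamic loss scaling. This is only a sketch of the general technique, not PaddlePaddle's actual implementation: the class `DynamicLossScaler`, the helper `simulate`, and the chosen defaults are hypothetical. The parameter names are picked to resemble the arguments commonly exposed by AMP gradient scalers (decrease ratio, increase ratio, and the step counts that trigger each). The assumption is that a step with non-finite gradients is skipped and shrinks the scale, while a long enough run of clean steps grows it again.

```python
from dataclasses import dataclass


@dataclass
class DynamicLossScaler:
    """Toy dynamic loss scaler (illustrative only, not Paddle's code)."""

    scale: float = 1024.0              # initial loss scale, as in Global.scale_loss
    incr_ratio: float = 2.0            # growth factor after enough clean steps
    decr_ratio: float = 0.5            # shrink factor after overflow steps
    incr_every_n_steps: int = 1000     # clean steps needed before growing
    decr_every_n_nan_or_inf: int = 1   # overflow steps needed before shrinking
    _good_steps: int = 0
    _bad_steps: int = 0

    def update(self, found_inf: bool) -> bool:
        """Record one step; return True if the optimizer update would be applied."""
        if found_inf:
            self._good_steps = 0
            self._bad_steps += 1
            if self._bad_steps >= self.decr_every_n_nan_or_inf:
                self.scale *= self.decr_ratio
                self._bad_steps = 0
            return False  # the step is skipped
        self._bad_steps = 0
        self._good_steps += 1
        if self._good_steps >= self.incr_every_n_steps:
            self.scale *= self.incr_ratio
            self._good_steps = 0
        return True


def simulate(num_steps: int, overflow_every: int) -> DynamicLossScaler:
    """Run num_steps steps where every overflow_every-th step reports inf/nan."""
    scaler = DynamicLossScaler()
    for step in range(1, num_steps + 1):
        scaler.update(found_inf=(step % overflow_every == 0))
    return scaler


if __name__ == "__main__":
    # Four epochs of 63 iterations, with an overflow on every third step.
    result = simulate(num_steps=252, overflow_every=3)
    print("final scale:", result.scale)
```

With overflows arriving far more often than the growth interval, the scale in this sketch only ever shrinks, which matches the pattern in the log. If the inf/nan values come from something other than overly large scaled gradients, no amount of halving will stop them, and the scale just decays toward zero.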