Closed xueyangkk closed 8 months ago
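Training was interrupted with the error below. I am using PaddleDetection 2.6 with the config file configs/yolov3/yolov3_mobilenet_v1_roadsign.yml, Python 3.8, training on GPU.
My guess is that this is caused by dirty data: training worked before, and the following error appeared after I added a new batch of data.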
[06/28 22:28:10] ppdet.utils.checkpoint INFO: Finish loading model weights: C:\Users\admin/.cache/paddle/weights\yolov3_mobilenet_v1_270e_coco.pdparams
[06/28 22:28:24] ppdet.engine INFO: Epoch: [0] [ 0/1800] learning_rate: 0.000033 loss_xy: 8.161053 loss_wh: 8.367204 loss_obj: 11268.683594 loss_cls: 25.210361 loss: 11310.421875 eta: 12 days, 4:15:24 batch_cost: 14.6128 data_cost: 0.0000 ips: 0.5475 images/s
[06/28 22:28:53] ppdet.engine INFO: Epoch: [0] [ 20/1800] learning_rate: 0.000047 loss_xy: 8.716595 loss_wh: 6.026513 loss_obj: 42.985474 loss_cls: 23.787052 loss: 88.992310 eta: 1 day, 17:37:24 batch_cost: 1.4552 data_cost: 0.4778 ips: 5.4975 images/s
[06/28 22:29:29] ppdet.engine INFO: Epoch: [0] [ 40/1800] learning_rate: 0.000060 loss_xy: 8.432473 loss_wh: 4.086940 loss_obj: 35.054878 loss_cls: 20.522554 loss: 69.305420 eta: 1 day, 14:25:04 batch_cost: 1.7542 data_cost: 0.9155 ips: 4.5605 images/s
[06/28 22:30:06] ppdet.engine INFO: Epoch: [0] [ 60/1800] learning_rate: 0.000073 loss_xy: 8.480762 loss_wh: 3.381385 loss_obj: 19.048786 loss_cls: 18.307812 loss: 48.241104 eta: 1 day, 14:04:45 batch_cost: 1.8719 data_cost: 1.0622 ips: 4.2737 images/s
[06/28 22:30:38] ppdet.engine INFO: Epoch: [0] [ 80/1800] learning_rate: 0.000087 loss_xy: 7.727617 loss_wh: 3.046140 loss_obj: 16.288818 loss_cls: 14.990237 loss: 42.075302 eta: 1 day, 12:32:03 batch_cost: 1.5945 data_cost: 0.7549 ips: 5.0174 images/s
[06/28 22:31:11] ppdet.engine INFO: Epoch: [0] [ 100/1800] learning_rate: 0.000100 loss_xy: 7.994635 loss_wh: 2.754434 loss_obj: 13.801208 loss_cls: 12.051167 loss: 36.157940 eta: 1 day, 11:55:01 batch_cost: 1.6753 data_cost: 0.7810 ips: 4.7753 images/s
[06/28 22:31:43] ppdet.engine INFO: Epoch: [0] [ 120/1800] learning_rate: 0.000100 loss_xy: 8.131168 loss_wh: 3.052801 loss_obj: 13.287282 loss_cls: 11.182236 loss: 35.465401 eta: 1 day, 11:12:15 batch_cost: 1.5853 data_cost: 0.6628 ips: 5.0463 images/s
[06/28 22:32:31] ppdet.engine INFO: Epoch: [0] [ 140/1800] learning_rate: 0.000100 loss_xy: 7.563777 loss_wh: 2.575475 loss_obj: 12.552776 loss_cls: 9.570818 loss: 32.089897 eta: 1 day, 13:01:56 batch_cost: 2.4123 data_cost: 1.2899 ips: 3.3163 images/s
[06/28 22:33:15] ppdet.engine INFO: Epoch: [0] [ 160/1800] learning_rate: 0.000100 loss_xy: 8.167475 loss_wh: 2.565861 loss_obj: 12.666830 loss_cls: 9.063219 loss: 31.950251 eta: 1 day, 13:46:10 batch_cost: 2.1568 data_cost: 0.8382 ips: 3.7092 images/s
[06/28 22:33:53] ppdet.engine INFO: Epoch: [0] [ 180/1800] learning_rate: 0.000100 loss_xy: 7.886638 loss_wh: 2.721394 loss_obj: 13.510273 loss_cls: 9.057860 loss: 33.396034 eta: 1 day, 13:47:56 batch_cost: 1.9108 data_cost: 0.9279 ips: 4.1867 images/s
[06/28 22:34:29] ppdet.engine INFO: Epoch: [0] [ 200/1800] learning_rate: 0.000100 loss_xy: 8.326336 loss_wh: 2.735133 loss_obj: 12.947336 loss_cls: 8.869627 loss: 33.055279 eta: 1 day, 13:34:41 batch_cost: 1.7887 data_cost: 0.7630 ips: 4.4726 images/s
[06/28 22:35:01] ppdet.engine INFO: Epoch: [0] [ 220/1800] learning_rate: 0.000100 loss_xy: 8.354166 loss_wh: 2.529112 loss_obj: 12.337642 loss_cls: 8.584244 loss: 31.658493 eta: 1 day, 13:05:55 batch_cost: 1.6242 data_cost: 0.7925 ips: 4.9254 images/s
[06/28 22:35:31] ppdet.engine INFO: Epoch: [0] [ 240/1800] learning_rate: 0.000100 loss_xy: 7.499451 loss_wh: 2.443300 loss_obj: 11.669539 loss_cls: 7.888125 loss: 29.991325 eta: 1 day, 12:31:32 batch_cost: 1.5205 data_cost: 0.8860 ips: 5.2615 images/s
[06/28 22:36:03] ppdet.engine INFO: Epoch: [0] [ 260/1800] learning_rate: 0.000100 loss_xy: 7.472255 loss_wh: 2.409780 loss_obj: 12.110983 loss_cls: 7.969440 loss: 29.987885 eta: 1 day, 12:08:54 batch_cost: 1.5920 data_cost: 0.7320 ips: 5.0251 images/s
[06/28 22:36:37] ppdet.engine INFO: Epoch: [0] [ 280/1800] learning_rate: 0.000100 loss_xy: 7.550937 loss_wh: 2.396636 loss_obj: 12.425622 loss_cls: 8.017889 loss: 30.960117 eta: 1 day, 11:56:01 batch_cost: 1.6696 data_cost: 0.6553 ips: 4.7915 images/s
[06/28 22:37:11] ppdet.engine INFO: Epoch: [0] [ 300/1800] learning_rate: 0.000100 loss_xy: 7.220899 loss_wh: 2.285029 loss_obj: 10.711347 loss_cls: 7.458330 loss: 27.805481 eta: 1 day, 11:49:03 batch_cost: 1.7235 data_cost: 0.7382 ips: 4.6417 images/s
[06/28 22:37:43] ppdet.engine INFO: Epoch: [0] [ 320/1800] learning_rate: 0.000100 loss_xy: 7.497707 loss_wh: 2.355329 loss_obj: 10.980699 loss_cls: 7.796483 loss: 28.556570 eta: 1 day, 11:31:14 batch_cost: 1.5671 data_cost: 0.7901 ips: 5.1050 images/s
[06/28 22:38:17] ppdet.engine INFO: Epoch: [0] [ 340/1800] learning_rate: 0.000100 loss_xy: 7.436633 loss_wh: 2.223105 loss_obj: 11.777441 loss_cls: 7.848326 loss: 29.897726 eta: 1 day, 11:25:17 batch_cost: 1.7075 data_cost: 0.7549 ips: 4.6852 images/s
[06/28 22:38:49] ppdet.engine INFO: Epoch: [0] [ 360/1800] learning_rate: 0.000100 loss_xy: 7.666862 loss_wh: 2.284117 loss_obj: 11.125872 loss_cls: 7.536785 loss: 28.401432 eta: 1 day, 11:12:52 batch_cost: 1.6009 data_cost: 0.7408 ips: 4.9971 images/s
[06/28 22:39:27] ppdet.engine INFO: Epoch: [0] [ 380/1800] learning_rate: 0.000100 loss_xy: 7.282186 loss_wh: 2.159285 loss_obj: 11.435050 loss_cls: 7.537468 loss: 28.556595 eta: 1 day, 11:20:22 batch_cost: 1.8986 data_cost: 0.5518 ips: 4.2136 images/s
[06/28 22:39:59] ppdet.engine INFO: Epoch: [0] [ 400/1800] learning_rate: 0.000100 loss_xy: 6.916970 loss_wh: 2.071035 loss_obj: 10.250597 loss_cls: 7.302767 loss: 26.508217 eta: 1 day, 11:11:15 batch_cost: 1.6331 data_cost: 0.7088 ips: 4.8987 images/s
[06/28 22:40:34] ppdet.engine INFO: Epoch: [0] [ 420/1800] learning_rate: 0.000100 loss_xy: 7.672039 loss_wh: 2.190431 loss_obj: 11.384434 loss_cls: 7.603146 loss: 28.896381 eta: 1 day, 11:07:03 batch_cost: 1.7057 data_cost: 0.7347 ips: 4.6903 images/s
[06/28 22:41:05] ppdet.engine INFO: Epoch: [0] [ 440/1800] learning_rate: 0.000100 loss_xy: 6.648771 loss_wh: 2.225920 loss_obj: 10.090831 loss_cls: 7.079162 loss: 26.239040 eta: 1 day, 10:55:59 batch_cost: 1.5722 data_cost: 0.7077 ips: 5.0885 images/s
[06/28 22:41:41] ppdet.engine INFO: Epoch: [0] [ 460/1800] learning_rate: 0.000100 loss_xy: 7.387852 loss_wh: 2.227704 loss_obj: 10.702859 loss_cls: 7.366970 loss: 27.739613 eta: 1 day, 10:57:25 batch_cost: 1.7966 data_cost: 0.6833 ips: 4.4528 images/s
[06/28 22:42:15] ppdet.engine INFO: Epoch: [0] [ 480/1800] learning_rate: 0.000100 loss_xy: 7.103598 loss_wh: 2.179836 loss_obj: 10.278258 loss_cls: 7.473765 loss: 27.222702 eta: 1 day, 10:53:20 batch_cost: 1.6886 data_cost: 0.6251 ips: 4.7377 images/s
[06/28 22:42:48] ppdet.engine INFO: Epoch: [0] [ 500/1800] learning_rate: 0.000100 loss_xy: 8.051224 loss_wh: 2.427431 loss_obj: 11.223220 loss_cls: 8.016566 loss: 29.120567 eta: 1 day, 10:47:55 batch_cost: 1.6546 data_cost: 0.8777 ips: 4.8350 images/s
[06/28 22:43:25] ppdet.engine INFO: Epoch: [0] [ 520/1800] learning_rate: 0.000100 loss_xy: 6.920986 loss_wh: 2.273045 loss_obj: 10.669683 loss_cls: 7.252974 loss: 27.516359 eta: 1 day, 10:51:37 batch_cost: 1.8458 data_cost: 0.7141 ips: 4.3342 images/s
libpng error: IDAT: CRC error
[06/28 22:43:27] reader WARNING: fail to map sample transform [Decode_198974] with error: 'bytes' object has no attribute 'shape' and stack:
Traceback (most recent call last):
  File "D:\PythonWork\PaddleDetection\ppdet\data\reader.py", line 59, in __call__
    data = f(data)
  File "D:\PythonWork\PaddleDetection\ppdet\data\transform\operators.py", line 103, in __call__
    sample[i] = self.apply(sample[i], context)
  File "D:\PythonWork\PaddleDetection\ppdet\data\transform\operators.py", line 139, in apply
    sample['h'] = im.shape[0]
AttributeError: 'bytes' object has no attribute 'shape'

Exception in thread Thread-3:
Traceback (most recent call last):
  File "D:\software\anaconda\anaconda3\envs\air\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "D:\software\anaconda\anaconda3\envs\air\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\software\anaconda\anaconda3\envs\air\lib\site-packages\paddle\fluid\dataloader\dataloader_iter.py", line 217, in _thread_loop
    batch = self._dataset_fetcher.fetch(indices,
  File "D:\software\anaconda\anaconda3\envs\air\lib\site-packages\paddle\fluid\dataloader\fetcher.py", line 125, in fetch
    data.append(self.dataset[idx])
  File "D:\PythonWork\PaddleDetection\ppdet\data\source\dataset.py", line 102, in __getitem__
    return self.transform(roidb)
  File "D:\PythonWork\PaddleDetection\ppdet\data\reader.py", line 65, in __call__
    raise e
  File "D:\PythonWork\PaddleDetection\ppdet\data\reader.py", line 59, in __call__
    data = f(data)
  File "D:\PythonWork\PaddleDetection\ppdet\data\transform\operators.py", line 103, in __call__
    sample[i] = self.apply(sample[i], context)
  File "D:\PythonWork\PaddleDetection\ppdet\data\transform\operators.py", line 139, in apply
    sample['h'] = im.shape[0]
AttributeError: 'bytes' object has no attribute 'shape'
You can use try/except.
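Separately from catching the exception, it may be worth locating the damaged file itself. The `libpng error: IDAT: CRC error` line printed just before the traceback suggests that at least one PNG in the newly added data has a corrupted chunk. Below is a minimal sketch that walks the chunk structure of each PNG and recomputes the CRC with the standard library only. `png_crc_error` and `find_bad_pngs` are hypothetical helper names, not part of PaddleDetection, and the sketch only checks `.png` files; other formats would need a different check.

```python
import struct
import zlib
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_crc_error(data: bytes):
    """Return a description of the first problem found, or None if every chunk CRC matches."""
    if not data.startswith(PNG_SIGNATURE):
        return "missing PNG signature"
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        if pos + 8 > len(data):
            return "truncated chunk header"
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            return "truncated " + ctype.decode("latin-1") + " chunk"
        body = data[pos + 4:pos + 8 + length]  # CRC covers type + data
        (stored,) = struct.unpack(">I", data[pos + 8 + length:end])
        if zlib.crc32(body) & 0xFFFFFFFF != stored:
            return "CRC mismatch in " + ctype.decode("latin-1") + " chunk"
        if ctype == b"IEND":
            return None
        pos = end
    return "missing IEND chunk"


def find_bad_pngs(image_dir):
    """Hypothetical helper: list (path, problem) for every damaged .png under image_dir."""
    bad = []
    for path in sorted(Path(image_dir).rglob("*.png")):
        problem = png_crc_error(path.read_bytes())
        if problem is not None:
            bad.append((path, problem))
    return bad
```

Running `find_bad_pngs` on the image directory of the new batch should list the files to re-export or remove from the annotation files before training again.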
@lyuwenyu Could you explain how to use try/except to solve this?
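One possible reading of the try/except suggestion, offered as a sketch rather than the maintainers' actual fix: the traceback shows the exception escaping from the dataset's `__getitem__`, so the failure can be caught at that level and another sample loaded instead. The wrapper below is generic Python that works with any object providing `__len__` and `__getitem__`; the class name `SkipBadSamples` and its parameters are hypothetical and not a PaddleDetection API.

```python
import logging
import random

logger = logging.getLogger(__name__)


class SkipBadSamples:
    """Hypothetical wrapper: when a sample fails to load, retry with a random other index."""

    def __init__(self, dataset, max_retries=10, seed=None):
        self.dataset = dataset
        self.max_retries = max_retries
        self.rng = random.Random(seed)
        self.failed = set()  # indices that raised, for later inspection

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        for _ in range(self.max_retries + 1):
            try:
                return self.dataset[idx]
            except IndexError:
                # Out-of-range access is a caller bug, not dirty data.
                raise
            except Exception as exc:
                self.failed.add(idx)
                logger.warning("skipping sample %d: %r", idx, exc)
                idx = self.rng.randrange(len(self.dataset))
        raise RuntimeError("too many consecutive bad samples")
```

This keeps training running, but the bad samples are replaced silently and some images are seen more often than others, so inspecting `failed` afterwards and repairing or removing those files is likely the better long-term fix.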