PaddlePaddle / PaddleX

Low-code development tool based on PaddlePaddle(飞桨低代码开发工具)
Apache License 2.0

Model training problem #731

Open chccc1994 opened 3 years ago

chccc1994 commented 3 years ago

Issue type: model training
Problem description

====================

Following the AI Studio classification tutorial "PaddleX快速上手-MobileNetV2图像分类" (PaddleX quick start: MobileNetV2 image classification), I ran the example locally. In both GPU and CPU mode, training stops after exactly one epoch, with no error message at all.


import matplotlib
matplotlib.use('Agg')
# Use GPU card 0 (if no GPU is available, this still falls back to CPU training)
import os
#os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import paddlex as pdx

from paddlex.cls import transforms

train_transforms = transforms.Compose([
    transforms.RandomCrop(crop_size=224),
    transforms.RandomHorizontalFlip(),
    transforms.Normalize()
])
eval_transforms = transforms.Compose([
    transforms.ResizeByShort(short_size=256),
    transforms.CenterCrop(crop_size=224),
    transforms.Normalize()
])

train_dataset = pdx.datasets.ImageNet(
    data_dir='D:/Github/PaddleX/mini_imagenet_veg',
    file_list='D:/Github/PaddleX/mini_imagenet_veg/train_list.txt',
    label_list='D:/Github/PaddleX/mini_imagenet_veg/labels.txt',
    transforms=train_transforms)
eval_dataset = pdx.datasets.ImageNet(
    data_dir='D:/Github/PaddleX/mini_imagenet_veg',
    file_list='D:/Github/PaddleX/mini_imagenet_veg/val_list.txt',
    label_list='D:/Github/PaddleX/mini_imagenet_veg/labels.txt',
    transforms=eval_transforms)

num_classes = len(train_dataset.labels)
model = pdx.cls.MobileNetV3_large_ssld(num_classes=num_classes)
model.train(num_epochs=12,
            train_dataset=train_dataset,
            train_batch_size=4,
            eval_dataset=eval_dataset,
            lr_decay_epochs=[6, 8],
            save_interval_epochs=1,
            learning_rate=0.00625,
            pretrain_weights='D:/Github/PaddleX/output/mobilenetv3_large_ssld/pretrain/MobileNetV3_large_x1_0_ssld',
            save_dir='D:/Github/PaddleX/output/mobilenetv3_large_ssld',
            use_vdl=True)
PS D:\GitHub\PaddleX>  d:; cd 'd:\GitHub\PaddleX'; & 'D:\Program Files\Anaconda3\envs\paddle-cpu\python.exe' 'c:\Users\Lenovo\.vscode\extensions\ms-python.python-2021.5.842923320\pythonFiles\lib\python\debugpy\launcher' '57710' 
'--' 'd:\GitHub\PaddleX\test_cls.py' 
[WARNING] pycocotools is not installed, detection model is not available now.
[WARNING] pycocotools install: https://paddlex.readthedocs.io/zh_CN/develop/install.html#pycocotools
2021-05-16 09:11:32 [INFO]      Starting to read file list from dataset...
2021-05-16 09:11:32 [INFO]      1415 samples in file D:/Github/PaddleX/mini_imagenet_veg/train_list.txt
2021-05-16 09:11:32 [INFO]      Starting to read file list from dataset...
2021-05-16 09:11:32 [INFO]      404 samples in file D:/Github/PaddleX/mini_imagenet_veg/val_list.txt
!!! The CPU_NUM is not specified, you should set CPU_NUM in the environment variable list.
CPU_NUM indicates that how many CPUPlace are used in the current task.
And if this parameter are set as N (equal to the number of physical CPU core) the program may be faster.

export CPU_NUM=8 # for example, set CPU_NUM as number of physical CPU core which is 8.

!!! The default number of CPU_NUM=1.
D:\Program Files\Anaconda3\envs\paddle-cpu\lib\site-packages\paddle\fluid\layers\math_op_patch.py:298: UserWarning: d:\GitHub\PaddleX\paddlex\cv\nets\mobilenet_v3.py:231
The behavior of expression A * B has been unified with elementwise_mul(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_mul(X, Y, axis=0) instead of A * B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
2021-05-16 09:11:37,454 - INFO - If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000100] in Optimizer will not take effect, and it will only be applied to other Parameters!
2021-05-16 09:11:44 [INFO]      Load pretrain weights from D:/Github/PaddleX/output/mobilenetv3_large_ssld/pretrain/MobileNetV3_large_x1_0_ssld.
2021-05-16 09:11:44 [WARNING]   [SKIP] Shape of pretrained weight D:/Github/PaddleX/output/mobilenetv3_large_ssld/pretrain/MobileNetV3_large_x1_0_ssld/fc_weights doesn't match.(Pretrained: (1280, 1000), Actual: (1280, 14))
2021-05-16 09:11:44 [WARNING]   [SKIP] Shape of pretrained weight D:/Github/PaddleX/output/mobilenetv3_large_ssld/pretrain/MobileNetV3_large_x1_0_ssld/fc_offset doesn't match.(Pretrained: (1000,), Actual: (14,))
2021-05-16 09:11:45 [INFO]      There are 268 varaibles in D:/Github/PaddleX/output/mobilenetv3_large_ssld/pretrain/MobileNetV3_large_x1_0_ssld are loaded.
2021-05-16 09:11:49 [INFO]      [TRAIN] Epoch=1/12, Step=2/353, loss=2.854281, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=1.9s, eta=2:52:19
2021-05-16 09:11:51 [INFO]      [TRAIN] Epoch=1/12, Step=4/353, loss=2.374653, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=1.46s, eta=2:12:49
2021-05-16 09:11:53 [INFO]      [TRAIN] Epoch=1/12, Step=6/353, loss=2.469338, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=1.32s, eta=1:59:37
2021-05-16 09:11:55 [INFO]      [TRAIN] Epoch=1/12, Step=8/353, loss=2.510099, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=1.29s, eta=1:57:15
2021-05-16 09:11:58 [INFO]      [TRAIN] Epoch=1/12, Step=10/353, loss=2.647606, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=1.24s, eta=1:52:34
2021-05-16 09:12:00 [INFO]      [TRAIN] Epoch=1/12, Step=12/353, loss=2.402791, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.27s, eta=1:55:19
2021-05-16 09:12:04 [INFO]      [TRAIN] Epoch=1/12, Step=14/353, loss=2.322535, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.34s, eta=2:1:47
2021-05-16 09:12:06 [INFO]      [TRAIN] Epoch=1/12, Step=16/353, loss=2.249278, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.31s, eta=1:58:59
2021-05-16 09:12:08 [INFO]      [TRAIN] Epoch=1/12, Step=18/353, loss=2.370554, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=1.29s, eta=1:56:42
2021-05-16 09:12:11 [INFO]      [TRAIN] Epoch=1/12, Step=20/353, loss=2.677417, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=1.29s, eta=1:56:39
2021-05-16 09:12:13 [INFO]      [TRAIN] Epoch=1/12, Step=22/353, loss=1.542102, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.23s, eta=1:50:50
2021-05-16 09:12:16 [INFO]      [TRAIN] Epoch=1/12, Step=24/353, loss=2.171031, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=1.23s, eta=1:50:55
2021-05-16 09:12:18 [INFO]      [TRAIN] Epoch=1/12, Step=26/353, loss=2.086524, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.24s, eta=1:52:29
2021-05-16 09:12:20 [INFO]      [TRAIN] Epoch=1/12, Step=28/353, loss=2.82382, acc1=0.0, acc5=0.5, lr=0.00625, time_each_step=1.24s, eta=1:52:22
2021-05-16 09:12:23 [INFO]      [TRAIN] Epoch=1/12, Step=30/353, loss=1.089124, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.25s, eta=1:53:4
2021-05-16 09:12:25 [INFO]      [TRAIN] Epoch=1/12, Step=32/353, loss=2.089222, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.24s, eta=1:52:16
2021-05-16 09:12:28 [INFO]      [TRAIN] Epoch=1/12, Step=34/353, loss=0.977595, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.2s, eta=1:47:56
2021-05-16 09:12:30 [INFO]      [TRAIN] Epoch=1/12, Step=36/353, loss=0.736391, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.2s, eta=1:48:28
2021-05-16 09:12:32 [INFO]      [TRAIN] Epoch=1/12, Step=38/353, loss=2.456619, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=1.21s, eta=1:48:39
2021-05-16 09:12:35 [INFO]      [TRAIN] Epoch=1/12, Step=40/353, loss=2.214644, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.21s, eta=1:49:8
2021-05-16 09:12:37 [INFO]      [TRAIN] Epoch=1/12, Step=42/353, loss=1.265121, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.19s, eta=1:47:24
2021-05-16 09:12:39 [INFO]      [TRAIN] Epoch=1/12, Step=44/353, loss=1.806681, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.19s, eta=1:47:17
2021-05-16 09:12:42 [INFO]      [TRAIN] Epoch=1/12, Step=46/353, loss=2.469732, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=1.2s, eta=1:48:3
2021-05-16 09:12:44 [INFO]      [TRAIN] Epoch=1/12, Step=48/353, loss=2.398724, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=1.18s, eta=1:46:31
2021-05-16 09:12:46 [INFO]      [TRAIN] Epoch=1/12, Step=50/353, loss=1.038455, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.18s, eta=1:46:5
2021-05-16 09:12:49 [INFO]      [TRAIN] Epoch=1/12, Step=52/353, loss=1.15792, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.16s, eta=1:44:37
2021-05-16 09:12:51 [INFO]      [TRAIN] Epoch=1/12, Step=54/353, loss=2.202028, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.15s, eta=1:43:19
2021-05-16 09:12:53 [INFO]      [TRAIN] Epoch=1/12, Step=56/353, loss=2.102946, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.13s, eta=1:41:59
2021-05-16 09:12:55 [INFO]      [TRAIN] Epoch=1/12, Step=58/353, loss=1.518231, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.14s, eta=1:42:35
2021-05-16 09:12:58 [INFO]      [TRAIN] Epoch=1/12, Step=60/353, loss=3.948138, acc1=0.0, acc5=0.5, lr=0.00625, time_each_step=1.12s, eta=1:41:0
2021-05-16 09:13:00 [INFO]      [TRAIN] Epoch=1/12, Step=62/353, loss=0.833666, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.12s, eta=1:40:52
2021-05-16 09:13:02 [INFO]      [TRAIN] Epoch=1/12, Step=64/353, loss=3.764681, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.13s, eta=1:41:9
2021-05-16 09:13:04 [INFO]      [TRAIN] Epoch=1/12, Step=66/353, loss=0.783596, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.13s, eta=1:40:55
2021-05-16 09:13:07 [INFO]      [TRAIN] Epoch=1/12, Step=68/353, loss=0.407645, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.13s, eta=1:41:11
2021-05-16 09:13:09 [INFO]      [TRAIN] Epoch=1/12, Step=70/353, loss=1.321479, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.14s, eta=1:42:10
2021-05-16 09:13:11 [INFO]      [TRAIN] Epoch=1/12, Step=72/353, loss=1.884662, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.15s, eta=1:42:36
2021-05-16 09:13:14 [INFO]      [TRAIN] Epoch=1/12, Step=74/353, loss=1.547193, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.15s, eta=1:43:6
2021-05-16 09:13:16 [INFO]      [TRAIN] Epoch=1/12, Step=76/353, loss=2.837774, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.17s, eta=1:44:46
2021-05-16 09:13:18 [INFO]      [TRAIN] Epoch=1/12, Step=78/353, loss=3.100572, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=1.16s, eta=1:43:43
2021-05-16 09:13:21 [INFO]      [TRAIN] Epoch=1/12, Step=80/353, loss=1.691995, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.16s, eta=1:43:44
2021-05-16 09:13:23 [INFO]      [TRAIN] Epoch=1/12, Step=82/353, loss=2.67856, acc1=0.0, acc5=0.5, lr=0.00625, time_each_step=1.18s, eta=1:45:8
2021-05-16 09:13:25 [INFO]      [TRAIN] Epoch=1/12, Step=84/353, loss=1.867083, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.18s, eta=1:45:21
2021-05-16 09:13:28 [INFO]      [TRAIN] Epoch=1/12, Step=86/353, loss=3.427734, acc1=0.0, acc5=0.5, lr=0.00625, time_each_step=1.16s, eta=1:43:49
2021-05-16 09:13:30 [INFO]      [TRAIN] Epoch=1/12, Step=88/353, loss=2.039117, acc1=0.5, acc5=0.5, lr=0.00625, time_each_step=1.19s, eta=1:46:21
2021-05-16 09:13:33 [INFO]      [TRAIN] Epoch=1/12, Step=90/353, loss=0.790187, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.2s, eta=1:46:48
2021-05-16 09:13:35 [INFO]      [TRAIN] Epoch=1/12, Step=92/353, loss=1.866684, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.2s, eta=1:47:19
2021-05-16 09:13:38 [INFO]      [TRAIN] Epoch=1/12, Step=94/353, loss=2.570975, acc1=0.0, acc5=1.0, lr=0.00625, time_each_step=1.21s, eta=1:47:46
2021-05-16 09:13:40 [INFO]      [TRAIN] Epoch=1/12, Step=96/353, loss=1.942309, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.19s, eta=1:46:30
2021-05-16 09:13:42 [INFO]      [TRAIN] Epoch=1/12, Step=98/353, loss=1.324016, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.19s, eta=1:46:9
2021-05-16 09:13:44 [INFO]      [TRAIN] Epoch=1/12, Step=100/353, loss=1.650462, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.18s, eta=1:45:33
2021-05-16 09:13:47 [INFO]      [TRAIN] Epoch=1/12, Step=102/353, loss=1.935575, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.18s, eta=1:44:49
2021-05-16 09:13:49 [INFO]      [TRAIN] Epoch=1/12, Step=104/353, loss=3.037313, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=1.17s, eta=1:44:19
2021-05-16 09:13:51 [INFO]      [TRAIN] Epoch=1/12, Step=106/353, loss=2.385971, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.19s, eta=1:45:30
2021-05-16 09:13:54 [INFO]      [TRAIN] Epoch=1/12, Step=108/353, loss=1.77018, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.18s, eta=1:44:40
2021-05-16 09:13:56 [INFO]      [TRAIN] Epoch=1/12, Step=110/353, loss=1.83204, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.17s, eta=1:44:21
2021-05-16 09:13:59 [INFO]      [TRAIN] Epoch=1/12, Step=112/353, loss=1.314484, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.16s, eta=1:43:25
2021-05-16 09:14:01 [INFO]      [TRAIN] Epoch=1/12, Step=114/353, loss=1.342953, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.16s, eta=1:43:30
2021-05-16 09:14:03 [INFO]      [TRAIN] Epoch=1/12, Step=116/353, loss=1.129265, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.17s, eta=1:43:38
2021-05-16 09:14:06 [INFO]      [TRAIN] Epoch=1/12, Step=118/353, loss=0.669128, acc1=1.0, acc5=1.0, lr=0.00625, time_each_step=1.18s, eta=1:44:59
2021-05-16 09:14:08 [INFO]      [TRAIN] Epoch=1/12, Step=120/353, loss=2.727278, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.2s, eta=1:46:16
2021-05-16 09:14:11 [INFO]      [TRAIN] Epoch=1/12, Step=122/353, loss=1.26348, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.19s, eta=1:45:59
2021-05-16 09:14:13 [INFO]      [TRAIN] Epoch=1/12, Step=124/353, loss=0.927593, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.2s, eta=1:46:47
2021-05-16 09:14:16 [INFO]      [TRAIN] Epoch=1/12, Step=126/353, loss=0.765602, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.21s, eta=1:47:38
2021-05-16 09:14:18 [INFO]      [TRAIN] Epoch=1/12, Step=128/353, loss=1.824102, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.19s, eta=1:45:48
2021-05-16 09:14:20 [INFO]      [TRAIN] Epoch=1/12, Step=130/353, loss=0.967193, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.18s, eta=1:44:15
2021-05-16 09:14:23 [INFO]      [TRAIN] Epoch=1/12, Step=132/353, loss=1.021794, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.19s, eta=1:45:28
2021-05-16 09:14:25 [INFO]      [TRAIN] Epoch=1/12, Step=134/353, loss=3.997103, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.17s, eta=1:43:25
2021-05-16 09:14:27 [INFO]      [TRAIN] Epoch=1/12, Step=136/353, loss=2.469917, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=1.17s, eta=1:43:26
2021-05-16 09:14:29 [INFO]      [TRAIN] Epoch=1/12, Step=138/353, loss=1.903409, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.16s, eta=1:42:58
2021-05-16 09:14:33 [INFO]      [TRAIN] Epoch=1/12, Step=140/353, loss=1.482494, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.21s, eta=1:47:22
2021-05-16 09:14:36 [INFO]      [TRAIN] Epoch=1/12, Step=142/353, loss=1.117136, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.27s, eta=1:52:44
2021-05-16 09:14:40 [INFO]      [TRAIN] Epoch=1/12, Step=144/353, loss=1.636326, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.34s, eta=1:58:22
2021-05-16 09:14:43 [INFO]      [TRAIN] Epoch=1/12, Step=146/353, loss=0.516034, acc1=1.0, acc5=1.0, lr=0.00625, time_each_step=1.35s, eta=1:59:5
2021-05-16 09:14:45 [INFO]      [TRAIN] Epoch=1/12, Step=148/353, loss=1.30567, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.38s, eta=2:1:54
2021-05-16 09:14:48 [INFO]      [TRAIN] Epoch=1/12, Step=150/353, loss=1.20249, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=1.42s, eta=2:5:30
2021-05-16 09:14:51 [INFO]      [TRAIN] Epoch=1/12, Step=152/353, loss=1.942227, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.41s, eta=2:4:29
2021-05-16 09:14:53 [INFO]      [TRAIN] Epoch=1/12, Step=154/353, loss=2.582361, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.44s, eta=2:6:39
2021-05-16 09:14:56 [INFO]      [TRAIN] Epoch=1/12, Step=156/353, loss=2.001447, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.44s, eta=2:6:57
2021-05-16 09:14:58 [INFO]      [TRAIN] Epoch=1/12, Step=158/353, loss=0.919706, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.44s, eta=2:7:6
2021-05-16 09:15:02 [INFO]      [TRAIN] Epoch=1/12, Step=160/353, loss=1.741587, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.45s, eta=2:7:22
2021-05-16 09:15:05 [INFO]      [TRAIN] Epoch=1/12, Step=162/353, loss=1.30932, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.44s, eta=2:6:46
2021-05-16 09:15:09 [INFO]      [TRAIN] Epoch=1/12, Step=164/353, loss=0.93362, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.45s, eta=2:7:30
2021-05-16 09:15:13 [INFO]      [TRAIN] Epoch=1/12, Step=166/353, loss=2.220438, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.5s, eta=2:11:51
2021-05-16 09:15:16 [INFO]      [TRAIN] Epoch=1/12, Step=168/353, loss=2.027865, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.54s, eta=2:15:34
2021-05-16 09:15:20 [INFO]      [TRAIN] Epoch=1/12, Step=170/353, loss=1.695084, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.58s, eta=2:19:8
2021-05-16 09:15:24 [INFO]      [TRAIN] Epoch=1/12, Step=172/353, loss=0.739986, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.64s, eta=2:24:17
2021-05-16 09:15:27 [INFO]      [TRAIN] Epoch=1/12, Step=174/353, loss=1.918942, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.69s, eta=2:28:56
2021-05-16 09:15:31 [INFO]      [TRAIN] Epoch=1/12, Step=176/353, loss=1.77137, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.76s, eta=2:34:36
2021-05-16 09:15:33 [INFO]      [TRAIN] Epoch=1/12, Step=178/353, loss=4.139359, acc1=0.0, acc5=0.25, lr=0.00625, time_each_step=1.76s, eta=2:34:49
2021-05-16 09:15:36 [INFO]      [TRAIN] Epoch=1/12, Step=180/353, loss=1.181681, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.73s, eta=2:31:37
2021-05-16 09:15:38 [INFO]      [TRAIN] Epoch=1/12, Step=182/353, loss=1.700288, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.68s, eta=2:27:16
2021-05-16 09:15:41 [INFO]      [TRAIN] Epoch=1/12, Step=184/353, loss=1.422162, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.6s, eta=2:20:35
2021-05-16 09:15:43 [INFO]      [TRAIN] Epoch=1/12, Step=186/353, loss=1.968671, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.51s, eta=2:12:33
2021-05-16 09:15:45 [INFO]      [TRAIN] Epoch=1/12, Step=188/353, loss=1.064108, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=1.44s, eta=2:6:16
2021-05-16 09:15:47 [INFO]      [TRAIN] Epoch=1/12, Step=190/353, loss=0.743902, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.34s, eta=1:57:11
2021-05-16 09:15:48 [INFO]      [TRAIN] Epoch=1/12, Step=192/353, loss=2.114687, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.24s, eta=1:48:39
2021-05-16 09:15:50 [INFO]      [TRAIN] Epoch=1/12, Step=194/353, loss=1.597399, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=1.16s, eta=1:41:37
2021-05-16 09:15:52 [INFO]      [TRAIN] Epoch=1/12, Step=196/353, loss=2.023708, acc1=0.0, acc5=1.0, lr=0.00625, time_each_step=1.07s, eta=1:33:50
2021-05-16 09:15:54 [INFO]      [TRAIN] Epoch=1/12, Step=198/353, loss=1.309041, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=1.03s, eta=1:30:33
2021-05-16 09:15:56 [INFO]      [TRAIN] Epoch=1/12, Step=200/353, loss=1.82116, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=0.98s, eta=1:25:21
2021-05-16 09:15:58 [INFO]      [TRAIN] Epoch=1/12, Step=202/353, loss=1.773573, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.96s, eta=1:23:54
2021-05-16 09:15:59 [INFO]      [TRAIN] Epoch=1/12, Step=204/353, loss=0.896617, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:22:0
2021-05-16 09:16:01 [INFO]      [TRAIN] Epoch=1/12, Step=206/353, loss=0.91664, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.92s, eta=1:20:16
2021-05-16 09:16:03 [INFO]      [TRAIN] Epoch=1/12, Step=208/353, loss=2.332864, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.89s, eta=1:17:41
2021-05-16 09:16:05 [INFO]      [TRAIN] Epoch=1/12, Step=210/353, loss=2.051008, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.91s, eta=1:19:27
2021-05-16 09:16:07 [INFO]      [TRAIN] Epoch=1/12, Step=212/353, loss=1.725529, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.91s, eta=1:19:49
2021-05-16 09:16:08 [INFO]      [TRAIN] Epoch=1/12, Step=214/353, loss=0.773272, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.89s, eta=1:17:50
2021-05-16 09:16:10 [INFO]      [TRAIN] Epoch=1/12, Step=216/353, loss=1.20987, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.89s, eta=1:17:22
2021-05-16 09:16:12 [INFO]      [TRAIN] Epoch=1/12, Step=218/353, loss=0.657913, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.92s, eta=1:19:55
2021-05-16 09:16:14 [INFO]      [TRAIN] Epoch=1/12, Step=220/353, loss=1.61027, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:20:27
2021-05-16 09:16:16 [INFO]      [TRAIN] Epoch=1/12, Step=222/353, loss=1.33217, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.91s, eta=1:19:26
2021-05-16 09:16:18 [INFO]      [TRAIN] Epoch=1/12, Step=224/353, loss=5.076754, acc1=0.25, acc5=0.5, lr=0.00625, time_each_step=0.91s, eta=1:19:31
2021-05-16 09:16:20 [INFO]      [TRAIN] Epoch=1/12, Step=226/353, loss=1.705602, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.93s, eta=1:21:6
2021-05-16 09:16:22 [INFO]      [TRAIN] Epoch=1/12, Step=228/353, loss=1.145589, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=0.94s, eta=1:21:32
2021-05-16 09:16:23 [INFO]      [TRAIN] Epoch=1/12, Step=230/353, loss=1.834419, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:20:5
2021-05-16 09:16:25 [INFO]      [TRAIN] Epoch=1/12, Step=232/353, loss=2.26116, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.93s, eta=1:20:47
2021-05-16 09:16:27 [INFO]      [TRAIN] Epoch=1/12, Step=234/353, loss=1.571537, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:21:50
2021-05-16 09:16:29 [INFO]      [TRAIN] Epoch=1/12, Step=236/353, loss=1.614016, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:21:33
2021-05-16 09:16:31 [INFO]      [TRAIN] Epoch=1/12, Step=238/353, loss=1.588427, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.92s, eta=1:19:33
2021-05-16 09:16:33 [INFO]      [TRAIN] Epoch=1/12, Step=240/353, loss=1.978822, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:20:7
2021-05-16 09:16:34 [INFO]      [TRAIN] Epoch=1/12, Step=242/353, loss=1.979334, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:19:56
2021-05-16 09:16:36 [INFO]      [TRAIN] Epoch=1/12, Step=244/353, loss=2.183573, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:19:31
2021-05-16 09:16:38 [INFO]      [TRAIN] Epoch=1/12, Step=246/353, loss=1.05959, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.9s, eta=1:17:53
2021-05-16 09:16:40 [INFO]      [TRAIN] Epoch=1/12, Step=248/353, loss=3.459516, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=0.91s, eta=1:18:56
2021-05-16 09:16:42 [INFO]      [TRAIN] Epoch=1/12, Step=250/353, loss=1.30059, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.92s, eta=1:19:27
2021-05-16 09:16:43 [INFO]      [TRAIN] Epoch=1/12, Step=252/353, loss=0.773429, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.9s, eta=1:18:2
2021-05-16 09:16:45 [INFO]      [TRAIN] Epoch=1/12, Step=254/353, loss=1.002563, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=0.89s, eta=1:17:10
2021-05-16 09:16:47 [INFO]      [TRAIN] Epoch=1/12, Step=256/353, loss=1.685566, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.93s, eta=1:20:32
2021-05-16 09:16:49 [INFO]      [TRAIN] Epoch=1/12, Step=258/353, loss=1.398521, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.93s, eta=1:20:38
2021-05-16 09:16:51 [INFO]      [TRAIN] Epoch=1/12, Step=260/353, loss=0.523885, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.93s, eta=1:20:7
2021-05-16 09:16:53 [INFO]      [TRAIN] Epoch=1/12, Step=262/353, loss=2.196222, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=0.93s, eta=1:20:14
2021-05-16 09:16:55 [INFO]      [TRAIN] Epoch=1/12, Step=264/353, loss=1.228397, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.95s, eta=1:21:43
2021-05-16 09:16:57 [INFO]      [TRAIN] Epoch=1/12, Step=266/353, loss=2.630463, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=0.95s, eta=1:22:23
2021-05-16 09:16:58 [INFO]      [TRAIN] Epoch=1/12, Step=268/353, loss=4.231372, acc1=0.0, acc5=0.5, lr=0.00625, time_each_step=0.94s, eta=1:20:52
2021-05-16 09:17:00 [INFO]      [TRAIN] Epoch=1/12, Step=270/353, loss=1.928168, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.94s, eta=1:20:57
2021-05-16 09:17:02 [INFO]      [TRAIN] Epoch=1/12, Step=272/353, loss=1.451839, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=0.96s, eta=1:22:25
2021-05-16 09:17:04 [INFO]      [TRAIN] Epoch=1/12, Step=274/353, loss=0.801516, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.96s, eta=1:22:21
2021-05-16 09:17:06 [INFO]      [TRAIN] Epoch=1/12, Step=276/353, loss=1.397709, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:19:28
2021-05-16 09:17:08 [INFO]      [TRAIN] Epoch=1/12, Step=278/353, loss=0.476907, acc1=1.0, acc5=1.0, lr=0.00625, time_each_step=0.93s, eta=1:20:29
2021-05-16 09:17:10 [INFO]      [TRAIN] Epoch=1/12, Step=280/353, loss=0.798368, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:20:36
2021-05-16 09:17:12 [INFO]      [TRAIN] Epoch=1/12, Step=282/353, loss=2.545827, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=0.94s, eta=1:20:42
2021-05-16 09:17:13 [INFO]      [TRAIN] Epoch=1/12, Step=284/353, loss=2.421606, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:18:59
2021-05-16 09:17:16 [INFO]      [TRAIN] Epoch=1/12, Step=286/353, loss=0.748575, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:20:38
2021-05-16 09:17:18 [INFO]      [TRAIN] Epoch=1/12, Step=288/353, loss=1.859833, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=0.95s, eta=1:22:1
2021-05-16 09:17:19 [INFO]      [TRAIN] Epoch=1/12, Step=290/353, loss=1.259574, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:20:38
2021-05-16 09:17:21 [INFO]      [TRAIN] Epoch=1/12, Step=292/353, loss=1.833191, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.92s, eta=1:18:56
2021-05-16 09:17:23 [INFO]      [TRAIN] Epoch=1/12, Step=294/353, loss=2.611031, acc1=0.5, acc5=0.5, lr=0.00625, time_each_step=0.94s, eta=1:20:39
2021-05-16 09:17:25 [INFO]      [TRAIN] Epoch=1/12, Step=296/353, loss=1.931686, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=0.93s, eta=1:20:5
2021-05-16 09:17:26 [INFO]      [TRAIN] Epoch=1/12, Step=298/353, loss=1.589654, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.91s, eta=1:18:15
2021-05-16 09:17:28 [INFO]      [TRAIN] Epoch=1/12, Step=300/353, loss=0.872491, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.9s, eta=1:17:16
2021-05-16 09:17:30 [INFO]      [TRAIN] Epoch=1/12, Step=302/353, loss=3.616052, acc1=0.0, acc5=0.75, lr=0.00625, time_each_step=0.91s, eta=1:18:7
2021-05-16 09:17:32 [INFO]      [TRAIN] Epoch=1/12, Step=304/353, loss=2.109635, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.91s, eta=1:18:15
2021-05-16 09:17:33 [INFO]      [TRAIN] Epoch=1/12, Step=306/353, loss=1.459583, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.88s, eta=1:15:44
2021-05-16 09:17:35 [INFO]      [TRAIN] Epoch=1/12, Step=308/353, loss=1.034375, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.86s, eta=1:14:3
2021-05-16 09:17:37 [INFO]      [TRAIN] Epoch=1/12, Step=310/353, loss=1.403825, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.91s, eta=1:18:10
2021-05-16 09:17:39 [INFO]      [TRAIN] Epoch=1/12, Step=312/353, loss=0.331182, acc1=1.0, acc5=1.0, lr=0.00625, time_each_step=0.93s, eta=1:19:15
2021-05-16 09:17:41 [INFO]      [TRAIN] Epoch=1/12, Step=314/353, loss=0.50653, acc1=1.0, acc5=1.0, lr=0.00625, time_each_step=0.91s, eta=1:17:38
2021-05-16 09:17:43 [INFO]      [TRAIN] Epoch=1/12, Step=316/353, loss=1.474663, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:18:54
2021-05-16 09:17:45 [INFO]      [TRAIN] Epoch=1/12, Step=318/353, loss=2.527382, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.94s, eta=1:20:1
2021-05-16 09:17:46 [INFO]      [TRAIN] Epoch=1/12, Step=320/353, loss=1.289207, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.93s, eta=1:19:38
2021-05-16 09:17:48 [INFO]      [TRAIN] Epoch=1/12, Step=322/353, loss=1.710762, acc1=0.75, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:18:24
2021-05-16 09:17:50 [INFO]      [TRAIN] Epoch=1/12, Step=324/353, loss=1.997516, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.92s, eta=1:18:22
2021-05-16 09:17:52 [INFO]      [TRAIN] Epoch=1/12, Step=326/353, loss=0.498283, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:20:25
2021-05-16 09:17:54 [INFO]      [TRAIN] Epoch=1/12, Step=328/353, loss=0.362424, acc1=1.0, acc5=1.0, lr=0.00625, time_each_step=0.96s, eta=1:22:1
2021-05-16 09:17:56 [INFO]      [TRAIN] Epoch=1/12, Step=330/353, loss=1.534446, acc1=0.25, acc5=1.0, lr=0.00625, time_each_step=0.92s, eta=1:18:25
2021-05-16 09:17:58 [INFO]      [TRAIN] Epoch=1/12, Step=332/353, loss=3.424131, acc1=0.0, acc5=0.25, lr=0.00625, time_each_step=0.93s, eta=1:19:32
2021-05-16 09:18:00 [INFO]      [TRAIN] Epoch=1/12, Step=334/353, loss=1.058479, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:20:6
2021-05-16 09:18:01 [INFO]      [TRAIN] Epoch=1/12, Step=336/353, loss=2.292683, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.93s, eta=1:18:51
2021-05-16 09:18:03 [INFO]      [TRAIN] Epoch=1/12, Step=338/353, loss=1.30996, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=0.91s, eta=1:17:25
2021-05-16 09:18:05 [INFO]      [TRAIN] Epoch=1/12, Step=340/353, loss=1.232938, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.94s, eta=1:19:49
2021-05-16 09:18:07 [INFO]      [TRAIN] Epoch=1/12, Step=342/353, loss=1.705322, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.93s, eta=1:19:25
2021-05-16 09:18:09 [INFO]      [TRAIN] Epoch=1/12, Step=344/353, loss=2.254051, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=0.96s, eta=1:21:28
2021-05-16 09:18:11 [INFO]      [TRAIN] Epoch=1/12, Step=346/353, loss=1.236704, acc1=0.5, acc5=1.0, lr=0.00625, time_each_step=0.96s, eta=1:21:52
2021-05-16 09:18:14 [INFO]      [TRAIN] Epoch=1/12, Step=348/353, loss=2.448665, acc1=0.25, acc5=0.75, lr=0.00625, time_each_step=0.98s, eta=1:23:13
2021-05-16 09:18:16 [INFO]      [TRAIN] Epoch=1/12, Step=350/353, loss=0.786336, acc1=0.75, acc5=1.0, lr=0.00625, time_each_step=1.0s, eta=1:24:53
2021-05-16 09:18:18 [INFO]      [TRAIN] Epoch=1/12, Step=352/353, loss=2.906855, acc1=0.5, acc5=0.75, lr=0.00625, time_each_step=1.0s, eta=1:24:36
PS D:\GitHub\PaddleX> 
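One incidental item in the log above is the `CPU_NUM` warning. It is unrelated to the silent exit, but since the run is on Windows PowerShell (where the suggested `export CPU_NUM=8` does not work), a minimal sketch of setting it from Python; the value 8 is just the example core count from the warning text:

```python
import os

# The log suggests `export CPU_NUM=8`; on Windows PowerShell the
# equivalent is `$env:CPU_NUM = '8'`, or set it from Python as below.
# This must happen before paddle/paddlex is imported.
os.environ['CPU_NUM'] = '8'
print(os.environ['CPU_NUM'])
```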
FlyingQianMM commented 3 years ago

Could you share your local GPU model, the CUDA/cuDNN versions, and the installed Paddle version?

chccc1994 commented 3 years ago
  1. CPU: Paddle 2.0.2; switching to a different model interrupts in the same way.

2. GPU: CUDA 11.0, cuDNN 8.0.5, paddlepaddle-gpu 2.1.0.post110; same interruption.

Neither CPU nor GPU mode works.
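Since both the CPU and GPU builds show the same silent exit, it may be worth ruling out the installation itself before debugging the training script. Paddle 2.x ships a self-check, `paddle.utils.run_check()`; a guarded sketch (the `except` branch is only for environments without Paddle):

```python
# Sanity-check the PaddlePaddle installation; run_check() reports
# whether compute (and CUDA, for GPU builds) works on this machine.
try:
    import paddle
    print('paddle', paddle.__version__)
    paddle.utils.run_check()
except ImportError:
    print('paddlepaddle is not installed in this environment')
```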

fangxin-debug commented 3 years ago

Hello, has this been resolved? I'm running into the same problem.

libingbingdev commented 3 years ago

Try downgrading scipy to 1.3.1.
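If the downgrade route is tried (`pip install scipy==1.3.1`), a quick check that the interpreter actually picked up the pinned version can save a confusing re-run. The version parsing below is a generic sketch, not part of the PaddleX API:

```python
# Confirm which scipy the current interpreter sees after pinning it.
def version_tuple(v):
    # crude parse of a 'major.minor.patch' string for comparison
    return tuple(int(p) for p in v.split('.')[:3])

try:
    import scipy
    status = 'OK' if version_tuple(scipy.__version__) <= (1, 3, 1) else 'newer than 1.3.1'
    print('scipy', scipy.__version__, status)
except ImportError:
    print('scipy is not installed in this environment')
```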

lishuai903 commented 3 years ago

@fangxin-debug @chccc1994 Has this been solved? Hoping someone can reply.