Closed Ycxyue closed 1 year ago
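When the first epoch finishes and validation starts, I get an out-of-GPU-memory error. The checkpoint for the first epoch is saved, so the error occurs while the validation set is being loaded after the first training epoch. Validation uses batch_size=1; mmocr==0.2.0; mmdet==2.16.0; mmcv-full==1.3.12. If I set validation to false, training runs through all 17 epochs.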
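For readers following along, the setup described above might look roughly like this in an mmcv-style Python config. This is only a sketch: the key names (`val_dataloader`, `evaluation`, `total_epochs`) follow common MMOCR/MMDetection config conventions and are assumptions, not copied from the actual config, and `validate` is a hypothetical stand-in for however validation was switched off.

```python
# Hypothetical config excerpt (plain Python dicts, as in mmcv-style configs).
# Only the values stated in the post are filled in; key names are assumed.

data = dict(
    # Validation batch size of 1, as reported in the post.
    val_dataloader=dict(samples_per_gpu=1),
)

# Assumed: validation runs after every epoch, which matches the error
# appearing right after the epoch-1 checkpoint is saved.
evaluation = dict(interval=1)

# The post reports 17 epochs in total.
total_epochs = 17

# Hypothetical switch: with validation disabled, all 17 epochs complete.
validate = False
```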
Could you show the complete error report?
Thank you very much!
I am training the table structure extraction part first. The error is as follows:
"""
2021-09-06 06:42:22,383 - mmocr - INFO - Epoch [1][13/13] lr: 4.933e-04, eta: 0:02:28, time: 0.457, data_time: 0.005,
memory: 12909, loss_ce: 3.1415, horizon_bbox_loss: 0.4799, vertical_bbox_loss: 0.6473, loss: 4.2686, grad_norm: 58.6805
2021-09-06 06:42:22,418 - mmocr - INFO - Saving checkpoint at 1 epochs
[ ] 0/99, elapsed: 0s, ETA:Traceback (most recent call last):
File "./tools/train.py", line 228, in
I faced the same issue. Can someone help me?