PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

Number of epochs during training is lower than the configured epoch #8032

Open wsy-yjys opened 1 year ago

wsy-yjys commented 1 year ago

Search before asking

Bug Component

No response

Describe the Bug

Following ppyoloe_plus_crn_t_auxhead_300e_coco.yml, I created my own config, ppyoloe_plus_crn_t_auxhead_300e_gdut.yml, with the following contents:

_BASE_: [
  '../datasets/GDUT.yml',
  '../runtime.yml',
  './_base_/optimizer_300e.yml',
  './_base_/ppyoloe_plus_crn_tiny_auxhead.yml',
  './_base_/ppyoloe_plus_reader.yml',
]

epoch: 300

LearningRate:
  base_lr: 0.0025     # originally base_lr=0.01 with 8 GPUs x 8 images per GPU; total batch size is now 16
  schedulers:
    - name: CosineDecay
      max_epochs: 360
    - name: LinearWarmup
      start_factor: 0.
      epochs: 5

log_iter: 100
snapshot_epoch: 10     # save weights every 10 epochs
weights: output/ppyoloe_plus_crn_t_auxhead_300e_gdut/model_final

TrainReader:
  batch_size: 4         # 4 GPUs, 4 images per GPU
EvalReader:
  batch_size: 2

# pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_t_pretrained.pdparams
depth_mult: 0.33
width_mult: 0.375
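
The base_lr value above matches linear scaling of the learning rate with the total batch size, as the comment on that line suggests. A minimal sketch of that arithmetic, assuming the linear scaling rule is what was applied; the function name `scale_lr` is hypothetical and not part of PaddleDetection:

```python
# Hypothetical helper: scale a reference learning rate linearly with
# the total batch size (number of GPUs x images per GPU).
def scale_lr(ref_lr, ref_total_batch, new_total_batch):
    return ref_lr * new_total_batch / ref_total_batch

ref_total_batch = 8 * 8   # reference setup from the comment: 8 GPUs x 8 images
new_total_batch = 4 * 4   # this config: 4 GPUs x 4 images (TrainReader batch_size: 4)

new_lr = scale_lr(0.01, ref_total_batch, new_total_batch)
print(new_lr)  # 0.0025
```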

The epoch is set to 300, but the log printed during training looks like this:

[04/03 17:07:42] ppdet.engine INFO: Epoch: [0] [ 0/99] learning_rate: 0.000000 loss: 8.639099 loss_cls: 3.119581 loss_iou: 1.255225 loss_dfl: 4.762912 loss_l1: 11.159260 eta: 1 day, 12:28:33 batch_cost: 4.4213 data_cost: 0.0005 ips: 0.9047 images/s

Epoch: [0] [ 0/99] seems to show that there are only 100 epochs. How can I fix this? Any help is appreciated.

Environment

Bug description confirmation

Are you willing to submit a PR?

nemonameless commented 1 year ago

Epoch: [0] [ 0/99] means this is epoch 0, so training is still far from the 300 epochs you configured. The 99 is the number of iterations per epoch.
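
This reading can be cross-checked against the eta in the same log line. A minimal sketch, assuming the eta is roughly the number of remaining iterations multiplied by batch_cost; the regular expression and the function name `parse_progress` are illustrative, not part of PaddleDetection:

```python
import re
from datetime import timedelta

# Start of the log line from the report above, up to the progress field.
LOG_LINE = "[04/03 17:07:42] ppdet.engine INFO: Epoch: [0] [ 0/99] learning_rate: 0.000000"

def parse_progress(line):
    """Return (epoch_id, step_id, iters_per_epoch) from an 'Epoch: [e] [s/n]' field."""
    match = re.search(r"Epoch: \[(\d+)\] \[\s*(\d+)/(\d+)\]", line)
    if match is None:
        raise ValueError("no progress field found in log line")
    return tuple(int(group) for group in match.groups())

epoch_id, step_id, iters_per_epoch = parse_progress(LOG_LINE)

configured_epochs = 300                       # 'epoch: 300' in the config
total_iters = configured_epochs * iters_per_epoch
print(total_iters)                            # 29700

# Values copied from the log line: eta 1 day, 12:28:33 and batch_cost 4.4213.
eta_seconds = timedelta(days=1, hours=12, minutes=28, seconds=33).total_seconds()
batch_cost = 4.4213
implied_iters = eta_seconds / batch_cost
print(round(implied_iters))                   # 29700
```

Both numbers agree: the eta at step 0 corresponds to about 29700 iterations, which is 300 epochs of 99 iterations each, so the run is scheduled for the full 300 epochs.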