PicoDet quantization-aware training: after some epochs mAP suddenly drops to 0 and the loss starts to spike

PaddlePaddle / PaddleDetection


PicoDet quantization-aware training: after some epochs mAP suddenly drops to 0 and the loss starts to spike #4010

Closed youngstu closed 2 years ago

youngstu commented 3 years ago

With PicoDet quantization-aware training, after some epochs the mAP suddenly drops to 0 and the loss starts to spike.

lyuwenyu commented 3 years ago

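There isn't enough information here. 1. Is the lr too large? 2. Check whether the data contains any anomalies. 3. Add clip norm to stabilize training?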
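To make suggestion 3 concrete, here is a minimal sketch of what clipping by global norm does, written in plain Python so it runs without Paddle. The function name `clip_by_global_norm` and the toy gradients are illustrative assumptions, not code from PaddleDetection. In Paddle itself the equivalent is `paddle.nn.ClipGradByGlobalNorm`, passed to an optimizer through its `grad_clip` argument; how to enable it in this particular training config is not shown in the thread.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients by one shared factor so their joint L2 norm is at most clip_norm.

    grads: list of flat gradient lists, one per parameter tensor (toy stand-in for tensors).
    Returns the clipped gradients and the global norm measured before clipping.
    """
    global_norm = math.sqrt(sum(g * g for tensor in grads for g in tensor))
    # If the norm is already within the limit, the factor is 1 and nothing changes.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in tensor] for tensor in grads]
    return clipped, global_norm


# Toy example: two "tensors" whose joint norm is 5.0, clipped to norm 1.0.
clipped, norm_before = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)
print(norm_before, clipped)
```

Because every gradient is multiplied by the same factor, the update direction is preserved and only its size is bounded, which is why clipping is a common first remedy when a loss suddenly blows up.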

youngstu commented 3 years ago

slim-config:
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ShuffleNetV2_x1_0_pretrained.pdparams
slim: QAT 

QAT:
  quant_config: {
    'activation_preprocess_type': 'PACT',
    'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
    'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
    'quantizable_layer_type': ['Conv2D', 'Linear']}
  print_model: False

epoch: 28
LearningRate:
  base_lr: 0.01
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1 
    milestones:
    - 18
    - 25
  - !LinearWarmup
    start_factor: 0.
    steps: 1000
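As a rough aid for reading the schedule above, the following sketch computes the learning rate at a given iteration from the posted values (`base_lr: 0.01`, `gamma: 0.1`, milestones at epochs 18 and 25, 1000 warmup steps from `start_factor: 0.`). It is a simplified re-implementation for illustration, not PaddleDetection's scheduler code. `steps_per_epoch` is a hypothetical value, since the thread does not give the dataset or batch size, and the sketch assumes warmup ends before the first milestone.

```python
def lr_at(step, steps_per_epoch, base_lr=0.01, gamma=0.1,
          milestones=(18, 25), warmup_steps=1000, start_factor=0.0):
    """Approximate learning rate at a global iteration for the posted schedule."""
    if step < warmup_steps:
        # LinearWarmup: ramp the multiplier linearly from start_factor to 1.
        alpha = step / warmup_steps
        return base_lr * (start_factor * (1.0 - alpha) + alpha)
    # PiecewiseDecay: multiply by gamma once for every milestone epoch already reached.
    epoch = step // steps_per_epoch
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


# Hypothetical: 2000 iterations per epoch.
for epoch in (0, 1, 7, 18, 25):
    print(epoch, lr_at(epoch * 2000, steps_per_epoch=2000))
```

Under these assumptions the rate reaches the full `base_lr` of 0.01 after the first 1000 iterations and stays there until epoch 18, so any divergence in the early epochs would be happening at the undecayed learning rate.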

[Image: Screenshot from 2021-08-18 21-21-04]

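The first few epochs converged well, with performance essentially on par with the float32 model, but starting from epoch 7 training gradually drifted off course.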

mzxhzhp commented 3 years ago

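"With PicoDet quantization-aware training, after some epochs the mAP suddenly drops to 0 and the loss starts to spike."

I ran into the same problem. During quantization-aware training the loss was around one point something, and when I checked again a while later it had suddenly jumped to over a hundred. Frustrating.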

yghstill commented 2 years ago

@mzxhzhp @youngstu For single-GPU training, we recommend setting the learning rate lower.
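One common way to pick a smaller single-GPU learning rate is the linear scaling rule: scale the rate in proportion to the total batch size. The sketch below is only an illustration of that rule. The reference setup (4 GPUs with 32 images each) is a hypothetical assumption, since the thread does not say how many GPUs or what batch size the posted `base_lr: 0.01` was tuned for, and the function name `scale_lr` is made up for this example.

```python
def scale_lr(base_lr, ref_gpus, ref_batch_per_gpu, gpus, batch_per_gpu):
    """Linear scaling rule: learning rate proportional to the total batch size."""
    ref_total = ref_gpus * ref_batch_per_gpu
    total = gpus * batch_per_gpu
    return base_lr * total / ref_total


# Hypothetical reference setup: base_lr 0.01 tuned for 4 GPUs x 32 images.
# Training on 1 GPU x 32 images would then suggest a quarter of that rate.
single_gpu_lr = scale_lr(0.01, ref_gpus=4, ref_batch_per_gpu=32,
                         gpus=1, batch_per_gpu=32)
print(single_gpu_lr)
```

Treat the result as a starting point only; quantization-aware fine-tuning may need a rate lower still than the rule suggests.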

paddle-bot-old[bot] commented 2 years ago

Since this issue has not been updated for more than three months, it will be closed. If it is not solved or there is a follow-up issue, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.