TinyQi commented 6 months ago

Which dataset should be used to compress the V4 detection server model?

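With my own data, the validation accuracy is poor. Which dataset could I use for validation so that the model's accuracy is preserved as much as possible?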

ceci3 commented 6 months ago

Generally, the training data is enough. Also, which compression method are you using?

TinyQi commented 6 months ago

> Generally, the training data is enough. Also, which compression method are you using?

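I used my own training set, which has only about 1,000 images, and the compressed model's accuracy is very poor. In the first few iterations there was at least some accuracy, as shown in the figure below, but later it turned into this.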

TinyQi commented 6 months ago

> Generally, the training data is enough. Also, which compression method are you using?

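I am not sure which compression method I am using. I simply followed your documentation and changed the dataset and model paths; I also adjusted the learning rate according to batch_size and the number of GPUs. Below is my config file. Could you please check whether anything is wrong with it?

Global:
  model_type: det
  model_dir: /share/disk3/xcq/02.model_cache/pretrain_models/ch_PP-OCRv4_server_det_guding_shuru_1_output_bak/
  # fixed output
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  algorithm: DB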

Distillation:
  alpha: 1.0
  loss: l2

QuantAware:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8

TrainConfig:
  epochs: 5
  eval_iter: 200
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00000625
  optimizer_builder:
    optimizer:
      type: Adam
    weight_decay: 5.0e-05

PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5

Metric:
  name: DetMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /
    label_file_list:
    - /share/disk3/xcq/01.ImageData/028.LocationCharacterRecognition/train/ready_to_train/2023-11-29det/train.txt
    ratio_list: [1.0]
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 960
        - 960
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 1
    num_workers: 0

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /
    label_file_list:
    - /share/disk3/xcq/01.ImageData/028.LocationCharacterRecognition/train/ready_to_train/2023-11-29det/test.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest:
        # limit_side_len: 960
        # limit_type: 'max'
        image_shape: [960,960]
        keep_ratio: false
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1
    num_workers: 0
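For readers checking the learning-rate adjustment mentioned above, here is a minimal sketch of the linear scaling rule, under which the learning rate is scaled in proportion to the total batch size. The reference values (a base learning rate of 5e-5 tuned for a total batch size of 8) are assumptions chosen for illustration and are not stated in this thread; the function name `scale_learning_rate` is hypothetical and not part of PaddleSlim.

```python
def scale_learning_rate(base_lr, base_total_batch, batch_size_per_card, num_gpus):
    """Linear scaling rule: keep lr / total_batch_size constant."""
    total_batch = batch_size_per_card * num_gpus
    return base_lr * total_batch / base_total_batch


# Assumed reference setup (hypothetical, not taken from the thread).
BASE_LR = 5e-5
BASE_TOTAL_BATCH = 8

# Setup from the config above: batch_size_per_card is 1; one GPU is assumed.
scaled_lr = scale_learning_rate(BASE_LR, BASE_TOTAL_BATCH,
                                batch_size_per_card=1, num_gpus=1)
print(f"{scaled_lr:.8f}")
```

Under these assumed reference values, the result matches the `learning_rate` in the config. A rate this small combined with `batch_size_per_card: 1` may be one factor worth checking alongside the dataset.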

TinyQi commented 6 months ago

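Here are three more details that may help you track down the problem.

1. The model I use has a fixed input. When exporting the open-source V4 static model, I hard-coded the input size, so the exported model takes a fixed input (shape: [1,3,960,960]).
2. I reduced the model's outputs from the original two to a single one (keeping only sigmoid_11.tmp_0). When I first ran quantization-aware training directly on the V4 model, it failed with the error below. Comparing against the V3 model from the example, I found that V3 has only one output, so I tried keeping just one. With a single output the run no longer crashes, but the quantized model's results are unsatisfactory.

       Traceback (most recent call last):
         File "run.py", line 157, in <module>
           main()
         File "run.py", line 150, in main
           ac.compress()
         File "/home/anaconda3/envs/paddle_2.4.1_gpu/lib/python3.7/site-packages/paddleslim/auto_compression/compressor.py", line 594, in compress
           train_config)
         File "/home/anaconda3/envs/paddle_2.4.1_gpu/lib/python3.7/site-packages/paddleslim/auto_compression/compressor.py", line 776, in single_strategy_compress
           train_program_info, test_program_info, strategy, train_config)
         File "/home/anaconda3/envs/paddle_2.4.1_gpu/lib/python3.7/site-packages/paddleslim/auto_compression/compressor.py", line 825, in _start_train
           test_program_info.fetch_targets)
         File "run.py", line 80, in eval_function
           fetch_list=test_fetch_list)
       ValueError: too many values to unpack (expected 1)
       img.shape:(960, 960, 3)

3. I am using the open-source V4 detection model. I had previously tried ordinary fine-tuning with my own dataset mentioned above, but the fine-tuned results were always worse than the original open-source model. This is the main reason I suspect the dataset.

I hope this helps you track down the problem.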
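The `ValueError` in the traceback above is what Python raises when a call returns two results but the code unpacks exactly one. The following is a minimal sketch of that failure and of one way to pick an output by name instead of re-exporting the model. It does not reproduce the actual contents of `run.py`: `fake_run` is a hypothetical stand-in for the executor call, fetch targets are represented as plain strings, `hypothetical_other_output` is a made-up name, and treating `sigmoid_11.tmp_0` as the wanted output follows only from the description above.

```python
def fake_run(fetch_list):
    """Hypothetical stand-in for an executor call: one result per fetch target."""
    return [f"result_of_{name}" for name in fetch_list]


def select_output(outputs, fetch_names, wanted):
    """Return the output whose fetch name equals `wanted`."""
    if len(outputs) != len(fetch_names):
        raise ValueError("outputs and fetch_names differ in length")
    return dict(zip(fetch_names, outputs))[wanted]


# A model with two outputs, as described for the V4 model (first name is made up).
two_targets = ["hypothetical_other_output", "sigmoid_11.tmp_0"]

# Unpacking a single value fails as soon as two results come back.
try:
    (preds_map,) = fake_run(two_targets)
except ValueError as err:
    unpack_error = str(err)
    print("unpack failed:", unpack_error)

# Selecting by name works regardless of how many outputs the model has.
preds_map = select_output(fake_run(two_targets), two_targets, "sigmoid_11.tmp_0")
print(preds_map)
```

Whether the evaluation code should use this output or the other one is a separate question that only the model's export definition can answer.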

TinyQi commented 6 months ago

The quantization training script is: PaddleSlim/example/auto_compression/ocr/run.py

ceci3 commented 6 months ago

Which model is the V4 model, and how exactly can I get it?

huangguifeng commented 4 months ago

I ran into the same problem.

PaddlePaddle / PaddleSlim

Which dataset should be used to compress the V4 detection server model? #1873
