PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

EAST model: evaluation during training takes far too long #3310

Closed · tairen99 closed this issue 3 years ago

tairen99 commented 3 years ago

Hi PaddleOCR team and colleagues, first of all, thank you for providing such a great platform.

While training our own text detector with your EAST model on an AWS g4dn.2xlarge instance (8 CPUs, 32 GB memory, 16 GB GPU memory), I ran into extremely slow evaluation during training: validating just 4 images took roughly 6 hours.

The PaddleOCR version I am using is PaddleOCR release/2.0. The config file for my EAST training is as follows:

Global:
  use_gpu: true
  epoch_num: 2000
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: ./output/east_r50_vd/
  save_epoch_step: 100
  # evaluation is run every 500 iterations, starting from iteration 0
  eval_batch_step: [0, 500]
  # if pretrained_model is saved in static mode, load_static_weights must set to True
  load_static_weights: True
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/ResNet50_vd_pretrained/
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img:
  save_res_path: ./output/det_east/predicts_east.txt

Architecture:
  model_type: det
  algorithm: EAST
  Transform:
  Backbone:
    name: ResNet
    layers: 50
  Neck:
    name: EASTFPN
    model_name: large
  Head:
    name: EASTHead
    model_name: large

Loss:
  name: EASTLoss

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
  #  name: Cosine
    learning_rate: 0.001
  #  warmup_epoch: 0
  regularizer:
    name: 'L2'
    factor: 0

PostProcess:
  name: EASTPostProcess
  score_thresh: 0.8
  cover_thresh: 0.1
  nms_thresh: 0.2

Metric:
  name: DetMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/paddle_train/text_localization/
    label_file_list:
      - ./train_data/paddle_train/text_localization/train_tag.txt
    ratio_list: [1.0]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - EASTProcessTrain:
          image_shape: [512, 512]
          background_ratio: 0.125
          min_crop_side_ratio: 0.1
          min_text_size: 10
      - KeepKeys:
          keep_keys: ['image', 'score_map', 'geo_map', 'training_mask'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 8
    num_workers: 0

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/paddle_train/text_localization/
    label_file_list:
      - ./train_data/paddle_train/text_localization/test_tag.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
          limit_side_len: 2400
          limit_type: max
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 2

Screenshot of the evaluation process: (image)

Screenshot of resource usage: (image)

Searching this repository for the problem, I found this issue: https://github.com/PaddlePaddle/PaddleOCR/issues/196. When I went to replace the "nms_locality" function with "lanms.merge_quadrangle_n9" in "east_postprocess.py", I saw that your code reads:

    if self.is_python35:
        import lanms
        boxes = lanms.merge_quadrangle_n9(boxes, nms_thresh)
    else:
        boxes = nms_locality(boxes.astype(np.float64), nms_thresh)

Does this mean that this C++-accelerated function can only be used in a Python 3.5 environment? If so, doesn't that conflict with the Python 3.7 required by the quick installation guide? https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/installation_en.md
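
For reference, a minimal sketch of how that version check could be relaxed, assuming lanms can actually be built on the target interpreter. The try/except guard, the HAS_LANMS flag, and the merge_boxes helper are my own, not PaddleOCR code, and the import path for nms_locality is only my guess at where the function shown above lives:

    # Sketch of a version-independent selection between the C++ lanms NMS and the
    # pure-Python fallback. All names below except lanms.merge_quadrangle_n9 and
    # nms_locality are hypothetical.
    import numpy as np

    try:
        import lanms  # C++-accelerated locality-aware NMS from the external project
        HAS_LANMS = True
    except ImportError:
        HAS_LANMS = False

    from ppocr.postprocess.locality_aware_nms import nms_locality  # assumed path

    def merge_boxes(boxes, nms_thresh):
        boxes = boxes.astype(np.float64)
        if HAS_LANMS:
            return lanms.merge_quadrangle_n9(boxes, nms_thresh)
        # slow pure-Python path that the release/2.0 code falls back to
        return nms_locality(boxes, nms_thresh)

The point is only that the branch can be keyed on whether lanms imports successfully instead of on is_python35; whether the lanms C++ extension itself compiles under Python 3.7 is a separate question.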

I tried installing the lanms package with pip3.5, pip3.7, and pip3, but none of them succeeded: (image)

(image)

When I use the SAST model for text detection it also takes a very long time, seemingly because of the same nms_locality issue; a quick way to confirm that guess is sketched below.
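
To check that guess before changing any code, a small timing wrapper around the NMS call (in east_postprocess.py, or the SAST equivalent) would show how much of the evaluation time it accounts for. This is only a debugging aid; the helper name and placement are my own, not part of PaddleOCR:

    # Hypothetical debugging helper (not part of PaddleOCR): time one NMS call to
    # see whether it dominates the per-image evaluation time.
    import time
    import numpy as np

    def timed_nms(nms_fn, boxes, nms_thresh):
        start = time.time()
        kept = nms_fn(boxes.astype(np.float64), nms_thresh)
        elapsed = time.time() - start
        print("NMS over %d boxes took %.1fs, kept %d" % (len(boxes), elapsed, len(kept)))
        return kept

Wrapping the nms_locality call this way on a single evaluation image should make it obvious whether the pure-Python NMS, rather than the model forward pass, is what is consuming the hours.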

Right now I don't know how to solve this problem if I want to keep using PPOCR, so any advice or pointers would be greatly appreciated. Thanks in advance!

MissPenguin commented 3 years ago

You don't need to pip install lanms; when the code detects that the Python version is 3.5, it automatically takes the lanms branch. This accelerated NMS code was brought in from an external project, and since the original only has a py35 implementation, we simply wrote this branch. If you need it, you can debug this part of the code to make it compatible with higher Python versions.
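
For context on why the fallback path is slow, here is a rough, simplified sketch of what a locality-aware NMS does in pure Python. This is an illustration written for this thread, not PaddleOCR's actual nms_locality, but the shape of the cost is the same: every merge and suppression decision calls into Python-level polygon geometry, which adds up quickly when there are tens of thousands of candidate quads per image:

    # Simplified locality-aware NMS, for illustration only (not PaddleOCR's
    # nms_locality). Boxes are length-9 arrays: 8 quad coordinates plus a score.
    import numpy as np
    from shapely.geometry import Polygon

    def poly_iou(g, p):
        # polygon IoU via shapely; called once per candidate pair considered
        g, p = Polygon(g[:8].reshape(4, 2)), Polygon(p[:8].reshape(4, 2))
        if not g.is_valid or not p.is_valid:
            return 0.0
        inter = g.intersection(p).area
        union = g.area + p.area - inter
        return inter / union if union > 0 else 0.0

    def weighted_merge(g, p):
        # merge two overlapping quads, weighting coordinates by their scores
        merged = np.zeros(9)
        merged[:8] = (g[8] * g[:8] + p[8] * p[:8]) / (g[8] + p[8])
        merged[8] = g[8] + p[8]
        return merged

    def standard_nms(polys, thresh):
        polys = np.array(polys)
        order = polys[:, 8].argsort()[::-1]
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(polys[i])
            ious = np.array([poly_iou(polys[i], polys[j]) for j in order[1:]])
            order = order[1:][ious <= thresh]
        return keep

    def nms_locality_sketch(polys, thresh=0.3):
        # first pass: merge consecutive (locally adjacent) overlapping quads,
        # then run standard NMS on the merged survivors
        merged, prev = [], None
        for g in polys:
            if prev is not None and poly_iou(g, prev) > thresh:
                prev = weighted_merge(g, prev)
            else:
                if prev is not None:
                    merged.append(prev)
                prev = g
        if prev is not None:
            merged.append(prev)
        return standard_nms(merged, thresh) if merged else []

The C++ lanms implementation performs the same two passes with native geometry code, which is essentially why it is so much faster on large candidate sets.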

tairen99 commented 3 years ago

Hi @MissPenguin, thanks for your reply. I tried running directly under Python 3.5, but got an import error as shown below:

(image)

So even using Python 3.5 directly doesn't seem to get me around this version problem. Do you have any other way to speed up training that doesn't require modifying the lanms source code? Thanks!

abdksyed commented 3 years ago

@tairen99 Hi, I faced the same issue: the NMS is running on about 30k boxes, so it takes a very long time. I see that you closed this issue; could you share how you solved it?