PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

EAST network validation takes too long #196

Closed NextGuido closed 4 years ago

NextGuido commented 4 years ago

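When I run the EAST network for ICDAR text detection, the validation stage hangs on this screen. It has been almost 20 minutes now, and the same thing happens every time I retry. Why is this? (screenshot: 选区_001)

The training config file configs/det/det_r50_vd_east.yml is as follows: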

Global:
  algorithm: EAST
  use_gpu: true
  epoch_num: 100000
  log_smooth_window: 20
  print_batch_step: 5
  save_model_dir: ./output/det_east/
  save_epoch_step: 200
  eval_batch_step: 20
  train_batch_size_per_card: 48
  test_batch_size_per_card: 16
  image_shape: [3, 512, 512]
  reader_yml: ./configs/det/det_east_icdar15_reader.yml
  pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
  save_res_path: ./output/det_east/predicts_east.txt
  checkpoints:
  save_inference_dir:

Architecture:
  function: ppocr.modeling.architectures.det_model,DetModel

Backbone:
  function: ppocr.modeling.backbones.det_resnet_vd,ResNet
  layers: 50

Head:
  function: ppocr.modeling.heads.det_east_head,EASTHead
  model_name: large

Loss:
  function: ppocr.modeling.losses.det_east_loss,EASTLoss

Optimizer:
  function: ppocr.optimizer,AdamDecay
  base_lr: 0.001
  beta1: 0.9
  beta2: 0.999

PostProcess:
  function: ppocr.postprocess.east_postprocess,EASTPostPocess
  score_thresh: 0.8
  cover_thresh: 0.1
  nms_thresh: 0.2

MissPenguin commented 4 years ago

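This is because locality_aware_nms in the EAST post-processing is slow. The code currently uses a slow Python implementation. You can follow the EAST source code and swap in the faster C++ version: https://github.com/argman/EAST/blob/dca414de39a3a4915a019c9a02c1832a31cdd0ca/eval.py#L100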
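The swap suggested above could look roughly like the following. This is a minimal sketch, not the project's actual code: `select_nms` is a hypothetical helper, and it assumes the `lanms` package exposes `merge_quadrangle_n9(boxes, threshold)` as used in the linked EAST `eval.py`, with `boxes` being a NumPy array. When `lanms` is unavailable, the existing Python function is kept as the fallback.

```python
import importlib


def select_nms(python_nms):
    """Pick the fastest available locality-aware NMS implementation.

    python_nms: the existing pure-Python function, used as the fallback.
    Returns (callable, backend_name); the callable takes (boxes, nms_thresh).
    """
    try:
        # lanms is a compiled C++ extension. Assumption: a failed build can
        # surface as RuntimeError rather than ImportError, so both are caught.
        lanms = importlib.import_module("lanms")
    except (ImportError, RuntimeError):
        return python_nms, "python"

    def cpp_nms(boxes, nms_thresh):
        # EAST's eval.py casts the NumPy box array to float32 before the call.
        return lanms.merge_quadrangle_n9(boxes.astype("float32"), nms_thresh)

    return cpp_nms, "lanms"
```

In the post-processing code, the call to the Python NMS would then go through the returned function, for example `nms, backend = select_nms(nms_locality)` followed by `boxes = nms(boxes, nms_thresh)`, where `nms_locality` stands for whatever the existing Python function is called.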

NextGuido commented 4 years ago

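@MissPenguin Sorry for the late reply. One more question: if I run prediction with the current locality_aware_nms, is it also very slow? I tried predicting on AI Studio and it averages 2 s per image. Is something wrong with my prediction setup? Below is my prediction command. Because the test images differ in size (from a few hundred up to 2000 pixels), I set test_batch_size_per_card=1:

python3 tools/infer_det.py -c configs/det/det_r50_vd_east.yml -o Global.checkpoints="./output/det_east/iter_epoch_298" TestReader.infer_img="/home/aistudio/data/data18344/icpr_mtwi_task2/image_test/"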

MissPenguin commented 4 years ago

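Your prediction setup should be fine. The locality_aware_nms in the current code really is slow, and with larger images an average of 2 s is plausible. The C++ locality_aware_nms is tens of times faster than the Python version in the code; you can swap it in yourself, and it is easy to do. Also, if speed matters, consider DB, which is considerably faster than EAST.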
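To confirm that the NMS step, rather than the network forward pass, accounts for most of the 2 s, each stage can be timed separately. The helper below is a generic sketch; `mean_seconds` is a hypothetical name and nothing in it is specific to PaddleOCR.

```python
import time


def mean_seconds(fn, *args, repeats=5):
    """Return the average wall-clock time of fn(*args) over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats
```

Calling it once with the Python NMS and once with the C++ version, on the same boxes and threshold, gives a direct comparison of the two.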

NextGuido commented 4 years ago

@MissPenguin Thanks for the answers. One last thing: does this project support TensorBoard-style visualization? Reading the raw logs is rather inconvenient.

MissPenguin commented 4 years ago

Yes, you can use VisualDL. See the documentation for usage: https://paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/evaluation_debugging/debug/visualdl_usage.html
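As a rough illustration of what logging to VisualDL involves, the sketch below writes a series of scalar values to a log directory. It assumes the VisualDL 2.x `LogWriter` API (`add_scalar(tag, step, value)`); the linked documentation may describe a different version. `log_scalars` and the tag name are hypothetical, and the function does nothing when `visualdl` is not installed.

```python
def log_scalars(records, logdir, tag="train/loss"):
    """Write (step, value) pairs to a VisualDL log directory.

    Returns False without writing anything when visualdl is not installed.
    """
    try:
        from visualdl import LogWriter
    except ImportError:
        return False
    with LogWriter(logdir=logdir) as writer:
        for step, value in records:
            writer.add_scalar(tag=tag, step=step, value=value)
    return True
```

The resulting curves can then be viewed by running `visualdl --logdir <logdir>` and opening the address it prints.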

NextGuido commented 4 years ago

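@MissPenguin Thank you very much for your help. With the C++ version of locality_aware_nms it is now blazingly fast. Great project. I hope PaddleOCR will ship with VisualDL integrated in the future.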

tairen99 commented 3 years ago

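@NextGuido @MissPenguin, hello!

I have run into the same problem: the EAST network takes far too long in the validation stage. It got through only 3 images in 5 hours.

Could you tell me how you replaced locality_aware_nms with lanms.merge_quadrangle_n9? Did you use a Python 3.5 environment? In my Python 3.7 environment, installing lanms fails with the following error: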

Collecting lanms
Using cached lanms-1.0.2.tar.gz (973 kB)
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3.7 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-q8ojgoyq/lanms_fc4fbac9a9e145aea0b8ff5b07882526/setup.py'"'"'; __file__='"'"'/tmp/pip-install-q8ojgoyq/lanms_fc4fbac9a9e145aea0b8ff5b07882526/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-ekcz5qzv
cwd: /tmp/pip-install-q8ojgoyq/lanms_fc4fbac9a9e145aea0b8ff5b07882526/
Complete output (21 lines):
make: Entering directory '/tmp/pip-install-q8ojgoyq/lanms_fc4fbac9a9e145aea0b8ff5b07882526'
make: python3-config: Command not found
make: python3-config: Command not found
g++ -o lanms/adaptor.so -I include -std=c++11 -O3 adaptor.cpp include/clipper/clipper.cpp --shared -fPIC
In file included from include/pybind11/pytypes.h:12,
                 from include/pybind11/cast.h:13,
                 from include/pybind11/attr.h:13,
                 from include/pybind11/pybind11.h:43,
                 from adaptor.cpp:1:
include/pybind11/common.h:100:10: fatal error: Python.h: No such file or directory
 #include <Python.h>
          ^~~~~~~~~~
compilation terminated.
Makefile:10: recipe for target 'lanms/adaptor.so' failed
make: *** [lanms/adaptor.so] Error 1
make: Leaving directory '/tmp/pip-install-q8ojgoyq/lanms_fc4fbac9a9e145aea0b8ff5b07882526'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-install-q8ojgoyq/lanms_fc4fbac9a9e145aea0b8ff5b07882526/setup.py", line 28, in <module>
    raise RuntimeError('Cannot compile lanms in the directory: {}'.format(BASE_DIR))
RuntimeError: Cannot compile lanms in the directory: /tmp/pip-install-q8ojgoyq/lanms_fc4fbac9a9e145aea0b8ff5b07882526

I don't know how to solve this. Could you share how you did it? Thanks!
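The two key lines in the log above are `python3-config: Command not found` and `Python.h: No such file or directory`, which suggests the Python development headers are missing for the interpreter running pip, rather than a Python 3.5 versus 3.7 incompatibility. The check below is a small sketch for confirming that before retrying the install; `check_build_prereqs` is a hypothetical helper. On Debian or Ubuntu the headers usually come from a package such as `python3.7-dev`, but that is an assumption about the poster's system.

```python
import os
import shutil
import sysconfig


def check_build_prereqs():
    """Report whether the tools the lanms Makefile invokes can be found."""
    # Directory where this interpreter expects its C headers to live.
    include_dir = sysconfig.get_paths()["include"]
    return {
        "python3-config": shutil.which("python3-config") is not None,
        "Python.h": os.path.isfile(os.path.join(include_dir, "Python.h")),
        "g++": shutil.which("g++") is not None,
    }


if __name__ == "__main__":
    for name, found in check_build_prereqs().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it with the same interpreter that pip uses (here `/usr/bin/python3.7`). If any entry is reported missing, installing it and rerunning `pip install lanms` is the first thing to try.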