PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
43.91k stars 7.8k forks

Error reading /home/PaddleOCR/ppocr/utils/ppocr_keys_v1.txt #864

Closed moronism189 closed 4 years ago

moronism189 commented 4 years ago

    λ fa83723faf1f /tf/ocr python3 test4PaddleOcr.py
    grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
    Namespace(cls=False, cls_batch_num=30, cls_image_shape='3, 48, 192', cls_model_dir='/root/.paddleocr/cls', cls_thresh=0.9, det=True, det_algorithm='DB', det_db_box_thresh=0.5, det_db_thresh=0.3, det_db_unclip_ratio=2.0, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_max_side_len=960, det_model_dir='/root/.paddleocr/det', enable_mkldnn=False, gpu_mem=2000, image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN', rec_batch_num=30, rec_char_dict_path='./ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='/root/.paddleocr/rec/ch', use_angle_cls=False, use_gpu=True, use_space_char=True, use_tensorrt=False, use_zero_copy_run=False)
    character_dict_path: /home/PaddleOCR/ppocr/utils/ppocr_keys_v1.txt
    Traceback (most recent call last):
      File "test4PaddleOcr.py", line 8, in <module>
        ocr = PaddleOCR(gpu_mem=2000) # need to run only once to download and load model into memory
      File "/home/PaddleOCR/paddleocr.py", line 222, in __init__
        super().__init__(postprocess_params)
      File "/home/PaddleOCR/tools/infer/predict_system.py", line 42, in __init__
        self.text_recognizer = predict_rec.TextRecognizer(args)
      File "/home/PaddleOCR/tools/infer/predict_rec.py", line 61, in __init__
        self.char_ops = CharacterOps(char_ops_params)
      File "/home/PaddleOCR/ppocr/utils/character.py", line 47, in __init__
        with open(character_dict_path, "rb") as fin:
    TypeError: invalid file: PosixPath('/home/PaddleOCR/ppocr/utils/ppocr_keys_v1.txt')

The test program test4PaddleOcr.py is as follows:

    from paddleocr import PaddleOCR
    ocr = PaddleOCR(gpu_mem=2000) # need to run only once to download and load model into memory
    img_path = 'image/brand.jpg'
    result = ocr.ocr(img_path, det=False)
    for line in result:
        print(line)

WenmuZhou commented 4 years ago

Upgrade paddleocr.

moronism189 commented 4 years ago

"Upgrade paddleocr."

I untarred the latest release, 1.1.0 ...

    λ 337789eb7aed /tf/ocr python3 test4PaddleOcr.py
    grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
    Namespace(cls=False, cls_batch_num=30, cls_image_shape='3, 48, 192', cls_model_dir='/root/.paddleocr/cls', cls_thresh=0.9, det=True, det_algorithm='DB', det_db_box_thresh=0.5, det_db_thresh=0.3, det_db_unclip_ratio=2.0, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_max_side_len=960, det_model_dir='/root/.paddleocr/det', enable_mkldnn=False, gpu_mem=2000, image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN', rec_batch_num=30, rec_char_dict_path='./ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='/root/.paddleocr/rec/ch', use_angle_cls=False, use_gpu=True, use_space_char=True, use_tensorrt=False, use_zero_copy_run=False)
    Traceback (most recent call last):
      File "test4PaddleOcr.py", line 8, in <module>
        ocr = PaddleOCR(gpu_mem=2000) # Chinese display test~~~~need to run only once to download and load model into memory
      File "/tf/ocr/PaddleOCR-1.1.0/paddleocr.py", line 222, in __init__
        super().__init__(postprocess_params)
      File "/tf/ocr/PaddleOCR-1.1.0/tools/infer/predict_system.py", line 41, in __init__
        self.text_detector = predict_det.TextDetector(args)
      File "/tf/ocr/PaddleOCR-1.1.0/tools/infer/predict_det.py", line 77, in __init__
        if args.use_pdserving is False:
    AttributeError: 'Namespace' object has no attribute 'use_pdserving'

moronism189 commented 4 years ago


After adding the use_pdserving parameter ... the original error came back.

    ocr = PaddleOCR(gpu_mem=2000, use_pdserving=False) # Chinese display test~~~~need to run only once to download and load model into memory

      File "/tf/ocr/PaddleOCR-1.1.0/paddleocr.py", line 222, in __init__
        super().__init__(postprocess_params)
      File "/tf/ocr/PaddleOCR-1.1.0/tools/infer/predict_system.py", line 42, in __init__
        self.text_recognizer = predict_rec.TextRecognizer(args)
      File "/tf/ocr/PaddleOCR-1.1.0/tools/infer/predict_rec.py", line 61, in __init__
        self.char_ops = CharacterOps(char_ops_params)
      File "/tf/ocr/PaddleOCR-1.1.0/ppocr/utils/character.py", line 46, in __init__
        with open(character_dict_path, "rb") as fin:
    TypeError: invalid file: PosixPath('/tf/ocr/PaddleOCR-1.1.0/ppocr/utils/ppocr_keys_v1.txt')

    λ 337789eb7aed /tf/ocr head /tf/ocr/PaddleOCR-1.1.0/ppocr/utils/ppocr_keys_v1.txt
    '
    疗
    绚
    诚
    娇
    溜
    题
    贿
    者
    廖
    λ 337789eb
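For readers who hit the same AttributeError on `use_pdserving`: the message says the parsed-arguments object simply has no such field. A minimal sketch of the difference between direct attribute access and a tolerant lookup follows. The `args` object here is a hypothetical stand-in, not PaddleOCR's real argument set; only the attribute name `use_pdserving` comes from the traceback.

```python
from argparse import Namespace

# Hypothetical stand-in for the parsed arguments (assumption for illustration).
args = Namespace(use_gpu=True)

# Direct attribute access raises AttributeError when the field is absent.
try:
    args.use_pdserving
    missing = False
except AttributeError:
    missing = True

# getattr with a default falls back to False instead of raising.
use_pdserving = getattr(args, "use_pdserving", False)
```

Passing `use_pdserving=False` explicitly, as done above, is a workaround on the caller's side. The `getattr` form is one way library code could tolerate an older argument set.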

moronism189 commented 4 years ago

Solved.

I changed line 46 of PaddleOCR-1.1.0/ppocr/utils/character.py to wrap the path in str():

        self.character_str = ""
        with open(str(character_dict_path), "rb") as fin:
            lines = fin.readlines()
            for line in lines:
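Why the `str(...)` wrapper helps: the built-in `open()` accepts `pathlib.Path` objects only from Python 3.6 onward (PEP 519), so the `TypeError: invalid file: PosixPath(...)` above most likely means the script ran on an older interpreter. Below is a minimal sketch of a loader that works with either a `Path` or a plain string. The function name `load_character_dict`, the file name `keys.txt`, and the decoding and stripping steps are assumptions for illustration, since the snippet above is cut off after the `for` line.

```python
import tempfile
from pathlib import Path


def load_character_dict(character_dict_path):
    """Concatenate the entries of a one-entry-per-line dictionary file.

    Hypothetical helper: only the str(...) wrapping mirrors the fix above.
    """
    character_str = ""
    # str() turns a pathlib.Path into a plain string, which open() accepts on
    # every Python 3 version; a value that is already a str is unchanged.
    with open(str(character_dict_path), "rb") as fin:
        for line in fin.readlines():
            # Assumed handling: decode as UTF-8 and drop line endings.
            character_str += line.decode("utf-8").strip("\r\n")
    return character_str


# Usage: the same call works for a Path and for a plain string.
with tempfile.TemporaryDirectory() as tmp_dir:
    dict_path = Path(tmp_dir) / "keys.txt"
    dict_path.write_bytes("疗\n绚\n诚\n".encode("utf-8"))
    from_path = load_character_dict(dict_path)
    from_str = load_character_dict(str(dict_path))
```

Converting at the `open()` call keeps the change to a single line and leaves callers free to pass either type.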