open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox
https://mmocr.readthedocs.io/en/dev-1.x/
Apache License 2.0
4.37k stars 755 forks

OCR for Chinese model: No such file or directory: 'data/chineseocr/labels/dict_printed_chinese_english_digits.txt' #718

Closed liuke97 closed 2 years ago

liuke97 commented 2 years ago

Here is my command:

python mmocr/utils/ocr.py demo/test.jpg --det PS_IC15 --recog SAR_CN --imshow --output demo/result.jpg

It fails with:

FileNotFoundError: SARNet: AttnConvertor: [Errno 2] No such file or directory: 'data/chineseocr/labels/dict_printed_chinese_english_digits.txt'

I have read the documentation for the SAR recognition model. Under 'Results and Models' your team provides 'model', 'log', and 'dict' links for download, but only the model can actually be downloaded. 'dict_printed_chinese_english_digits.txt' opens in the browser as garbled characters and cannot be downloaded. Also, does this mean I can use the model for Chinese OCR without training it myself? I look forward to your answer. Thank you very much.

cuhk-hbsun commented 2 years ago
  1. You can download the file by clicking the dict link directly and saving the page as a .txt file.
  2. Yes, you can use this model to recognize Chinese characters. Please give it a try.
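
The path in the error message is relative, so Python resolves it against the directory the command is run from. A minimal sketch of copying the saved dict file to the location the config expects, assuming the command is launched from the MMOCR repository root (`install_dict`, `downloaded`, and `repo_root` are hypothetical names used for illustration, not part of MMOCR):

```python
import shutil
from pathlib import Path

# Relative path taken verbatim from the FileNotFoundError message.
EXPECTED = Path("data/chineseocr/labels/dict_printed_chinese_english_digits.txt")


def install_dict(downloaded, repo_root="."):
    """Copy the downloaded dict file to the path the config looks for.

    `downloaded` is where the file was saved; `repo_root` is the directory
    the OCR command is run from (assumed to be the MMOCR checkout).
    """
    target = Path(repo_root) / EXPECTED
    # Create data/chineseocr/labels if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(downloaded, target)
    return target
```

Running the original ocr.py command from `repo_root` afterwards should then find the file, provided the working directory really is that directory.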
liuke97 commented 2 years ago

Thank you for your reply. I have successfully recognized Chinese characters and now want to evaluate the metrics.

My command:

python tools/test.py configs/textrecog/sar/sar_r31_parallel_decoder_chinese.py sar_r31_parallel_decoder_chineseocr_20210507-b4be8214.pth --eval acc

Error:

AssertionError: UniformConcatDataset: OCRDataset: HardDiskLoader: data/chineseocr/labels/test.txt is not exist

I went to https://github.com/chineseocr/chineseocr and found the dataset link https://nathan6.diskstation.me:5001/fsdownload/uT32hAjbx/ocr-data%20(github.com-chineseocr), but I could not find 'test.txt' there. I would be grateful if you could tell me how to get it. Thank you!

cuhk-hbsun commented 2 years ago

You can prepare your custom dataset as a label.txt file plus an imgs directory, following the layout of tests/data/ocr_toy_dataset. For example, if you have data/some_dataset/imgs and data/some_dataset/label.txt, replace the image path line in the config with data/some_dataset/imgs and the label path line with data/some_dataset/label.txt.
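
A minimal sketch of writing such a label file, assuming one sample per line in the form "filename text" separated by a single space, as in the label.txt under tests/data/ocr_toy_dataset. The helper name `write_label_file` is hypothetical and not part of MMOCR:

```python
from pathlib import Path


def write_label_file(samples, dataset_dir):
    """Write <dataset_dir>/label.txt with one "<image filename> <text>" per line.

    `samples` is an iterable of (filename, text) pairs. A FileNotFoundError
    is raised if an image is missing from <dataset_dir>/imgs, so path
    mistakes surface before evaluation starts.
    """
    dataset_dir = Path(dataset_dir)
    lines = []
    for filename, text in samples:
        image_path = dataset_dir / "imgs" / filename
        if not image_path.is_file():
            raise FileNotFoundError(image_path)
        lines.append(f"{filename} {text}")
    label_path = dataset_dir / "label.txt"
    label_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return label_path
```

For example, `write_label_file([("0001.jpg", "你好")], "data/some_dataset")` would produce data/some_dataset/label.txt, given that data/some_dataset/imgs/0001.jpg exists.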

4power commented 2 years ago

@liuke97 @cuhk-hbsun, I ran into the same path problem. Could you please tell me what the right path is? I have tried "/data/chineseocr/labels", "/mmocr/data/chineseocr/labels", and many other paths, but they all throw the same exception: "No such file or directory: 'data/chineseocr/labels/dict_printed_chinese_english_digits.txt'". I have downloaded the dict file, but I don't know where to place it. By the way, I use model serving, not ocr.py. Every time I change the path, I reconvert the model from MMOCR to TorchServe.

974187271 commented 2 years ago

Hi, have you fixed that problem? I ran into the same issue when trying to make a prediction from Docker.
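
Because the path in the error is relative, where the file has to go depends on the working directory of the process that loads the model, and under TorchServe or Docker that may not be the repository root. A small diagnostic sketch that reports the absolute location a relative path resolves to (the helper `locate` is hypothetical and not part of MMOCR or TorchServe):

```python
import os
from pathlib import Path


def locate(relative_path, cwd=None):
    """Return (absolute path that would be opened, whether the file exists).

    `cwd` defaults to the current working directory of this process,
    which is what a relative path is resolved against.
    """
    base = Path(cwd) if cwd is not None else Path(os.getcwd())
    resolved = (base / relative_path).resolve()
    return resolved, resolved.is_file()


if __name__ == "__main__":
    path, found = locate(
        "data/chineseocr/labels/dict_printed_chinese_english_digits.txt")
    print(f"looked for: {path} (exists: {found})")
```

Calling `locate` from inside the serving process, rather than from a separate shell, would show the directory the dict file is actually expected in.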