breezedeus / CnOCR

CnOCR: Awesome Chinese/English OCR Python toolkit based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. (A Chinese/English OCR Python package based on PyTorch/MXNet.)
https://www.breezedeus.com/article/cnocr
Apache License 2.0

ocr error at specific image with specific model #206

Open safecat opened 2 years ago

safecat commented 2 years ago

Image example: 06_notice_landscape_watermarked

Tested on the demo page at Hugging Face:

(default)det:ch_PP-OCRv3_det,onnx rec:densenet_lite_136-fc,onnx result:ok

det:db_mobilenet_v3_small,pytorch rec:densenet_lite_136-fc,onnx result:[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Conv node. Name:'199_nchwc' Status Message: Invalid input shape: {4,0}

det:db_shufflenet_v2_small,pytorch rec:densenet_lite_136-fc,onnx result:[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Conv node. Name:'199_nchwc' Status Message: Invalid input shape: {4,0}

det:db_shufflenet_v2_small,pytorch rec:densenet_lite_136-gru,pytorch result:Given input size: (72x8x1). Calculated output size: (72x4x0). Output size is too small

...

It looks like whenever I don't use the PaddlePaddle detection model, something goes wrong with the output of a convolutional layer.
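The sizes in the last error message can be reproduced with plain arithmetic. The sketch below assumes the failing layer is a 2x2 pooling layer with stride 2 and no padding; the thread does not confirm this, but it is consistent with the reported sizes (72x8x1) and (72x4x0). The function name `pool_output_size` is hypothetical.

```python
def pool_output_size(size: int, kernel: int = 2, stride: int = 2, padding: int = 0) -> int:
    """Output length of a pooling/conv window along one axis (dilation 1, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


# Feature map reported in the error: height 8, width 1 (channels are unaffected).
height, width = 8, 1

# ASSUMPTION: 2x2 window, stride 2, no padding.
print(pool_output_size(height), pool_output_size(width))  # prints: 4 0
```

A width of 1 cannot hold a window of width 2, so the computed output width is 0. That would suggest the recognition model received a crop that is only a few pixels wide, which points at the boxes produced by the detection step rather than at the recognition model itself.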

Thanks for all the great work!

breezedeus commented 2 years ago

Yes, it doesn't work for some special types of images, such as images with 4 channels. To work around the problem, you can resave your image in a more common format. When I downloaded your image and put it into the demo, it worked normally. [image]
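One way to resave an image in a more common format is sketched below, assuming Pillow is installed. The helper name `to_rgb_jpeg_bytes` and the choice of a white background are illustrative assumptions, not part of CnOCR.

```python
from io import BytesIO

from PIL import Image


def to_rgb_jpeg_bytes(img: Image.Image, quality: int = 95) -> bytes:
    """Drop any alpha channel and re-encode the image as a 3-channel JPEG."""
    if img.mode in ("RGBA", "LA"):
        # Flatten transparency onto a white background (an arbitrary choice).
        rgba = img.convert("RGBA")
        rgb = Image.new("RGB", rgba.size, (255, 255, 255))
        rgb.paste(rgba, mask=rgba.split()[3])
    else:
        rgb = img.convert("RGB")
    buf = BytesIO()
    rgb.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()


# Example with a synthetic 4-channel image instead of a real file.
rgba_img = Image.new("RGBA", (32, 16), (255, 0, 0, 128))
resaved = Image.open(BytesIO(to_rgb_jpeg_bytes(rgba_img)))
print(resaved.mode, resaved.size)
```

For a file on disk, the same helper could be applied to `Image.open(path)` and the returned bytes written to a new `.jpg` file.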

safecat commented 2 years ago

Thanks for the explanation!

safecat commented 2 years ago

Sorry, but it turns out my image is a standard RGB JPEG file with no 4th channel. I also tried to reproduce the error with the file downloaded from this issue page, and the error still occurs. I uploaded a screen recording to YouTube; here is the link: https://www.youtube.com/watch?v=LBgjDB3iaS4 . The error occurs at 15s, and numpy and the file info show that it is a three-channel image at 45s and 59s. The error disappears when I rotate the image to portrait (in exactly the same format), so I assume the error is related to image size.
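For anyone who wants to run the same check without extra dependencies, here is a minimal sketch that reads the width, height and channel count straight from a JPEG header using only the standard library. The function name `jpeg_frame_info` is hypothetical, and the parser only handles the common case of a well-formed file.

```python
import struct

# Start-of-frame markers carry the image dimensions.
# 0xC4, 0xC8 and 0xCC fall in the same range but are not SOF markers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_frame_info(data: bytes):
    """Return (width, height, channels) from the first SOF segment of a JPEG."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:  # markers without a length field
            pos += 2
            continue
        if marker in SOF_MARKERS:
            # Layout after the 2-byte length: precision, height, width, components.
            _precision, height, width, channels = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10]
            )
            return width, height, channels
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length  # the length field counts itself but not the marker
    raise ValueError("no SOF segment found")
```

Usage would look like `jpeg_frame_info(open(path, "rb").read())`, where `path` is a placeholder for the image file. A landscape RGB JPEG returns a width larger than its height and a channel count of 3.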

breezedeus commented 2 years ago

Thanks for the details. The error probably occurs because the det model you chose can't process images rotated by 90 degrees.