raoyutian / PaddleOCRSharp

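PaddleOCRSharp is a .NET OCR library built by modifying and wrapping the C++ code of Baidu PaddlePaddle's PaddleOCR. It provides text recognition, text detection, and table recognition. The project is optimized for cases where small images are recognized inaccurately, which improves recognition accuracy over the original PaddleOCR code. It includes an ultra-lightweight Chinese OCR whose models total only 8.6M; a single model supports recognition of mixed Chinese, English, and digits, vertical text, and long text. Multiple text detection modes are also supported.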
Apache License 2.0

Not Detect the Space #22

Closed kmuthugtk closed 1 year ago

kmuthugtk commented 1 year ago

Hi,

I have an image with spaces between the characters, but the spaces are ignored in the OCR output. Kindly help me fix this issue.

Thank you

raoyutian commented 1 year ago

Hi, you can try resizing the image and running the OCR again.

If your image width or height is greater than 2000, set the following in C# code:

//In v3.0.0 the property is max_side_len

OCRParameter.max_side_len = 2000;

//If you use another version:

OCRParameter.MaxSideLen = 2000;

In C++ code the parameter is "max_side_len".

Then try again.
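
As a rough illustration of what capping the longest side at 2000 implies for the image dimensions, here is a minimal sketch. `SideLimit.Fit` is a hypothetical helper written for this example, not part of PaddleOCRSharp, and the library's internal resizing may differ (for example in how it rounds).

```csharp
using System;

static class SideLimit
{
    // Hypothetical helper (not a PaddleOCRSharp API): returns the dimensions
    // after scaling so that the longer side does not exceed maxSideLen.
    // Images already within the limit are returned unchanged.
    public static (int Width, int Height) Fit(int width, int height, int maxSideLen)
    {
        int longest = Math.Max(width, height);
        if (longest <= maxSideLen)
        {
            return (width, height);
        }

        // Same ratio for both sides, so the aspect ratio is preserved.
        double ratio = (double)maxSideLen / longest;
        return ((int)Math.Round(width * ratio), (int)Math.Round(height * ratio));
    }
}
```

For example, `SideLimit.Fit(4000, 1000, 2000)` gives a 2000 x 250 target size, while an 800 x 600 image is left as it is.
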

Thank you

kmuthugtk commented 1 year ago

Hi,

I tried that, but the result is still the same. I have attached the image for your reference: ocr_1_original

Expected output: 99702000 292.0087 817618.000 2534 4B10P

Where there is a space between two characters, the text should be split into a separate line.

Kindly help me solve this.

Thank you

raoyutian commented 1 year ago

You can try using the en_V3 model.

en_V3 model

1681787951381

Example in C#:

    // Locate the en_v3 model folder next to the PaddleOCRSharp assembly.
    OCRModelConfig config = new OCRModelConfig();
    string root = System.IO.Path.GetDirectoryName(typeof(OCRModelConfig).Assembly.Location);
    string modelPathroot = root + @"\en_v3";
    // Detection, direction classifier, and recognition model directories.
    config.det_infer = modelPathroot + @"\en_PP-OCRv3_det_infer";
    config.cls_infer = modelPathroot + @"\ch_ppocr_mobile_v2.0_cls_infer";
    config.rec_infer = modelPathroot + @"\en_PP-OCRv3_rec_infer";
    // Character dictionary used by the recognition model.
    config.keys = modelPathroot + @"\en_dict.txt";

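
Before assigning these paths, it can help to confirm that the model files are actually on disk. The following is a minimal sketch under stated assumptions: `ModelPaths.FindMissing` is a hypothetical helper, not part of PaddleOCRSharp, and it assumes the en_v3 folder sits directly under the base directory you pass in. It uses only `System.IO` and builds paths with `Path.Combine` rather than string concatenation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ModelPaths
{
    // Hypothetical helper (not a PaddleOCRSharp API): lists the expected
    // en_v3 model directories and dictionary file that are missing on disk.
    public static List<string> FindMissing(string baseDir)
    {
        string modelRoot = Path.Combine(baseDir, "en_v3");
        string[] modelDirs =
        {
            "en_PP-OCRv3_det_infer",
            "ch_ppocr_mobile_v2.0_cls_infer",
            "en_PP-OCRv3_rec_infer",
        };

        var missing = new List<string>();
        foreach (string dir in modelDirs)
        {
            string path = Path.Combine(modelRoot, dir);
            if (!Directory.Exists(path))
            {
                missing.Add(path);
            }
        }

        // The dictionary is a single file, so check it with File.Exists.
        string keysPath = Path.Combine(modelRoot, "en_dict.txt");
        if (!File.Exists(keysPath))
        {
            missing.Add(keysPath);
        }

        return missing;
    }
}
```

A call such as `ModelPaths.FindMissing(AppContext.BaseDirectory)` returns an empty list when everything is in place. Using `AppContext.BaseDirectory` as the base directory is an assumption made for this example; the snippet above uses the assembly location instead.
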
kmuthugtk commented 1 year ago

Hi,

I cannot find a folder named @"\en_v3" or a file named @"\en_dict.txt".

image

raoyutian commented 1 year ago

Download URL: https://gitee.com/raoyutian/paddle-ocrsharp/tree/dev/models/PP-OCRv3 (download en_v3.zip from there).

kmuthugtk commented 1 year ago

Thank you, that was helpful.

Also, if I have a flipped image, is it possible to get the correct OCR text?

ocr_1_original

raoyutian commented 1 year ago

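Hello. The project does not enable the orientation classification parameters by default. For images that may be rotated, or text at a rotated angle, you can set the following parameters:

OCRParameter.cls = true;
OCRParameter.use_angle_cls = true;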

1681815970032(1)

The link below has detailed documentation on how to use the parameters: https://gitee.com/raoyutian/paddle-ocrsharp/blob/master/doc/UseInCsharp.md

kmuthugtk commented 1 year ago

That's a great solution.

I am using your library in an ASP.NET Core Web API project. Could you please help me with this path issue?

image

image

kmuthugtk commented 1 year ago

Hi,

I am not able to use the library in an ASP.NET Core Web API project: it cannot initialise the en_v3 directory. In the Production environment it fails with the error "Value cannot be null".

Kindly help me solve this issue.

Thank you