Numbers are not detect well in psenet_r50_fpnf_600e_icdar2017

open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox

https://mmocr.readthedocs.io/en/dev-1.x/

Apache License 2.0

Numbers are not detect well in psenet_r50_fpnf_600e_icdar2017 #274

Open kbrajwani opened 3 years ago

kbrajwani commented 3 years ago

PSENet works well on alphabetic text, but on floating-point numbers it splits the number at commas (,) and decimal points (.). For example, 1,000,00.00 is detected as [ "1" , ",000" , ",00" , "00" ], so a decimal point or comma is sometimes missed. It also sometimes misses the leading 1 in 1,000,00.00, so we only get ,000,00.00. Can you tell me how to resolve this issue?

cuhk-hbsun commented 3 years ago

Can you provide the test image? We can use it to analyze the issue. By the way, the best solution is probably to train a new model on your own dataset instead of using the one trained on icdar2017, since most annotated boxes in icdar2017 are words rather than sentences.

kbrajwani commented 3 years ago

Hey, I will try to train a model. I want to fine-tune psenet_r50_fpnf_600e_icdar2017 on 200 images. Do you think that is enough data for fine-tuning?

I have prepared the data according to https://mmocr.readthedocs.io/en/latest/datasets.html. Then I run the command ./tools/dist_train.sh configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2017.py /content/psenet 1, but I think it starts training from scratch rather than fine-tuning. Can you tell me where I can give the model path to start fine-tuning?

Here is a sample of my data: [image: SECTION C_0-01]

cuhk-hbsun commented 3 years ago

I think 200 images is OK for fine-tuning the model.
Add load_from = "/path/to/pretrained_checkpoint.pth" to the config file psenet_r50_fpnf_600e_icdar2017.py to load the pretrained model.

kbrajwani commented 3 years ago

Hi @cuhk-hbsun, can you tell me whether psenet_r50_fpnf_600e_icdar2017 was trained on synthtext? As far as I know it was first trained on icdar2017 and then icdar2015; was it also trained on synthtext? I am asking because I trained it on my own images but it does not perform well. I created the annotations with Google document OCR, which splits words: a date like 21/10/1997 is returned as [21, / , 10 , / , 1997], so I can't give correct annotations to the PSENet training. Thanks
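
One possible way to repair such split annotations before training is to merge horizontally adjacent tokens on the same line into a single box. The sketch below is only an illustration under stated assumptions: the token format (text plus an axis-aligned `(x_min, y_min, x_max, y_max)` box), the function name `merge_line_tokens`, and the `max_gap_ratio` threshold are all hypothetical and not part of MMOCR or of any OCR API. It also assumes the tokens passed in already belong to one text line.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Token = Tuple[str, Box]


def merge_line_tokens(tokens: List[Token], max_gap_ratio: float = 0.3) -> List[Token]:
    """Merge tokens of ONE text line whose horizontal gap is small.

    Two neighbouring tokens are joined when the gap between them is at most
    max_gap_ratio times the taller of the two boxes (hypothetical heuristic).
    """
    merged: List[Token] = []
    # Process tokens left to right.
    for text, (x0, y0, x1, y1) in sorted(tokens, key=lambda t: t[1][0]):
        if merged:
            prev_text, (px0, py0, px1, py1) = merged[-1]
            height = max(py1 - py0, y1 - y0)
            gap = x0 - px1
            if gap <= max_gap_ratio * height:
                # Extend the previous box so it also covers the current token.
                merged[-1] = (
                    prev_text + text,
                    (px0, min(py0, y0), max(px1, x1), max(py1, y1)),
                )
                continue
        merged.append((text, (x0, y0, x1, y1)))
    return merged


# Made-up boxes for the date example, followed by a distant word.
tokens = [
    ("21", (10, 5, 30, 25)),
    ("/", (31, 5, 36, 25)),
    ("10", (37, 5, 57, 25)),
    ("/", (58, 5, 63, 25)),
    ("1997", (64, 5, 104, 25)),
    ("Total", (140, 5, 200, 25)),
]
result = merge_line_tokens(tokens)
```

With these made-up boxes the five date fragments collapse into one `21/10/1997` box, while `Total` stays separate because its gap exceeds the threshold. The threshold would need tuning on real data.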

innerlee commented 3 years ago

Please check out the model zoo page. Each model has its training configs.

kbrajwani commented 3 years ago

https://mmocr.readthedocs.io/en/latest/textdet_models.html#psenet Yes, I have checked this page, but it doesn't mention the synthtext dataset; that's why I asked.

innerlee commented 3 years ago

The page should contain the full reproducible information. If not, then we will fix it.

kbrajwani commented 3 years ago

Hi, I have finished training on my images. Numbers are detected now, but single-character words, two-character words, and other small words in the images are not detected. Can you tell me which parameters I can change to detect small text? Thanks

kbrajwani commented 3 years ago

One more issue: the robust_scanner model misclassifies / as 1. Also, if a number contains .000, the . is classified as C and the result is c000. Can you tell me how to fix this as well?

innerlee commented 3 years ago

The first thing to try is to check the data. Make sure there is a sufficiently large number of high-quality training samples for the bad cases.
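
One simple way to check the data for the recognition cases mentioned above is to count how often the problematic characters appear in the training labels. This is only a sketch: the helper name `char_coverage` and the example labels are made up, and loading the labels from an actual annotation file is left out.

```python
from collections import Counter
from typing import Dict, Iterable


def char_coverage(labels: Iterable[str], chars_of_interest: str = "/.,") -> Dict[str, int]:
    """Count how many times each character of interest occurs in the labels."""
    counts = Counter(ch for label in labels for ch in label)
    return {ch: counts[ch] for ch in chars_of_interest}


# Made-up recognition labels standing in for a real training set.
labels = ["21/10/1997", "1,000.00", "TOTAL", "0.000"]
coverage = char_coverage(labels)
```

Characters with very low counts relative to the size of the dataset are likely candidates for the misclassifications described above.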

kbrajwani commented 3 years ago

Hey, can you tell me how to convert the synthtext dataset for text detection? It is listed at https://github.com/open-mmlab/mmocr/blob/main/docs/datasets.md, but the conversion steps are missing. So how can I train PSENet on synthtext?

kbrajwani commented 3 years ago

@innerlee, can you please guide me on synthtext training for text detection, and tell me which config I can use to train PSENet on synthtext?

innerlee commented 3 years ago

@cuhk-hbsun

kbrajwani commented 3 years ago

@cuhk-hbsun

Hey, can you tell me how to convert the synthtext dataset for text detection? It is listed at https://github.com/open-mmlab/mmocr/blob/main/docs/datasets.md, but the conversion steps are missing. So how can I train PSENet on synthtext?

Can you please guide me on synthtext training for text detection, and tell me which config I can use to train PSENet on synthtext?

Sasidev90 commented 3 years ago

Hi, single-character numbers are not detected properly with 'psenet_r50_fpnf_600e_icdar2017.pth', so I am unable to extract the numeric values. I have attached a sample image for your reference. Thanks in advance. [image: C0831M_Feb2020_1_2_vv]