JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
https://www.jaided.ai
Apache License 2.0
24.41k stars, 3.16k forks

Unable to detect English (Latin) letters on Indian number plates. #259

Closed: developermastermind07 closed this issue 4 years ago

developermastermind07 commented 4 years ago

I want to use EasyOCR to read vehicle number plates, but it fails to recognize them correctly. If possible, could you please guide me on how to train the model with an additional dataset? Any help would be appreciated.

image1 EasyOCR output: 'KA03|1||9993' Expected output: 'KA03MN9993'

image2 EasyOCR output: 'KA03|1$7979' Expected output: 'KA03MS7979'

image3 EasyOCR output: 'R 26B6' Expected output: 'HR 26 BG 0383'

image4 EasyOCR output: 'B6?+ 3+5li0' Expected output: 'WB62 B 5170'

developermastermind07 commented 4 years ago

@EasyOCR team, I have a number plate dataset. Can you help me add my dataset to the EasyOCR model? Thanks for your help.

muhk01 commented 4 years ago

To avoid unnecessary characters, just pass the allowlist argument; it will ignore characters that can never appear on these plates. You could also adjust the text threshold and the recognition width (per character/sentence):

reader.readtext(image_in, detail=0, allowlist='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', width_ths=1.5, text_threshold=0.8)
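The call above can be wrapped into a small, self-contained script. This is only a sketch: the helper name `read_plate`, the constant `PLATE_ALLOWLIST`, and the choice of the English model are assumptions made for illustration, while the `readtext` arguments are the ones suggested in this comment.

```python
import string

# Digits and uppercase Latin letters: the only characters expected on these plates.
PLATE_ALLOWLIST = string.digits + string.ascii_uppercase


def read_plate(image_path, languages=("en",)):
    """Run EasyOCR on one image, restricted to plate characters.

    Hypothetical helper; the thresholds are the values suggested in the
    thread, not tuned defaults.
    """
    import easyocr  # imported lazily; requires `pip install easyocr`

    reader = easyocr.Reader(list(languages))
    return reader.readtext(
        image_path,
        detail=0,                   # return only the recognized strings
        allowlist=PLATE_ALLOWLIST,  # drop characters such as '|' and '$'
        width_ths=1.5,
        text_threshold=0.8,
    )
```

When processing many images, it is probably better to create the `Reader` once and reuse it, since loading the model is the slow part.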

developermastermind07 commented 4 years ago

I ran it with your recommendation, and the results improved.

Image 2 is still wrong. EasyOCR output: KA0311S7979 Expected output: KA03MS7979

I think image 3 is slightly tilted, which is why EasyOCR cannot detect all the characters. It now gives: ['R26B6', '0383'] Expected output: HR26BG0383 or HR26BG 0383. It does not detect the H and cannot distinguish G from 6.

Image 1 now gives the correct result.

Image 4 still does not match. EasyOCR output: IIB635160 Expected output: WB62 B 5170. Image 4 can be ignored; this type of plate is not supported for now.

Any help would be appreciated. Can I add these images to a training dataset and retrain the Latin model?
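Some of the remaining errors, such as G being read as 6, can sometimes be repaired after recognition by using the position of each character on the plate. The following is a minimal sketch, assuming a 10-character plate laid out as two letters, two digits, two letters, four digits (as in KA03MN9993 and HR26BG0383). The layout, the substitution tables, and all function names are illustrative assumptions and not part of EasyOCR. It cannot recover a character that was never detected, such as the missing H.

```python
import re

# Assumed layout for a 10-character plate: L = letter position, D = digit position.
PLATE_LAYOUT = "LLDDLLDDDD"
# Looser check that also accepts one to three series letters (e.g. WB62B5170).
PLATE_PATTERN = re.compile(r"^[A-Z]{2}[0-9]{2}[A-Z]{1,3}[0-9]{4}$")

# Assumed look-alike pairs; extend these for the fonts you actually see.
TO_LETTER = {"0": "O", "1": "I", "5": "S", "6": "G", "8": "B"}
TO_DIGIT = {"O": "0", "I": "1", "S": "5", "G": "6", "B": "8"}


def join_fragments(fragments):
    """Join OCR fragments, uppercase them, and drop non-alphanumeric characters."""
    text = "".join(fragments).upper()
    return re.sub(r"[^A-Z0-9]", "", text)


def fix_confusables(text):
    """Swap look-alike characters that sit in the wrong kind of position."""
    if len(text) != len(PLATE_LAYOUT):
        return text  # layout unknown for this length; leave unchanged
    fixed = []
    for ch, kind in zip(text, PLATE_LAYOUT):
        if kind == "L" and ch.isdigit():
            fixed.append(TO_LETTER.get(ch, ch))
        elif kind == "D" and ch.isalpha():
            fixed.append(TO_DIGIT.get(ch, ch))
        else:
            fixed.append(ch)
    return "".join(fixed)


def looks_like_plate(text):
    """Return True if the text matches the assumed plate pattern."""
    return PLATE_PATTERN.match(text) is not None
```

For example, `fix_confusables("HR26B60383")` returns `"HR26BG0383"`, while `join_fragments(['R26B6', '0383'])` gives `"R26B60383"`, which `looks_like_plate` rejects, so such a result can at least be flagged for review.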

developermastermind07 commented 4 years ago

@muhk01 Can you please help me here?

muhk01 commented 4 years ago

Each model has different accuracy depending on the fonts it was trained with. You could try a different model, such as the Russian one, reader = easyocr.Reader(['ru']), and see whether its output differs.
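One way to try this suggestion is to run the same image through several language models and compare the outputs side by side. The sketch below assumes a hypothetical helper named `compare_models`; the `make_reader` parameter is also an assumption, added so the loop can be exercised without downloading any model.

```python
def compare_models(image, language_sets, make_reader=None, **readtext_kwargs):
    """Run the same image through several readers and collect the outputs.

    Hypothetical helper. `language_sets` is a list of language-code lists,
    for example [["en"], ["ru"]]. Extra keyword arguments are passed on to
    `readtext` (for example `allowlist` or `text_threshold`).
    """
    if make_reader is None:
        import easyocr  # imported lazily; requires `pip install easyocr`

        make_reader = easyocr.Reader

    results = {}
    for langs in language_sets:
        reader = make_reader(list(langs))
        # detail=0 returns only the recognized strings
        results[tuple(langs)] = reader.readtext(image, detail=0, **readtext_kwargs)
    return results
```

A call such as `compare_models("plate.jpg", [["en"], ["ru"]])` returns a dictionary keyed by language tuple, which makes it easy to see which model reads a given plate best; `plate.jpg` is a placeholder file name.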

developermastermind07 commented 4 years ago

Thank you @muhk01 for your help. The OCR still has trouble detecting M, H, N, and Q.

muhk01 commented 4 years ago

I don't know, but in my opinion you cannot expect perfect recognition from someone else's trained model. The fonts in its training dataset are probably quite different from the font you want to recognize. You may want to contact the repository owner and ask how to retrain with your own dataset.

developermastermind07 commented 4 years ago

@rkcosmos @korakot Can you please guide me? How can I train on a custom dataset using EasyOCR? Any help would be appreciated.

rkcosmos commented 4 years ago

Our training pipeline is not open-sourced yet. Please look at the repos in the reference section of the README file.