Open Shreeshrii opened 5 years ago
* ssdPicture1 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! CO2
* ssdPicture1 LANG ssd_alphanum_plus TESSDATA tessdata_ssd OEM 1 PSM 6 C02
* ssdPicture2 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! E09
* ssdPicture2 LANG ssd_alphanum_plus TESSDATA tessdata_ssd OEM 1 PSM 6 83
* ssdPicture3 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 171171 171 D0
* ssdPicture3 LANG ssd_alphanum_plus TESSDATA tessdata_ssd OEM 1 PSM 6 0 0 0 1
Same image as above, but blurred, converted to greyscale, and then converted to black and white to remove the gaps.
* ssdPicture3-bw LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 888
* ssdPicture3-bw LANG ssd_alphanum_plus TESSDATA tessdata_ssd OEM 1 PSM 6 888
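To illustrate why the blur-then-threshold step helps, here is a minimal pure-Python sketch on a toy image. It is not the tool used for ssdPicture3-bw (the thread does not say which one was used). The function names, the 3x3 mean blur, and the cutoff of 128 are all assumptions chosen for illustration; on real photos an image library would do the same steps.

```python
def to_grey(rgb_rows):
    """Convert rows of (r, g, b) tuples to 0-255 grey levels (BT.601 weights)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in rgb_rows]


def box_blur(grey):
    """3x3 mean blur; pixels outside the image repeat the nearest edge pixel."""
    height, width = len(grey), len(grey[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += grey[yy][xx]
            row.append(total // 9)
        blurred.append(row)
    return blurred


def threshold(grey, cutoff=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[0 if p < cutoff else 255 for p in row] for row in grey]


def close_gaps(rgb_rows, cutoff=128):
    """Greyscale, blur, then threshold (hypothetical helper, assumed order)."""
    return threshold(box_blur(to_grey(rgb_rows)), cutoff)
```

On a strip with two dark segments separated by a one-pixel light gap, thresholding alone keeps the gap, while `close_gaps` merges the two segments into one stroke. The trade-off is that strokes also get thicker, so the blur size needs tuning per image.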
Thank you for this great work. Can you show how I can test it on my dataset? Which file should I download, and how do I use it? Thank you @Shreeshrii
wget https://github.com/Shreeshrii/tessdata_ssd/raw/master/7seg.traineddata
Copy the traineddata file to your tessdata-dir (where other traineddata files are).
Check with
tesseract --list-langs
Use with
-l 7seg
The same steps apply to the other traineddata files.
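If you want to run the same test from a script, the following is a rough sketch that wraps the tesseract command line with the settings used in the results in this thread (OEM 1, PSM 6). The helper names `build_command` and `recognise` are hypothetical; the flags `--tessdata-dir`, `-l`, `--oem` and `--psm` are standard tesseract options, and `stdout` as the output base prints the text instead of writing a file.

```python
import subprocess


def build_command(image, lang, tessdata_dir, oem=1, psm=6):
    """Return the tesseract argument list for one image."""
    return [
        "tesseract", image, "stdout",
        "--tessdata-dir", tessdata_dir,
        "-l", lang,
        "--oem", str(oem),
        "--psm", str(psm),
    ]


def recognise(image, lang, tessdata_dir):
    """Run tesseract (must be installed) and return the recognised text."""
    result = subprocess.run(
        build_command(image, lang, tessdata_dir),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

For example, `recognise("my_display.png", "7seg", "./tessdata_ssd")` would try the 7seg model on one image; the file and directory names here are placeholders for your own.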
Thank you @Shreeshrii, you've done great work. How can I train the model with my own dataset? I have a large dataset of multimeter seven-segment displays, but it's not annotated. Could you help me with the training procedure and the tools that I need? I have seen https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 but it's not very clear to me. Thanks,
See https://github.com/Shreeshrii/tessdata_ssd and https://github.com/Shreeshrii/tessdata_ssd/blob/master/finetune.sh
Modify the script and run it with your own training text.
For training with images, see https://github.com/OCR-D/ocrd-train
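Since the dataset is not annotated, training from images needs a transcription for every line image first. As far as I recall from the ocrd-train README, each image is paired with a plain-text file of the same name with the image extension replaced by `.gt.txt`; please verify this against the current README before relying on it. Under that assumption, this small sketch lists the images that still need a transcription. The function name and the folder argument are hypothetical, and only plain `.tif` and `.png` names are handled.

```python
from pathlib import Path

IMAGE_SUFFIXES = (".tif", ".png")


def missing_transcriptions(ground_truth_dir):
    """Return names of line images that have no matching .gt.txt file."""
    folder = Path(ground_truth_dir)
    missing = []
    for image in sorted(folder.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Assumed naming convention: foo.png is paired with foo.gt.txt
        if not image.with_suffix(".gt.txt").exists():
            missing.append(image.name)
    return missing
```

Running it on the ground-truth folder before training gives a to-do list of images that still have to be transcribed by hand.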
A test on a small sample of real-life images gives better results with the older 7seg.traineddata. Unfortunately, I have deleted the source training_text for it. Customizing the training text and fonts to match the requirements, as well as preprocessing the images to reduce the gaps, will lead to better recognition.
* ssd202 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 2.02
* ssd202 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 1 2.02
* ssd1 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 22.0
* ssd1 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 22.12
* ssd2 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 29: L. I1.0
* ssd2 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 29 11.0
* ssd3 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 4:05:30
* ssd3 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 801
* ssd4 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 10.5°
* ssd4 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 10.5°
* ssd5 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 4:05:30
* ssd5 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 4:05:30
* ssd6 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 10.5°
* ssd6 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 10.5°
* ssd7 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 29:
* ssd7 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 11
* ssd8 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 05:54:09
* ssd8 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 05:54:09
* ssd9 LANG 7seg TESSDATA tessdata_ssd OEM 1 PSM 6 Failed to load any lstm-specific dictionaries for lang 7seg!! 7:45
* ssd9 LANG ssd TESSDATA tessdata_ssd OEM 1 PSM 6 7:45
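To compare the two models image by image, the result lines above can be parsed into a small table. This is only a sketch that assumes the line layout used in this thread (`* image LANG ... TESSDATA ... OEM ... PSM ...` followed by the output); the function names are hypothetical. It strips the dictionary warning so that only the recognised text remains, and it does not judge which reading is correct, since the true values are not recorded here.

```python
import re

LINE = re.compile(
    r"^\*\s+(?P<image>\S+)\s+LANG\s+(?P<lang>\S+)\s+TESSDATA\s+(?P<tessdata>\S+)"
    r"\s+OEM\s+(?P<oem>\d+)\s+PSM\s+(?P<psm>\d+)\s*(?P<rest>.*)$"
)
WARNING = re.compile(
    r"Failed to load any lstm-specific dictionaries for lang \S+!!\s*"
)


def parse_result(line):
    """Split one result line into its settings and the recognised text."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Drop the warning message so only the OCR output is left.
    record["text"] = WARNING.sub("", record.pop("rest")).strip()
    return record


def side_by_side(lines):
    """Map image name -> {model name: recognised text}."""
    table = {}
    for line in lines:
        record = parse_result(line)
        if record:
            table.setdefault(record["image"], {})[record["lang"]] = record["text"]
    return table
```

Feeding the ssd1 lines through `side_by_side` gives one entry for `ssd1` with the `7seg` and `ssd` outputs next to each other, which makes the differences between the two models easier to spot.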