Shreeshrii / tessdata_shreetest

finetuned traineddata files for tesseract 4.0.0 for testing

How to use this from js? #6

Open alaa137 opened 5 years ago

alaa137 commented 5 years ago

I have 2 questions...

  1. How can I use these files from JavaScript? I've been working with files with .gz extensions...
  2. Is there a trained file that supports floating-point numbers and a colon (as in a time, e.g. 4:25)?

Thanks

Shreeshrii commented 5 years ago

Tesseract now supports compressed traineddata files. You could try compressing the traineddata file in .gz format and using that. I have not personally tried that option.

I have created different versions of the digits traineddata as an experiment, some with digits and period, others with more punctuation characters. One (or more) of those should have support for numbers and colon.

My suggestion is to first try the different traineddata files from the command line with the images that you need to OCR. Once you figure out which works best, gzip it and then use it from .js.
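
For anyone who wants to script those two steps from Node, here is a rough sketch in TypeScript. It assumes the `tesseract` executable is on your PATH and that the candidate traineddata files sit in one directory. The function names and the model names in the usage note are made up for illustration. Whether your JavaScript OCR wrapper accepts the resulting `.gz` file is also an assumption you would need to check against its documentation.

```typescript
import { execFileSync } from "node:child_process";
import { readFileSync, writeFileSync } from "node:fs";
import { gzipSync } from "node:zlib";

// Build the CLI arguments for one run:
//   tesseract <image> stdout --tessdata-dir <dir> -l <lang>
// <lang> is the traineddata file name without the ".traineddata" suffix.
export function tesseractArgs(image: string, tessdataDir: string, lang: string): string[] {
  return [image, "stdout", "--tessdata-dir", tessdataDir, "-l", lang];
}

// Run every candidate model on the same image and collect the recognized
// text, so the outputs can be compared side by side.
// Requires the tesseract executable to be installed and on PATH.
export function compareModels(
  image: string,
  tessdataDir: string,
  langs: string[],
): Record<string, string> {
  const results: Record<string, string> = {};
  for (const lang of langs) {
    results[lang] = execFileSync("tesseract", tesseractArgs(image, tessdataDir, lang), {
      encoding: "utf8",
    });
  }
  return results;
}

// Gzip the chosen traineddata file next to the original and return the
// path of the compressed copy (<path>.gz). The original file is left as-is.
export function gzipTraineddata(path: string): string {
  const outPath = `${path}.gz`;
  writeFileSync(outPath, gzipSync(readFileSync(path)));
  return outPath;
}
```

Usage would look something like `compareModels("sample.png", "./tessdata", ["modelA", "modelB"])` followed by `gzipTraineddata("./tessdata/modelA.traineddata")`, where `sample.png`, `modelA`, and `modelB` are placeholders for your own image and whichever traineddata files you are testing.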