DerJantob / TSW2_Controller

Control TSW2 with a joystick or other controllers

Possible system resources waste related to Tesseract OCR #37

Closed asdf1280 closed 2 years ago

asdf1280 commented 2 years ago

        public static string GetText(Bitmap imgsource)
        {
            var ocrtext = string.Empty;
            using (var engine = new TesseractEngine(@"./tessdata", "deu", EngineMode.Default))
            {
                //engine.SetVariable("load_system_dawg", true);
                //engine.SetVariable("language_model_penalty_non_dict_word", 1);
                //engine.SetVariable("language_model_penalty_non_freq_dict_word", 1);
                using (var img = PixConverter.ToPix(imgsource))
                {
                    using (var page = engine.Process(img))
                    {
                        ocrtext = page.GetText();
                    }
                }
            }
            ocrtext = ocrtext.Replace(",", ".");
            return ocrtext;
        }

A new TesseractEngine is created every time this function runs. The TesseractEngine constructor does all the setup needed for text recognition, so repeating it for every screen read wastes time and system resources. Consider initializing the TesseractEngine once at startup and keeping it in a class member variable.
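
A minimal sketch of that suggestion, reusing only the Tesseract calls already present in the function above. The class name `OcrReader` and the fields `engine` and `engineLock` are hypothetical and not taken from the repository; the lock assumes that `GetText` might be called from more than one thread, which may not be the case here.

```csharp
using System;
using System.Drawing;
using Tesseract;

// Hypothetical wrapper class; the name is an assumption for illustration.
public static class OcrReader
{
    // The engine is constructed once, on first use, and reused afterwards.
    private static readonly Lazy<TesseractEngine> engine =
        new Lazy<TesseractEngine>(
            () => new TesseractEngine(@"./tessdata", "deu", EngineMode.Default));

    // Assumption: callers may be on different threads, so access to the
    // shared engine is serialized. Drop the lock if that never happens.
    private static readonly object engineLock = new object();

    public static string GetText(Bitmap imgsource)
    {
        lock (engineLock)
        {
            // Only the per-image objects are created and disposed per call.
            using (var img = PixConverter.ToPix(imgsource))
            using (var page = engine.Value.Process(img))
            {
                return page.GetText().Replace(",", ".");
            }
        }
    }
}
```

In this sketch the engine lives until the process exits. If it should be released earlier, the owning class would need to call `Dispose()` on it at shutdown.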

DerJantob commented 2 years ago

True, that would really be a way to speed up text recognition. I measured the time and it took 381 ms to initialize Tesseract. I'll have to see how to implement your suggestion.
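
How the 381 ms figure was obtained is not stated in the thread. One way to take such a measurement is with `System.Diagnostics.Stopwatch`; the helper `Timing.MeasureMs` below is a hypothetical name used only for illustration.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; not part of the repository.
public static class Timing
{
    // Runs the action once and returns the elapsed wall-clock time
    // in whole milliseconds.
    public static long MeasureMs(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Passing a lambda that constructs and disposes a TesseractEngine would time the initialization on its own, separately from the recognition step.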

DerJantob commented 2 years ago

I have changed it in the "Tesseract_improvement" branch and I must say it is a big improvement in speed. The recognition is much faster now. Thank you for pointing it out!

DerJantob commented 2 years ago

Is it ok if I mention your username in the changelog for this improvement?

asdf1280 commented 2 years ago

fine