wolfmanstout / screen-ocr

Easily perform OCR on portions of the screen, choosing from a selection of backends.
Apache License 2.0

Define to recognize only latin characters/words #12

Closed roman-k-tech closed 1 year ago

roman-k-tech commented 1 year ago

Hello. How can I restrict recognition to Latin characters only? Currently I sometimes get a mix of Latin and Cyrillic letters. I need some way to specify that the screenshot contains only Latin letters.

Additionally, I'd like to exclude numbers, but the first question is the primary one. I'm using WinRT to recognize screenshots, and the question is specific to WinRT (Tesseract gives worse results and is more complex, so it's not an option).

wolfmanstout commented 1 year ago

Here's where I set up the WinRT OCR Recognizer: https://github.com/wolfmanstout/screen-ocr/blob/14ae902e06004ac83617048c8a19aca26f0aba0f/screen_ocr/_winrt.py#L27

This uses the languages set up in your user profile. If you want to use a specific language/alphabet, you'd need to edit the code here and use this API instead to construct the recognizer: https://learn.microsoft.com/en-us/uwp/api/windows.media.ocr.ocrengine.trycreatefromlanguage?view=winrt-22621#windows-media-ocr-ocrengine-trycreatefromlanguage(windows-globalization-language)

You can get the list of available Languages on your device here: https://learn.microsoft.com/en-us/uwp/api/windows.media.ocr.ocrengine.availablerecognizerlanguages?view=winrt-22621#windows-media-ocr-ocrengine-availablerecognizerlanguages

If you want to contribute this upstream, you could add an optional "language: str" argument to the WinRtBackend constructor, then iterate through available languages to find the match.
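A minimal sketch of the matching step described above, assuming each available language object exposes a `language_tag` string attribute. The helper name `find_language` and the stand-in objects are hypothetical and not part of screen-ocr or WinRT; real language objects would come from the `availablerecognizerlanguages` API linked above.

```python
from types import SimpleNamespace


def find_language(available_languages, language_tag):
    """Return the first language whose tag matches, or None if none does."""
    wanted = language_tag.casefold()
    for language in available_languages:
        # BCP-47 language tags are case-insensitive, so compare folded forms.
        if language.language_tag.casefold() == wanted:
            return language
    return None


# Stand-in objects for illustration only (assumption: real objects
# have a `language_tag` attribute like these do).
fake_languages = [
    SimpleNamespace(language_tag="ru"),
    SimpleNamespace(language_tag="en-US"),
]

match = find_language(fake_languages, "en-us")
print(match.language_tag if match else "no match")
```

Returning `None` on a miss leaves the caller free to fall back to the user-profile languages or to raise an error.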

roman-k-tech commented 1 year ago

Thank you!

Your info helped. I managed to create a single-language recognizer. Here's the code I came up with:

        engine = None
        for language in ocr.OcrEngine.get_available_recognizer_languages():
            if language.language_tag == 'en-US':
                engine = ocr.OcrEngine.try_create_from_language(language)
                break

        # engine = ocr.OcrEngine.try_create_from_user_profile_languages()

I'll think about committing it.

roman-k-tech commented 1 year ago

OK, I'm trying to create a branch in order to open a pull request, but it seems I have no permissions. Here is the code I want to add to the '_init_winrt' method of the 'WinRtBackend' class:

        engine = None
        if language_tag is None:
            engine = ocr.OcrEngine.try_create_from_user_profile_languages()
        else:
            for language in ocr.OcrEngine.get_available_recognizer_languages():
                if language.language_tag == language_tag:
                    engine = ocr.OcrEngine.try_create_from_language(language)
                    break

The 'language_tag' parameter is passed down from the 'create_reader' method to the 'WinRtBackend' class, then to the thread executor, and finally to the '_init_winrt' method.
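One detail the snippet above leaves open is what happens when no available language matches the tag: `engine` stays `None`. The sketch below shows one way to handle that, under stated assumptions. The function `create_engine` and the `FakeOcrEngine` stand-in are hypothetical and exist only so the logic can run without WinRT; the three method names on the stand-in mirror the calls used in the snippets in this thread.

```python
from types import SimpleNamespace


def create_engine(ocr_engine_cls, language_tag=None):
    """Pick an engine for language_tag, or use the user-profile languages."""
    if language_tag is None:
        return ocr_engine_cls.try_create_from_user_profile_languages()
    for language in ocr_engine_cls.get_available_recognizer_languages():
        if language.language_tag == language_tag:
            engine = ocr_engine_cls.try_create_from_language(language)
            if engine is not None:
                return engine
            break
    # Fail loudly instead of silently carrying a None engine forward.
    raise ValueError(f"No OCR engine available for language tag {language_tag!r}")


class FakeOcrEngine:
    """Stand-in for the WinRT OcrEngine class (assumption, for illustration)."""

    @staticmethod
    def get_available_recognizer_languages():
        return [SimpleNamespace(language_tag="en-US")]

    @staticmethod
    def try_create_from_language(language):
        return f"engine:{language.language_tag}"

    @staticmethod
    def try_create_from_user_profile_languages():
        return "engine:profile"


print(create_engine(FakeOcrEngine))
print(create_engine(FakeOcrEngine, "en-US"))
```

Passing the engine class in as an argument is only a convenience for this sketch; in the real backend the class would presumably be referenced directly.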

wolfmanstout commented 1 year ago

Nice! What do you mean when you say the branch has no permissions? I do want to make sure my repo is pull request friendly.

If you end up blocked I can submit this when I find time, which would be as early as this weekend or later.

roman-k-tech commented 1 year ago

I tried to push my branch (in order to create a pull request from it) but got an error saying I have no permissions. Of course I don't mind if you do it yourself; the new code is quite short, so there should be only one way to implement it.

wolfmanstout commented 1 year ago

Ah, if you cloned my repo directly you need to fork it and switch to your fork:

https://gist.github.com/jagregory/710671

I would recommend following the instructions starting at line 16.

roman-k-tech commented 1 year ago

Done via the GitHub GUI. Please check.

wolfmanstout commented 1 year ago

Thanks for contributing this!