TheJoeFin / Text-Grab

Use OCR in Windows quickly and easily with Text Grab. With optional background process and notifications.
https://www.microsoft.com/en-us/p/text-grab/9mznkqj7sl0b?cid=TextGrabGitHub
MIT License
3.18k stars 218 forks

Adding multilanguage selection #418

Open vivadavid opened 7 months ago

vivadavid commented 7 months ago

Hello,

I'd like to suggest adding the possibility of selecting more than one OCR language.

Speaking from personal experience, the text in the vast majority of my images is in English or Spanish, so it'd save me time if I could keep both Spanish and English permanently selected, unless I know I'll be sticking to one language for a good while.

In many cases, switching from one language to another for each individual image might not be worth it, and this multilanguage support would prove even more useful when using the Extract Images from Files in Folder tool, as you may be dealing with many different images, each in a particular language.

I can also think of a scenario where a number of images contain multilingual text, but this might not happen frequently.

As I rarely use a language other than Spanish or English, this feature would be good enough for me, but another (complementary) approach to consider would be to add an automatic mode where the OCR engine detects the language. I'm not sure that I have seen this in Tesseract, but I saw it in Whisper the other day and I thought it was a good idea.

Anyway, these are my suggestions, hoping that you may take them into account for a future release.

Thank you for your time.

morozover commented 5 months ago

I need this feature too 👍

TheJoeFin commented 5 months ago

Just to lay out what is possible vs. what would be ideal: Tesseract could do multi-language, but the Windows OCR API cannot. So if this feature were implemented, it would only be possible during FullScreen Grab and batch processing through the Edit Text Window.
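
For reference, the Tesseract command line accepts several language codes joined with `+` in its `-l` option (for example `eng+spa`). Below is a minimal Python sketch of that, not how Text Grab itself calls Tesseract. The helper names `build_tesseract_command` and `ocr_image` are hypothetical, and it assumes the `tesseract` executable is on the PATH with the `eng` and `spa` language data installed.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_tesseract_command(image_path: str, languages: Sequence[str]) -> List[str]:
    """Build a Tesseract CLI call that loads several languages at once.

    Hypothetical helper for illustration only.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Tesseract takes multiple language codes joined with '+', e.g. "eng+spa".
    lang_arg = "+".join(languages)
    # Passing "stdout" as the output base makes Tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", lang_arg]


def ocr_image(image_path: str, languages: Sequence[str]) -> str:
    """Run Tesseract on one image and return the recognized text."""
    # Assumption: tesseract is installed separately and reachable via PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; "scan.png" is a placeholder file name.
    print(build_tesseract_command("scan.png", ["eng", "spa"]))
```

With both languages loaded in a single call, the same command can be reused across a folder of mixed English and Spanish images without switching languages per file.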

vivadavid commented 5 months ago

Hi! That would be perfect for me, and probably great for many users too.