TheJoeFin / Text-Grab

Use OCR in Windows quickly and easily with Text Grab. With optional background process and notifications.
https://www.microsoft.com/en-us/p/text-grab/9mznkqj7sl0b?cid=TextGrabGitHub
MIT License
3.18k stars, 218 forks

The output is being sorted randomly for the 'Extract Text from Images in Folder...' feature. #402

Closed (amymor closed 2 weeks ago)

amymor commented 9 months ago

Describe the bug

I want to extract text from a series of photos, and the order of the images is important to me. Let's say I have 1000 images named screenshot (1).png, screenshot (2).png, etc. The extracted text should follow this order by name, but it appears to be extracted in random order.

Where is the bug

To Reproduce

Steps to reproduce the behavior:

  1. Go to 'Edit Text Window'
  2. Click on 'Capture' > 'Extract Text from Images in Folder...'
  3. Select Folder
  4. See error

Screenshots

[Image: 0070]

Where did you get Text Grab?

Desktop (please complete the following information):

Additional context

I also tried the drag and drop method, but the result is the same. Thanks in advance.❤️

TheJoeFin commented 9 months ago

The process of extracting many images is multi-threaded to speed things up. Because of this, the threads return at different times depending on how long each OCR operation takes, so the results arrive out of order. The only ways to keep everything in order would be to:

  1. Force single-threaded operation (much slower when running OCR on a large batch).
  2. Add order tracking and insertion logic so results are returned as they finish but kept in order (complex and time-consuming to write).
  3. Do not return any results until the whole batch is complete (no intermediate feedback, so it might seem slower). This could be a good option when running OCR on something like the pages of a book.
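
For illustration only, here is a minimal sketch of how options 2 and 3 can be combined. Text Grab itself is not written in Python, and none of the names below come from its code base: `natural_key`, `fake_ocr`, and `ocr_in_order` are hypothetical, and `fake_ocr` merely stands in for a real OCR call. The idea is to sort the file names first, remember each file's position, run the work in parallel, and write each result into its slot as it finishes.

```python
import re
from concurrent.futures import ThreadPoolExecutor, as_completed


def natural_key(name):
    """Sort key so 'screenshot (2).png' sorts before 'screenshot (10).png'."""
    # re.split with a capture group alternates text and digit runs,
    # so ints are only ever compared with ints and strings with strings.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def fake_ocr(path):
    # Hypothetical stand-in for the real (slow, variable-time) OCR call.
    return f"text from {path}"


def ocr_in_order(paths, ocr=fake_ocr, max_workers=4):
    """Run OCR in parallel but return results in natural file-name order."""
    ordered = sorted(paths, key=natural_key)
    results = [None] * len(ordered)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the index of the file it belongs to.
        futures = {pool.submit(ocr, p): i for i, p in enumerate(ordered)}
        for future in as_completed(futures):
            # Threads finish in any order; the index restores the sequence.
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    files = ["screenshot (10).png", "screenshot (2).png", "screenshot (1).png"]
    for text in ocr_in_order(files):
        print(text)
```

Because each result lands in a pre-assigned slot, the parallel speed-up is kept and the output order no longer depends on which thread finishes first. Showing partial results while the batch runs would need extra logic, for example only displaying the contiguous run of slots that are already filled.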