danpla / dpscreenocr

Program to recognize text on screen
https://danpla.github.io/dpscreenocr/
zlib License

Enhancement Suggestions for dpScreenOCR #40

Open ALENZO777 opened 8 months ago

ALENZO777 commented 8 months ago

Dear danpla,

I hope this message finds you well. I am an avid user of dpScreenOCR, and I want to express my appreciation for the excellent work you've done on this application. I have been impressed by its speed and efficiency.

As I've been using dpScreenOCR, I've identified three features that I believe would significantly enhance the user experience:

  1. Auto Startup: It would be immensely beneficial to have an option to enable dpScreenOCR to launch automatically upon system startup. This feature would streamline its accessibility and make it readily available whenever the computer is powered on.

  2. Minimize to Tray: Another valuable addition would be the ability to keep dpScreenOCR minimized and in the system tray upon launch. This would contribute to a cleaner desktop environment while ensuring quick access to the application when needed.

  3. Adjustable Line Concatenation Option: Introducing a feature that allows users to enable or disable line concatenation upon copying text would significantly enhance the readability and organization of extracted content. When activated, this option would intelligently join lines of text, eliminating line breaks and presenting a seamless and cohesive view of the captured information.

Additionally, it seems that the "Split Text Blocks" option is not working. Your consideration of these enhancements, including investigating the "Split Text Blocks" issue, would be greatly appreciated. I believe these improvements will contribute to making dpScreenOCR an even more versatile and user-friendly tool.

Thank you for your time and dedication to improving this superb application. I look forward to seeing dpScreenOCR evolve with these potential enhancements.

danpla commented 8 months ago

Hello, ALENZO777

Thank you for your suggestions and kind words. I am glad you find the program useful.

Some of these features are already available, but require a little extra work to enable:

  1. There's no "Auto Startup" setting yet, but you can always add dpScreenOCR to autostart manually (for example, see the instructions for Windows 10: https://support.microsoft.com/en-us/windows/add-an-app-to-run-automatically-at-startup-in-windows-10-150da165-dcd9-7230-517b-cf3c295d89dd). In this case, you may also want to enable the ui_window_minimize_on_start option in the dpScreenOCR settings file (read the Tweaking section of the User Manual for details: https://danpla.github.io/dpscreenocr/manual.html#tweaking).
  2. "Minimize to tray" is already available as the ui_window_minimize_to_tray option in the settings file (see Tweaking). If dpScreenOCR is added to autostart and the above-mentioned ui_window_minimize_on_start option is also enabled, the program will be hidden in the system tray on system startup.

Of course, I understand that adding these features directly into the dpScreenOCR interface would be more user-friendly.
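
As a rough illustration, the two options named above might appear in settings.cfg as follows. This assumes one whitespace-separated key-value pair per line and `true` as the enabled value, which is an assumption rather than something stated in this thread; the Tweaking section of the User Manual is the authoritative reference for the exact syntax.

```
ui_window_minimize_on_start true
ui_window_minimize_to_tray true
```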

Smart line concatenation is a non-trivial task, as it requires the algorithm to understand the context of the text in order to perform a proper de-hyphenation. Using a spell checker dictionary for this task as in #23 showed reasonable results, but it's still not a perfect solution because it only works for a single language, while the text recognized by dpScreenOCR can be a mix of different languages.
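
To make the limitation concrete, here is a minimal Python sketch of dictionary-based de-hyphenation. It is not the code from #23 or anything in dpScreenOCR; the function name `join_lines` and the `known_words` set (standing in for a spell checker dictionary) are hypothetical. A trailing hyphen is dropped only when the joined word is in the dictionary, so the result is only as good as the single-language word list passed in.

```python
import re


def join_lines(text, known_words):
    """Join OCR lines into one line, de-hyphenating where the dictionary agrees.

    known_words is a hypothetical stand-in for a spell checker dictionary:
    a set of lowercase words for ONE language.
    """
    # Drop empty lines; this simple sketch also merges separate paragraphs.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return ""

    result = lines[0]
    for line in lines[1:]:
        head = re.search(r"([A-Za-z]+)-$", result)  # word fragment before a trailing hyphen
        tail = re.match(r"[A-Za-z]+", line)         # first word of the next line
        if head and tail:
            joined = head.group(1) + tail.group(0)
            if joined.lower() in known_words:
                # Line-break hyphen: remove it and glue the halves together.
                result = result[:-1] + line
            else:
                # Unknown word: keep the hyphen (e.g. "user-friendly").
                result = result + line
        else:
            result = result + " " + line
    return result


sample = "The recog-\nnized text is user-\nfriendly and\nclean."
print(join_lines(sample, {"recognized"}))
# prints "The recognized text is user-friendly and clean."
```

A word from any language missing from `known_words` keeps its hyphen, which is exactly the failure mode for mixed-language text.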

"Split Text Blocks" seems to work well: for example, try to recognize this two-column text image (https://upload.wikimedia.org/wikipedia/commons/0/03/Hz_Programm.png) in a single capture, and you will see that the text from the first column is followed by the text from the second column. The image in the Split text blocks section of the User Manual (https://danpla.github.io/dpscreenocr/manual.html#split-text-blocks) also shows how this option works.

ALENZO777 commented 8 months ago

Thank you for your prompt and detailed response. I've successfully configured the ui_window_minimize_on_start and ui_window_minimize_to_tray options using the settings.cfg file. This has enhanced the startup process, and I appreciate your guidance on these features.

Your explanation has clarified the functionality of the "Split Text Blocks" option. I was initially a bit confused about its purpose, but now I understand its effectiveness, especially in handling two-column text images.

Regarding the smart line concatenation, I understand the complexities involved in implementing such a feature, especially with multi-language text. Your efforts in creating and maintaining dpScreenOCR are commendable; even if certain features aren't added, the existing functionality is already incredibly helpful.

ALENZO777 commented 8 months ago

Additionally, if it is possible to add smart line concatenation for English only, please consider adding it. This would be very helpful.