dynobo / normcap

OCR powered screen-capture tool to capture information instead of images
https://dynobo.github.io/normcap/

Cannot download other languages. AppImage version #562

Closed jasiralavi closed 6 months ago

jasiralavi commented 7 months ago

What happened?

I recently started using the AppImage version of the software (since the Flatpak was having issues running) on GNOME 45 / Wayland / Ubuntu 23.10 (new install). The app works perfectly with the default language (EN). However, I'm unable to download other languages; I get the error below:

$ ./gearlever_normcap_fabada.appimage
10:28:56 - ERROR   - normcap.gui.downloader:42 - Exception 'The read operation timed out' during download of 'https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/ara.traineddata'
10:30:01 - ERROR   - normcap.gui.downloader:42 - Exception '<urlopen error [Errno 104] Connection reset by peer>' during download of 'https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/mal.traineddata'

How did you install NormCap?

AppImage (Linux)

Operating System + Version?

Ubuntu 23.10

[Linux only] Display Server (DS) + Desktop environment (DE)?

Wayland

Debug log output?*

$ ./gearlever_normcap_fabada.appimage -v debug

10:40:27 - INFO    - normcap:30 - Start NormCap v0.4.4
10:40:27 - DEBUG   - normcap:81 - Set XCURSOR_SIZE=24
10:40:27 - DEBUG   - normcap:86 - Set QT_QPA_PLATFORM=wayland
10:40:27 - DEBUG   - normcap.gui.tray:60 - System info:
{'cli_args': '/tmp/.mount_gearlehcYkEH/usr/app/normcap/__main__.py -v debug', 'is_briefcase_package': True, 'is_flatpak_package': False, 'platform': 'linux', 'pyside6_version': '6.5.1', 'qt_version': '6.5.1', 'qt_library_path': '/tmp/.mount_gearlehcYkEH/usr/app_packages/PySide6/Qt/plugins, /tmp/.mount_gearlehcYkEH/usr/python/bin', 'config_directory': PosixPath('/home/jasir/.config/normcap'), 'normcap_version': '0.4.4', 'ressources_path': PosixPath('/tmp/.mount_gearlehcYkEH/usr/app/normcap/resources'), 'tesseract_path': PosixPath('/tmp/.mount_gearlehcYkEH/usr/bin/tesseract'), 'tessdata_path': PosixPath('/home/jasir/.config/normcap/tessdata'), 'envs': {'TESSDATA_PREFIX': None, 'LD_LIBRARY_PATH': None}, 'desktop_environment': <DesktopEnvironment.GNOME: 1>, 'display_manager_is_wayland': True, 'screens': [Screen(is_primary=True, device_pixel_ratio=1.0, rect=Rect(left=0, top=0, right=1920, bottom=1080), index=0, screenshot=None)]}
10:40:27 - DEBUG   - normcap.gui.tray:342 - Listen on local socket v0.4.4-normcap.
10:40:27 - DEBUG   - normcap.gui.settings:128 - Skip update of non existing setting (cli_mode: False)
10:40:27 - DEBUG   - normcap.gui.settings:128 - Skip update of non existing setting (background_mode: False)
10:40:27 - DEBUG   - normcap.screengrab.utils:79 - Detected Gnome Version: 45.1
10:40:27 - DEBUG   - normcap.screengrab:37 - Select capture method DBUS portal
10:40:27 - DEBUG   - normcap.screengrab.dbus_portal:196 - Request screenshot with interactive=False
10:40:27 - DEBUG   - normcap.screengrab.dbus_portal:79 - Request accepted
10:40:28 - DEBUG   - normcap.ocr.tesseract:23 - Tesseract command output:
List of available languages in "/home/jasir/.config/normcap/tessdata/" (1):
eng
10:40:28 - DEBUG   - normcap.screengrab.dbus_portal:106 - Parse response
10:40:28 - DEBUG   - normcap.screengrab.utils:26 - Virtual geometry width: 1920
10:40:28 - DEBUG   - normcap.screengrab.utils:27 - Image width: 1920
10:40:28 - DEBUG   - normcap.screengrab.utils:28 - Resize ratio: 1.0
10:40:28 - DEBUG   - normcap.gui.utils:22 - Save debug image as /tmp/normcap/1701493828.5197804_raw_screen0.png
10:40:28 - DEBUG   - normcap.gui.window:131 - Create window for screen 0
10:40:28 - DEBUG   - normcap.gui.window:193 - Set window of screen 0 to fullscreen
10:40:28 - DEBUG   - normcap.gui.window:184 - Move window 0 to (left=0, top=0, right=1920, bottom=1080)
10:40:39 - DEBUG   - normcap.gui.tray:255 - Loading language manager...
10:40:45 - DEBUG   - normcap.gui.downloader:62 - Download https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/ara.traineddata
10:40:45 - DEBUG   - normcap.gui.downloader:33 - Fallback to ssl without verification
10:40:46 - ERROR   - normcap.gui.downloader:42 - Exception '<urlopen error [Errno 104] Connection reset by peer>' during download of 'https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/ara.traineddata'
10:41:06 - INFO    - normcap.gui.tray:506 - Exit normcap (esc button pressed)
10:41:06 - DEBUG   - normcap.gui.tray:507 - Debug images saved in /tmp/normcap
jasiralavi commented 7 months ago

Just thought I should mention a workaround that helps, until you fix this issue.

I'm assuming you have tesseract-ocr installed. Check or install it with: sudo apt install tesseract-ocr

Download the required language files using: sudo apt install tesseract-ocr-[lang-code]

For example, in my case the language is Malayalam and the code is mal, so I install it with: sudo apt install tesseract-ocr-mal

This creates the file mal.traineddata in the /usr/share/tesseract-ocr/5/tessdata/ folder.

Lang code list is here

You can then go to /usr/share/tesseract-ocr/5/tessdata/ and copy the language file you need to ~/.config/normcap/tessdata/. In my case: sudo cp /usr/share/tesseract-ocr/5/tessdata/mal.traineddata ~/.config/normcap/tessdata/

(or just use the file manager to do it)
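
The copy step can also be scripted. This is a minimal Python sketch, not part of NormCap: `copy_language` is a hypothetical helper name, and both default paths are simply the ones quoted in this comment, so they are assumptions that may not hold on other distributions or Tesseract versions.

```python
import shutil
from pathlib import Path

# Both defaults are the paths quoted in the comment above (assumptions); other
# distros or Tesseract versions may store traineddata files elsewhere.
SYSTEM_TESSDATA = Path("/usr/share/tesseract-ocr/5/tessdata")
NORMCAP_TESSDATA = Path.home() / ".config" / "normcap" / "tessdata"


def copy_language(lang, src_dir=SYSTEM_TESSDATA, dst_dir=NORMCAP_TESSDATA):
    """Copy <lang>.traineddata from src_dir to dst_dir and return the new path."""
    src = Path(src_dir) / f"{lang}.traineddata"
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; is tesseract-ocr-{lang} installed?")
    dst_dir = Path(dst_dir)
    # Create the target folder if NormCap has not created it yet.
    dst_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dst_dir / src.name))
```

Calling `copy_language("mal")` as a regular user should do the same as the cp command above. Because the target is inside the home directory, sudo is probably not needed for this step.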

dynobo commented 7 months ago

@jasiralavi, thanks for opening this as a separate issue, and thanks for providing a workaround right away :rocket:

However, I'd still love to investigate this a bit deeper, if you like, as I might be able to fix this for others, too! :slightly_smiling_face:

  1. I guess you can download the file https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/ara.traineddata via browser, correct? So there is no general connectivity issue, like a proxy or something?
  2. Do you use an application firewall that might block those requests?
  3. The timeout for the download is 30 seconds. It seems unlikely, but is there any chance that your internet connection was not fast enough to download the 1.4 MB file in time?
  4. Could you try to download an even smaller language, like Cherokee (chr, 0.3 MB)?

Thanks!
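
To check the same download outside of NormCap, something like the following Python sketch could be used. It is not NormCap's downloader code: `try_download` is a hypothetical helper that uses only the standard library, with the URL taken from the log above and the 30 second timeout mentioned in this comment. It does not reproduce the "Fallback to ssl without verification" step seen in the debug log.

```python
import time
import urllib.request

# URL copied from the error log above; other languages presumably follow the
# same pattern with a different language code (assumption).
ARA_URL = "https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/ara.traineddata"


def try_download(url, timeout=30.0):
    """Fetch url once and report the outcome instead of raising."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            size = len(response.read())
    except OSError as exc:
        # URLError, timeouts and connection resets are all OSError subclasses.
        return {"ok": False, "error": repr(exc), "seconds": time.monotonic() - start}
    return {"ok": True, "bytes": size, "seconds": time.monotonic() - start}
```

Running `print(try_download(ARA_URL))` on the affected machine would show whether a plain Python request hits the same timeout or connection reset, and how long it took to fail.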

jasiralavi commented 6 months ago

Sorry for the delay

  1. Yes. I didn't notice any connectivity issues. No proxies, VPN, etc.
  2. I'm not using any firewalls.
  3. No, I tried over office WiFi (~120 Mbps) as well as through my mobile hotspot connection.
  4. Same issue. The popup window appears immediately.

However, the good news is that I updated to the new version 0.5.2 and the problem seems to be fixed. It's working with both the AppImage and Flatpak 0.5.2 versions.

dynobo commented 6 months ago

Thanks for your response, @jasiralavi! Closing this issue for now, but feel free to re-open it if the problem occurs again! :slightly_smiling_face: