dynobo / normcap

OCR powered screen-capture tool to capture information instead of images
https://dynobo.github.io/normcap/

Reduce size for pre-build packages (.dmg, .exe, .appimage) #144

Open akram opened 2 years ago

akram commented 2 years ago

Hi team and thank you for your work,

It would be great if you could reduce the size of the dmg. 105 MB is quite big compared to similar solutions, which are between 1 MB and 20 MB. Maybe the difference is that OCR happens locally with normcap, while some other apps use web queries to do the job? It would also be good to explain why normcap is 100 MB+.

thanks

dynobo commented 2 years ago

Hey @akram, thanks for the great question. File size is always important to me!

Allow me to elaborate a bit on this, so people can better understand the challenges and provide proposals.

Note: Everything that follows applies only to the prebuilt packages of NormCap! (The Python package itself is really tiny, and its (large) dependencies can be shared with other applications if it is installed as a Python package.)

Comparing with size of other tools

Be careful to not compare apples with oranges. If you want to compare NormCap, please compare it with tools which share the following features:

  1. Is available for multiple platforms
  2. Works offline
  3. Uses an Open Source OCR engine
  4. Supports 6 frequently spoken languages OOTB

Every feature above makes the package larger. Features 1-3 are my choices as the developer and are not up for discussion. Feature 4 exists mostly for historic reasons and can easily be changed.

What makes the prebuilt packages large?

Here's an excerpt of the unpacked .dmg package (for Linux and Windows it's similar):

| Path | Size | Comment |
|---|---|---|
| /Contents/Resources/app/normcap/ressources/tessdata | 73 MB | 5 frequently spoken languages (+ German ;-)) |
| /Contents/Resources/app_packages/PySide2 | 48 MB | Qt5 Crossplatform GUI Framework |
| /Contents/MacOS | 12 MB | NormCap Binary (Python interpreter?) |
| /Contents/Resources/app_packages/lib* | 5 MB | Tesseract OCR + dependencies |
| /<everything else> | 11 MB | Various other dependencies, program code, etc. |
| Total | 150 MB | |
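
To make the proportions easier to see, here is a minimal Python sketch that turns the sizes from the table into percentage shares. The numbers are copied from the table; the labels and the helper name `shares` are only illustrative. Note that the rounded rows add up to 149 MB, slightly below the stated total of 150 MB.

```python
# Sizes in MB, copied from the table above (rounded values).
sizes_mb = {
    "tessdata (language files)": 73,
    "PySide2 (Qt5)": 48,
    "NormCap binary": 12,
    "Tesseract OCR + dependencies": 5,
    "everything else": 11,
}


def shares(sizes):
    """Return each component's share of the summed size, in percent."""
    total = sum(sizes.values())
    return {name: 100 * size / total for name, size in sizes.items()}


total_mb = sum(sizes_mb.values())

# Print the components from largest to smallest share.
for name, pct in sorted(shares(sizes_mb).items(), key=lambda kv: -kv[1]):
    print(f"{name:30s} {sizes_mb[name]:3d} MB  {pct:4.1f} %")
print(f"sum of rows: {total_mb} MB")
```

By this calculation the language files alone account for roughly half of the unpacked package, with PySide2 making up about another third.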

Let's see where we have options to tweak:

NormCap Binary + Tesseract OCR

There is not much room for improvement here, I guess. The OCR framework is crucial and simply needs some dependencies. The binary is the Python interpreter and is necessary to run NormCap.

PySide2 + everything else

I already spent quite some effort on stripping away unnecessary dependencies here (see #114 and below). There still might be some room for improvement, but it's probably minor. Suggestions are welcome! :-)

tessdata language files

This is obviously the easiest way to have a large impact on the package file size. The languages were included in an earlier version, long before users had the possibility to add additional languages themselves.

Today, it is possible to add languages on demand, but IMHO it is still not super trivial, and it has to be done by the user (if needed).
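
For illustration only, adding a language on demand could look roughly like the Python sketch below. It is not how NormCap does it: the download source (the `tessdata_fast` repository of the Tesseract project), the function names and the target folder are all assumptions made for this example.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: language files are taken from the tessdata_fast repository.
BASE_URL = "https://github.com/tesseract-ocr/tessdata_fast/raw/main"


def language_file(lang, tessdata_dir):
    """Return (download URL, local target path) for a language code like 'deu'."""
    name = f"{lang}.traineddata"
    return f"{BASE_URL}/{name}", Path(tessdata_dir) / name


def add_language(lang, tessdata_dir):
    """Download a language file into tessdata_dir unless it is already there."""
    url, target = language_file(lang, tessdata_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(target))  # needs network access
    return target
```

A call such as `add_language("deu", "/path/to/tessdata")` would fetch the German file once and skip the download on later runs; the folder in that call is a placeholder, not NormCap's actual location.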

This means we need to make a trade-off: better out-of-the-box user experience vs. file size.

Once upon a time I decided to lean towards user experience by adding 6 language files, which should cover what most people want. That is also more than most people need, but I considered it worth the size. But that's up for discussion, and I would love to hear your feedback on alternatives:

  1. Immediately switch to shipping English only, user has to add additional languages
  2. Keep it as it is for now, until an easier way to add additional languages via a GUI has been implemented (my preferred option)
  3. Always keep shipping the n most frequent languages (status quo)

What do you think? Do you have other ideas?

akram commented 2 years ago

Thanks @dynobo for the very detailed and justified explanations.

To make a reasonable and effective decision, I would say it is necessary to know how users use the application. On my side, the use is really sporadic, specifically when people send me screenshots of bank account numbers instead of a clean PDF. As this was happening quite often, I needed a solution like normcap to capture that, sparing me tedious reading and retyping while also being quite reliable. So, for users like me, multi-language support is not a must-have; even no language support at all would actually be sufficient.

There could for sure be other usages, like journalists/writers who need to capture text from screenshots, especially press communiqués, to report it as readable text.

I would tend to say that single-language support + languages on demand would be a good trade-off, but again, it would be necessary to understand usage first.

I am a bit concerned with file size these days because in some situations I lack an unlimited-bandwidth data plan and have to fall back on a paid data plan where every download counts. That forces fair use. And in some countries where this is the default, it is always convenient to have a lightweight solution.

I am sure we will find the best solution... and maybe that will contribute to the almost old-fashioned "Green-IT" concept.

dynobo commented 1 year ago

Linking #238 here, which led to a significant reduction, especially for macOS and Linux packages.