Open TheWorldEndsWithUs opened 1 month ago
The argument langPath is set to a directory (either local or a CDN) that Tesseract.js should use to automatically download the correct language data from. Blobs are individual files, so it would not make sense for langPath to accept blobs.
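To illustrate why a blob URL does not fit here, the sketch below shows roughly how a file URL is derived from a directory-style langPath. It is a simplified assumption about the loader's behavior, not the library's actual code: resolveLangUrl is a hypothetical helper, the example.com URLs are placeholders, and the trailing-slash handling is my own addition.

```javascript
// Hypothetical helper: approximates how a language-data URL is assumed to be
// built from a directory path, a language code, and the gzip option.
function resolveLangUrl(langPath, lang, gzip = true) {
  // Drop trailing slashes so the path joins cleanly (an assumption of this sketch).
  const base = langPath.replace(/\/+$/, '');
  return `${base}/${lang}.traineddata${gzip ? '.gz' : ''}`;
}

// A directory on a CDN yields a sensible file URL.
const cdnUrl = resolveLangUrl('https://example.com/lang-data', 'eng');

// A blob URL already points at a single file, so appending a file name
// produces a URL that does not correspond to any blob.
const blobUrl = resolveLangUrl(
  'blob:https://example.com/3f1c2a9e-0000-4000-8000-000000000000',
  'eng'
);

console.log(cdnUrl);
console.log(blobUrl);
```

Under this assumption, the request for the blob case goes to a URL with a file name appended, which would explain the failed load described in the issue.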
If you do not want Tesseract to automatically download the correct data from a directory, but rather want to manually write language data to the worker, follow the instructions provided in #794.
Edit: It looks like this question was answered in #794, however that was for an older version, and the answer may no longer be applicable. Would need to think about whether this is possible with the current interface.
I wouldn't mind using an older version as long as it supports word-level OCR and is mostly stable. If it is possible with the newest version I would prefer that, but beggars can't be choosers. Thanks for your help. I've run a bunch of experiments trying to hot-replace the code in the minified file with a blob link so that the data loads locally, but it didn't work.
The solution linked in #794 works with v4, but it no longer works in v5, which consolidated the createWorker, worker.initialize, and worker.loadLanguage functions. It should not be hard to add a feature to the current version that supports doing something similar; however, this will require an update.
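For context, the difference between the two interfaces can be sketched as follows. This is only an outline of the call sequence: the function names createEnglishWorkerV4 and createEnglishWorkerV5 are hypothetical, the Tesseract parameter stands for the imported tesseract.js module of the matching version, and the options used in #794 are left out.

```javascript
// v4 style: creating the worker, loading the language, and initializing
// were three separate steps, which left room to intervene between them.
async function createEnglishWorkerV4(Tesseract) {
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  return worker;
}

// v5 style: the language is passed to createWorker, which loads and
// initializes it internally, so there is no separate step to hook into.
async function createEnglishWorkerV5(Tesseract) {
  return Tesseract.createWorker('eng');
}
```

Because the v5 call performs all three steps at once, a workaround that depends on doing something between them has no obvious place to run.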
Tesseract.js version (version number for npm/GitHub release, or specific commit for repo)
Describe the bug When running Tesseract.js in the browser, I'd like to pass the language data via a blob URL because of restrictions in the environment the code will be running in. However, when I pass the URL to langPath, it fails to load the file.
Device Version: Chrome Browser