naptha / tesseract.js

Pure Javascript OCR for more than 100 Languages 📖🎉🖥
http://tesseract.projectnaptha.com/
Apache License 2.0

GPU acceleration? #885

Closed: ccruttjr closed this issue 4 months ago

ccruttjr commented 4 months ago

The original Tesseract can be built and run with OpenCL. I was curious whether that is possible with tesseract.js, or whether it could be offered as an option in the future.

Here is the documentation for it:

https://github.com/tesseract-ocr/tessdoc/blob/e08ca44e9a37c21180250d4a924c607d72f3e642/TesseractOpenCL.md

Balearica commented 4 months ago

This is a good question given that (1) there are now ways for browser-based applications to use graphics hardware and (2) the Tesseract documentation, including the page you linked to, claims to support graphics acceleration. However, I do not believe that Tesseract benefits significantly from graphics acceleration (even on desktop), so I do not see the benefit as being worth the cost here.

Even in a desktop build with OpenCL enabled (it is disabled by default), only specific operations are performed using OpenCL, and these are not the operations that cause performance bottlenecks. A slide from 2014 shows a ~25% reduction in runtime using OpenCL. That is already not a massive gain, and the number probably overstates the benefit today: some of the steps it includes have either been removed in the years since or are skipped when using default settings.

Additionally, for documents that take a long time to recognize, a disproportionate amount of time is attributable to the core LSTM functions rather than the image processing operations. There is no GPU version of these functions in Tesseract. Therefore, my extremely rough, back-of-the-napkin guess is that we could achieve a 15% performance improvement by using graphics acceleration for certain operations, with the number being lower for the images that are actually problematic. This would require significant effort (not just enabling a build flag) for a modest performance gain. I am open to merging it in if somebody figures out how to do it, but I do not plan on working on this personally.
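The reasoning above follows the shape of Amdahl's law: when only part of the runtime is accelerated, the untouched part (here, the LSTM functions) caps the overall gain. A minimal sketch of that arithmetic is below. The fractions and the speedup factor are hypothetical values chosen for illustration, not measurements from Tesseract, and `runtimeReduction` is a made-up helper name.

```javascript
// Amdahl's law: if only a fraction of the runtime is accelerated,
// the untouched remainder bounds the overall gain.
// All numbers below are hypothetical, chosen only for illustration.

/**
 * Fraction of total runtime saved.
 * @param {number} acceleratedFraction share of runtime spent in GPU-eligible steps (0..1)
 * @param {number} stepSpeedup how many times faster those steps become (>= 1)
 * @returns {number} fraction of the original runtime that is saved (0..1)
 */
function runtimeReduction(acceleratedFraction, stepSpeedup) {
  // Runtime left over: the untouched part plus the shrunken accelerated part.
  const remaining = (1 - acceleratedFraction) + acceleratedFraction / stepSpeedup;
  return 1 - remaining;
}

// Hypothetical typical page: image processing is 20% of runtime and gets 4x faster.
const typical = runtimeReduction(0.20, 4); // about 0.15

// Hypothetical slow page: the LSTM dominates, image processing is only 5% of runtime.
const slowPage = runtimeReduction(0.05, 4); // about 0.0375

// Even an infinitely fast GPU cannot save more than the accelerated share itself.
const upperBound = runtimeReduction(0.20, Infinity); // about 0.20

console.log(`typical page:   ${(typical * 100).toFixed(1)}% less runtime`);
console.log(`slow page:      ${(slowPage * 100).toFixed(1)}% less runtime`);
console.log(`best case (20%): ${(upperBound * 100).toFixed(1)}% less runtime`);
```

Under these assumed numbers, the pages that take longest benefit least, because a larger share of their time sits in code that has no GPU version.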

Finally, it's also worth noting that the OpenCL build is disabled by default, and the developers who created it years ago appear to be inactive on the project. The OpenCL build has had bugs that do not appear in the default build, and based on various discussions in the Git issues (e.g. see here), there does not seem to be significant interest in working on it.