OCR Integration Feature Request

yosamsimiti commented 1 year ago

This open-source macOS app, Simple Comic, integrates OCR for comics using Apple's Vision framework, allowing users to highlight and copy text. https://github.com/MaddTheSane/Simple-Comic

It would be really nice if YACReader incorporated this, and took it a step further by allowing one to make notes and highlights, as you would with a regular PDF app. I know, it's a lot of code, but this would certainly set YACReader above and beyond any other comic book app out there. I see it as the ultimate evolution of comic book reading apps.

Just throwing this feature request out there!

selmf commented 1 year ago

The current implementation of the reader is not well-suited for such a feature. If I remember correctly, @luisangelsm and I discussed this a few years ago and postponed the implementation until we replace the code responsible for displaying and navigating comics with an implementation that is less of an image viewer and more of an actual document viewer.

As it is now, we are using stock Qt widgets and we are lacking basic features like high-quality upscaling, copy and paste from PDF documents, and some other stuff.

Since the amount of refactoring required for this and other advanced features is quite high, we will probably not work on this for the 9.x versions of YACReader.

There is however a good chance that we include something like this or at least the prerequisites needed to implement it in the 10.x versions.

yosamsimiti commented 1 year ago

The folks at Simple Comic have implemented it rather well, but of course, I don't know the coding difficulties involved. I really appreciate the time taken to respond, however.

A full-fledged, OCR-capable comic book reading app with the ability to highlight and take basic notes is really the next step in comic book reading, and ultimately it allows one to REALLY appreciate a comic better. PDF Expert for comics, essentially, was what I was looking for. Alas, nobody's gotten there yet.

Anyways, I appreciate the consideration and will look forward to the 10.x edition whenever it comes out.

luisangelsm commented 1 year ago

The iOS version has this feature. It is something that the Apple ecosystem offers, and it can be implemented relatively easily.

Implementing it on the desktop version of YACReader is much harder because the feature needs to work on all the supported platforms. I am not sure how hard it would be to do something macOS-specific...

yosamsimiti commented 1 year ago

MacOS is where one needs it most as highlighting and note-taking (of the typed variety) are primarily things we all do with a mouse and keyboard.

Again, Simple Comic has it in MacOS and it's open source (link in initial post), so I guess one could grab code or at least ideas from there. From my understanding the Apple Vision framework API actually does allow this.

Another app that might be worth looking at on the macOS platform is Superkey, by Rick Hanson. The app lets you search for anything on screen by typing it out, and then highlights it. I have used it extensively and it's a great example of the OCR capabilities of macOS's Vision framework. Here are the links for Superkey and Apple's Vision framework:

https://superkey.app https://developer.apple.com/documentation/vision

Again, I don't code. I just wanted to argue others have done it successfully on the MacOS platform.

Hope this helps.

DavidPhillipOster commented 1 year ago

I wrote the Simple Comic OCR implementation in https://github.com/MaddTheSane/Simple-Comic/tree/arc/Classes/Session/OCRVision/ . MIT license. OCRVision mostly wraps Apple's Vision framework, which is also available on iOS, and would work on an iOS UIImage or CGImage just as well as a macOS one. Selection is just a CGLayer on top of the image. But, in an iOS context there's much less you can do with the text. You could do text to speech, copy to clipboard, and you could have a Find command, and that's about it.

I had to re-implement macOS's NSTextFinder because, although I liked its Find U.I., its API was useless for searching a novel one page at a time. The actual engine of Find is basically -[NSString rangeOfString:options:range:]. I just had to add, and implement, the missing NSStringCompareOptions: StartWith and EndWith. Those are in OCR_NSString.h in the OCRVision directory.
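
As a rough illustration of the kind of search described above, here is a minimal C++ sketch, not the code from OCR_NSString.h. It assumes that "StartWith" and "EndWith" mean the match must begin or end at a word boundary; that reading is an assumption, and the names `MatchMode`, `findInPage`, and `findFromPage` are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical match modes; the word-boundary meaning is an assumption.
enum class MatchMode { Contains, StartsWord, EndsWord };

// Treat ASCII letters and digits as word characters (a simplification).
static bool isWordChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0;
}

// Returns the offset of the first qualifying match in one page of OCR text,
// or std::string::npos if there is none.
std::size_t findInPage(const std::string& page, const std::string& needle,
                       MatchMode mode, std::size_t from = 0) {
    if (needle.empty()) return std::string::npos;
    for (std::size_t pos = page.find(needle, from);
         pos != std::string::npos;
         pos = page.find(needle, pos + 1)) {
        const std::size_t end = pos + needle.size();
        const bool atWordStart = (pos == 0) || !isWordChar(page[pos - 1]);
        const bool atWordEnd = (end == page.size()) || !isWordChar(page[end]);
        if (mode == MatchMode::Contains ||
            (mode == MatchMode::StartsWord && atWordStart) ||
            (mode == MatchMode::EndsWord && atWordEnd)) {
            return pos;
        }
    }
    return std::string::npos;
}

// Result of a search over several pages, visited one page at a time.
struct Hit {
    std::size_t page;
    std::size_t offset;
    bool found;
};

// Scans pages in order starting at startPage and stops at the first match.
Hit findFromPage(const std::vector<std::string>& pages, const std::string& needle,
                 MatchMode mode, std::size_t startPage) {
    for (std::size_t p = startPage; p < pages.size(); ++p) {
        const std::size_t off = findInPage(pages[p], needle, mode);
        if (off != std::string::npos) return {p, off, true};
    }
    return {0, 0, false};
}
```

Searching page by page like this means each page's text only has to be recognized when the search reaches it, which is one plausible reason a whole-document Find API would be a poor fit.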

luisangelsm commented 1 year ago

@DavidPhillipOster thanks for the info, it is really appreciated. Like I said, the challenges here would be: 1) try to have a cross-platform solution, and 2) if a cross-platform implementation is too hard, deal with Qt and the underlying macOS views, but I am not too fond of 2 tbh.

Maybe Tesseract + some custom views could work.
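
To make the "custom views" half of that idea concrete, here is a minimal, engine-agnostic C++ sketch of text selection over OCR output. It assumes the OCR step (Tesseract or anything else) has already produced one bounding box per recognized word in image pixel coordinates; the `Rect` and `OcrWord` types and the `selectedText` function are hypothetical and come from neither YACReader nor Tesseract.

```cpp
#include <string>
#include <vector>

// Axis-aligned rectangle in image pixels; right and bottom are exclusive.
struct Rect {
    int left, top, right, bottom;
};

// One recognized word and where it sits on the page image (assumed OCR output).
struct OcrWord {
    std::string text;
    Rect box;
};

// True when the two rectangles overlap by at least one pixel.
bool intersects(const Rect& a, const Rect& b) {
    return a.left < b.right && b.left < a.right &&
           a.top < b.bottom && b.top < a.bottom;
}

// Joins, in OCR order, the words whose boxes touch the user's drag rectangle.
std::string selectedText(const std::vector<OcrWord>& words, const Rect& drag) {
    std::string out;
    for (const OcrWord& w : words) {
        if (!intersects(w.box, drag)) continue;
        if (!out.empty()) out += ' ';
        out += w.text;
    }
    return out;
}
```

A view would draw highlight rectangles for the same intersecting boxes on top of the page image and copy the returned string to the clipboard, so only the OCR step itself would need to differ between platforms.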

yosamsimiti commented 1 year ago

I don't mean to criticize anyone's approach to developing software, but if a feature can be built on one platform (macOS) and not on the others (Windows, Linux), that doesn't mean one should refrain from developing it for the sake of cross-platform consistency. I mean, if it can be implemented, why not just do it? The worst that could happen is that users of one platform might complain about not having a particular feature that exists on the other platform(s). To which I say, yeah, so? If you want a feature, tell the folks at Microsoft or Torvalds about their bad API or whatever. I just don't think you should stymie development of features on any platform. Instead, just develop on each platform as much as you can and let users deal with the differences.

Anyways, I should stress I am by no means a great coder. But this feature, being able to make rudimentary annotations in terms of highlighting and notes, seems like something so basic it should have been implemented eons ago. I mean, hitherto all that cbz and cbr files have been is collections of jpgs or pngs. OCR'd cbz and cbr elevates things to something substantially more productive. I can't tell you the number of times while reading intense graphic novels (Watchmen, Maus, the Game of Thrones graphic novel) when it would have been unbelievably helpful to have the ability to highlight and make notes. Also, I should argue, if you want YACReader user-ship to dramatically increase, adding OCR IS ESSENTIAL. It elevates the comic book reading app into a comic book AND PDF reading app, and that changes everything.

Anyways, I will stop MY whining haha. Again, not a coder but at times like this I seriously wish I was so that I could bring something substantial to the YACReader experience. Alas, I have other things to do so my apologies for whining. Haha.

DavidPhillipOster commented 1 year ago

@yosamsimiti please write to me directly. My email address is just my name at gmail.com. You've raised some issues that I'd like to correspond with you about, but they aren't relevant to this YACReader issue, #361.

luisangelsm commented 1 year ago

It is not completed :S

yosamsimiti commented 1 year ago

My apologies. I just assumed the conversation had reached a point where no one really had anything more valuable to say. I mean, OCR would be great, but it's hard to fully realize. I get that. So I figured I'd stop conversing there. Just didn't want to waste anyone's time thinking more about it as we all kind of just hit a brick wall.

@DavidPhillipOster I don't generally like to share my email with anyone on GitHub out of privacy concerns. If I have offended you in any way, I just want to apologize straight up. I really didn't mean to annoy or piss anyone off. I have a tendency to obsess over things needlessly and it's a quality I'm trying to improve on. So sorry in advance. Having OCR would be great, but by no means is it essential. I can live without OCR'd comics. There are far more important things to worry about in this little thing we all call life.

luisangelsm commented 1 year ago

@yosamsimiti the feature request is perfectly fine and it would be a great addition to the app. I don't have time to do it in the short term, but other developers may get interested in doing it, so keeping it open will give it visibility.

rodrigo-sys commented 10 months ago

There is an open-source reader app with an OCR feature called seeneva. Here is their repo with the helper scripts to train the ML model: ml scripts.
