johnfactotum / foliate

Read e-books in style
https://johnfactotum.github.io/foliate/
GNU General Public License v3.0
5.35k stars 260 forks

[Feature request] Add pdf and djvu support #139

Open ghost opened 5 years ago

ghost commented 5 years ago

Hi, please consider adding PDF and DjVu support to this awesome e-book viewer so it can become a universal e-book viewer. Thanks.

artemisresende commented 4 years ago

Hello @johnfactotum, I suggest implementing PDF support with Mozilla's PDF.js project. I think it's a good starting point because it uses only ES6, without other libraries, and it's very efficient.

PDF reading would be a good addition to the app, because the app is amazing! Together with other new features like book listing (as suggested in other requests), this would be great.

If you need a pull request, I can study the app and write one when I have some time.

Lastly, I have to applaud you because your work is fantastic!!

johnfactotum commented 4 years ago

PDF.js is probably the more sensible option to go with, although I haven't really looked much into it. Other possible ways include: Poppler (probably too low level), Evince, or convert to EPUB (easiest option but the result will be unsatisfactory).

I see that this is a popular request, but I'm a bit unsure whether it would really be a good idea to add PDF/DjVu support because

Basically, I don't see a lot of advantage in using Foliate to view PDF/DjVu files. Maybe one gets to use the dictionary/translation/TTS tools? Or sync reading locations? But those features can more easily and sensibly be implemented for other PDF viewers or as standalone programs than by adding PDF/DjVu capabilities to Foliate.

That being said, I will certainly welcome pull requests, as long as they're not overly complicated and don't deviate too much from the goals of the project.

artemisresende commented 4 years ago

Yeah, you're right, it could be hard and not too attractive for the project.

Initially I suggested this because I think it's easier for us, from the user's perspective, to centralize our documents in a single piece of software and enjoy all these features, as you mentioned.

technodrome commented 4 years ago

Thanks for the project, it looks very promising. However, I wholeheartedly do not agree there is no need to include PDF support. Quite the opposite. Linux PDF readers are awful, to put it mildly. There is literally only one reader which can invert colors properly - that is configurable black or dark grey background and whiteish text - Sumatra and even that runs only on wine/crossover. Forget highlights/exports. So having one reader which supports both massively popular leading book formats - EPUB and PDF - makes absolutely total sense. Having only one of those will make mass adoption of this possibly very nice and useful software only that - half baked.

I do hope this much-needed feature lands as mozilla's pdf.js is a very nice project and there is support for theming. Highlights and notes are just what every real student/programmer/professional needs. No more proprietary formats.

Add that, notes and highlights with JSON export and you are literally the king of readers. Forget overbloated nonsense like Calibre.

johnfactotum commented 4 years ago

Yes, the lack of a proper night mode is what bugs me about Evince, which I otherwise happily use for PDF files. I think in Zathura you can configure light and dark colors, although I haven't really tried it myself. The current CSS filter based invert in Foliate isn't satisfactory, either. Maybe they can be improved with SVG filters, but I'm not sure.

I'm currently working on a GtkBuilder-based rewrite of Foliate, that will make things cleaner, more maintainable, and address the problems raised in #176. It will make it easier to implement PDF support, because the current code is kind of messy.

itprojects commented 4 years ago

@johnfactotum If one day you do decide to "test" if PDF support could work, perhaps the following could be of use:

  1. Make a new app/project

com.github.johnfactotum.Foliate.pdf

  2. Start with the simplest PDF.js implementation

This is to have one working button - open.

  3. Allow/encourage other people to develop the new app in a direction convergent with Foliate. You become a code reviewer for that project.

If these steps produce something of value that can be merged with Foliate, then the two become one. If not, then no loss - creativity is not a linear process.

Note: the fact that you can make a JS app in GNOME Shell, and have it be as powerful as a version written in a traditional programming language (C++/Java/Python), is simply mind-blowing.

johnfactotum commented 4 years ago

For what it's worth, I downloaded PDF.js's prebuilt viewer and used a plain WebView to open it. It works -- you don't even have to implement an "open" button because the viewer already has one, and it can open everything just fine.

But the experience isn't great. It's kind of slow -- a lot slower than your usual PDF.js in Firefox, and for some reason there's no kinetic scrolling. It's easy to test this by simply opening the PDF.js demo in Epiphany. Perhaps it's a WebKit problem (WebKit is not fully supported by PDF.js, according to its FAQ). Maybe one could try making a custom viewer, instead of building on the default one, but it's going to take more time and effort.

It seems to me that it might be easier to use Evince (as a library, not the app) instead. Evince has a clean API directly accessible through GObject introspection, and it supports more formats and performs better than PDF.js.

But in any case, how to implement annotations is going to be the biggest problem. It would be very easy to add PDF support if annotations aren't needed.

itprojects commented 4 years ago

If you're going to implement Evince, maybe it's better to just close this feature request.

Evince already works well enough as a separate app!

technodrome commented 4 years ago

@johnfactotum I am not sure how Evince handles color inversion. Literally every other run-of-the-mill PDF viewer has the same old dumb features with a black-on-white approach. But look at what is happening around us: more and more apps, even market-leading operating systems, are adding dark mode as the default. It is becoming mainstream; not everyone loves those hideously glaring white screens so omnipresent on Android.

I have problems with my eyesight so I cannot look at that brutal white background. It is just awful. I am using Dark Reader as a Chrome extension, which intelligently darkens (inverts) 98% of webpages to dark mode. It is fantastic. It is dynamic and overridable. Maybe a look at the source code, since it is JavaScript, could hint at a thing or two about possible approaches to implementing such beautifully customizable features (its UI offers per-page CSS overrides), which we literally desperately need in order to have at least one normal reader that can open worldwide popular formats. I mean, it is the 21st century and there still is not any clever and sleek app available to do this kind of thing! Just writing this I cannot believe it is so.

Annotations are a big issue, in my opinion. People need them. Researchers do, programmers do, too. Think of people reading books in a foreign language: the same applies. Without them, you feel like someone is holding one of your hands behind your back so you cannot resume your normal workflow. If you want this app you created (a very nice one, I dare say) to be a real thing with real recognition and worldwide usage, you will need annotations, exports, color highlights. This is the basic toolkit of a student or pro, whatever they do. Can't really get around it. If you don't add this, this project just becomes yet another of those countless hapless open-source apps which tried to reinvent the wheel, started with massive enthusiasm, only to crash and burn badly because they simply did not take user workflows into consideration and tried to strong-arm the user into something unintuitive.

I do hope I have finally found the ultimate app which would finally, after years, enable me to study and learn with ease. It is just like the difference between Java and Ruby: both will do the job eventually, only the pain involved makes the difference of great versus poor experience for the user.

So fingers crossed!

itprojects commented 4 years ago

@technodrome Thank you for saying, what many of us are thinking.

About colour inversion: mupdf and zathura have colour inversion via a tint mode, and it works great. The only problem is that there is no GUI, it's just keyboard shortcuts. [Zathura requires a small config file change.]

Zathura mode

For those who want to try the zathura feature in the above images:

Make a file /home/MYUSERNAME/.config/zathura/zathurarc

set recolor-lightcolor \#bea58b
set recolor-darkcolor \#000000
set default-bg \#bea58b

Then Ctrl+R to change colours.
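
For anyone who prefers not to create the file by hand, here is a minimal sketch that appends the same three settings from a shell. The function name `write_zathurarc` is made up for this example, and the fallback from `$XDG_CONFIG_HOME` to `~/.config` is an assumption about where zathura looks for its config; adjust the path if your setup differs.

```shell
#!/bin/sh
# Append the three recolor settings quoted above to zathurarc.
# Assumption: zathura reads $XDG_CONFIG_HOME/zathura/zathurarc,
# falling back to ~/.config/zathura/zathurarc.
write_zathurarc() {
    conf_dir="${XDG_CONFIG_HOME:-$HOME/.config}/zathura"
    mkdir -p "$conf_dir" || return 1
    # The quoted heredoc delimiter keeps the backslashes literal, so the
    # "\#" spelling from the comment above ends up in the file unchanged.
    cat >> "$conf_dir/zathurarc" <<'EOF'
set recolor-lightcolor \#bea58b
set recolor-darkcolor \#000000
set default-bg \#bea58b
EOF
}
# write_zathurarc   # uncomment to actually write the file
```

The sketch appends rather than overwrites, so an existing zathurarc keeps its other settings.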

navidR commented 4 years ago

If you're going to implement Evince, maybe it's better to just close this feature request.

Evince already works well enough as a separate app!

@itprojects No, it doesn't. Evince is an old code base that lacks a lot of features. The whole traction for Foliate came from it providing a modern reading experience for users. By modern, I mean annotations, dictionary, etc. The dictionary and lookup features in particular are extremely important for people whose first language is not English.

itprojects commented 4 years ago

@navidR Evince does have plenty of technical debts to pay.

Just to clarify:

Evince is an old code base with a lot of lacking the feature.

That's exactly why it would make sense to close the feature request, instead of implementing Evince.

Using PDF.js may or may not yield a different outcome.

Evince already works well enough as a separate app!

The focus is on the words "well enough", it's not perfect, nothing ever is.

johnfactotum commented 4 years ago

@navidR Lookup is indeed a very important feature. I think ideally it should be implemented at the toolkit or desktop level, so it can be used by all applications (like the Look Up feature in macOS). It probably makes more sense to do it in the shell, as GNOME Shell already has a search provider API that can be used to perform the lookup with various apps. Maybe there should be a portal for this.

navidR commented 4 years ago

I think the GNOME Foundation is too busy funding their useless animations. So for the time being, if we can get it to work in Foliate for both PDF and EPUB, that would be a wonderful step.

ror6ax commented 4 years ago

https://github.com/RussCoder/djvujs seems to be alive and providing necessary functionality.

digitalethics commented 4 years ago

PDF-based ebooks may be hundreds of pages long and contain graphics that require fast PDF rendering engines. Alfresco has benchmarked most of them in their PDF rendering engine performance and fidelity comparison, with MuPDF[License] and pdfium coming out on top. Security-wise and given its licensing the latter is probably the best choice, performance- and feature-wise probably MuPDF. Can anyone help investigate what the current state of annotation features is for these two libraries?

timonson commented 4 years ago

I would also like to mention MuPDF as a possible basis for implementing the PDF feature.

aquaspy commented 4 years ago

I would also love PDF support so I can use only Foliate (the majority of my files are PDFs, and sometimes I really prefer not to convert them so I can keep them readable on all my platforms).

(I approve adding PDF without annotations, at least for now.)

Dr-Terrible commented 4 years ago

But in any case, how to implement annotations is going to be the biggest problem.

The W3C's standard for annotations covers both EPUB and PDF formats; its adoption was already proposed here https://github.com/johnfactotum/foliate/issues/249#issuecomment-593418366.

There's no need to implement the entire RESTful part of the W3C standard (which covers both remote sharing and online publishing platforms). Annotating only local documents is good enough as a starting point.

As a side note, consider that in academia all the commercial software for bibliographic publication/review uses the W3C standard for annotations internally. Exporting/importing notes from those programs would be a breeze.
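
To make the "local docs only" idea concrete, here is a rough sketch that prints one annotation in the W3C Web Annotation data model, targeting a page of a local PDF through a `FragmentSelector` that follows RFC 3778 (`page=N`). The function names, the example `id` and the file URI are invented for illustration; nothing here is Foliate code, and only backslashes and double quotes are escaped in the note text.

```shell
#!/bin/sh
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Newlines and control characters are not handled in this sketch.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Print a Web Annotation for one page of a local PDF.
# Arguments: annotation id, source URI, page number, note text.
make_pdf_annotation() {
    anno_id="$1"; source_uri="$2"; page="$3"
    note=$(json_escape "$4")
    cat <<EOF
{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "$anno_id",
  "type": "Annotation",
  "body": { "type": "TextualBody", "value": "$note" },
  "target": {
    "source": "$source_uri",
    "selector": {
      "type": "FragmentSelector",
      "conformsTo": "http://tools.ietf.org/rfc/rfc3778",
      "value": "page=$page"
    }
  }
}
EOF
}
```

Calling `make_pdf_annotation "urn:example:1" "file:///tmp/book.pdf" 10 "check this proof"` prints a JSON document that could be stored next to the book. An EPUB target would presumably use a different selector (for example an EPUB CFI fragment), which is not shown here.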

johnfactotum commented 4 years ago

The W3C's standard for annotations covers both EPUB and PDF formats

That's good, but I was referring more to how to implement them using PDF.js or other PDF backends. Most PDF libraries are pretty low level, and even basic stuff like having a selectable text layer is not trivial.[1] The higher level ones might not be easy to modify/extend to work well with the annotation model.

What makes Epub.js so easy to use is that it's high-level enough that you don't need to spend time redoing all the basic things like loading, paginating, adding highlights, etc., but at the same time it remains easily extendable and customizable. With most PDF libraries, it feels that they are either way, way too low level, or there's a built-in UI that can't be easily changed or integrated. (Ultimately, EPUB as a format is itself at a much higher level than PDF, so I guess that's a big reason, too.)

I'm not saying that it's necessarily going to be difficult. And obviously I'm not against having this feature. I guess my points are

[1] Lector, for example, renders PDFs as images. No annotations, no searching, no nothing.

johnfactotum commented 4 years ago

@navidR For what it's worth, right now you can get Gnome Dictionary to look up selected text in Evince with a shortcut, by using a script:

#!/bin/sh
gnome-dictionary --look-up "$(wl-paste --primary)"

Here we use wl-clipboard to get the selected text on Wayland. On X11 one can use xclip.

Then go to GNOME Settings > Keyboard Shortcuts, and set a custom shortcut to run this script. That's it! You can now look up words in Gnome Dictionary from any app!
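
A variant of the script above that picks the clipboard tool by session type could look roughly like this. It assumes `XDG_SESSION_TYPE` is set to `wayland` or `x11` (common on systemd-based desktops, but not guaranteed) and that `wl-clipboard` or `xclip` is installed; the function names are just for this sketch.

```shell
#!/bin/sh
# Print the command that reads the primary selection for a session type.
selection_cmd() {
    case "$1" in
        wayland) echo "wl-paste --primary" ;;           # from wl-clipboard
        *)       echo "xclip -o -selection primary" ;;  # X11 fallback
    esac
}

# Look up the currently selected text in GNOME Dictionary.
lookup_selection() {
    # Assumption: XDG_SESSION_TYPE is "wayland" or "x11"; default to X11.
    cmd=$(selection_cmd "${XDG_SESSION_TYPE:-x11}")
    # $cmd is intentionally unquoted so it splits into command and arguments.
    gnome-dictionary --look-up "$($cmd)"
}
# lookup_selection   # uncomment when saving this as the shortcut script
```

With the last line uncommented, the same file can be bound to a custom shortcut on either display server.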

csrgxtu commented 3 years ago

feature request to support pdf

poke1024 commented 3 years ago

@johnfactotum Really love foliate. Having some support for PDFs would be so great.

I use Foliate as a replacement for Apple Books, i.e. as an e-book library manager. For that purpose, it would already be very beneficial if I could add PDF files to the library. Clicking on them could then open them in the standard PDF reader. Having PDF files in the library, searchable by name, would already be a great addition.

I think the whole discussion here about including a first-class PDF reader with lookup is sort of too complex for a first step.
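
As a rough illustration of how small that first step is, the idea can be approximated outside Foliate with a few lines of shell: search a folder for PDFs by name and hand the match to the default viewer with `xdg-open`. The function names and the idea of a single library directory are assumptions made for this sketch, and `-iname` is a GNU/BSD `find` extension.

```shell
#!/bin/sh
# List PDFs under a library directory whose file name contains a query
# (case-insensitive). Glob characters in the query are treated as patterns.
find_pdfs() {
    lib_dir="$1"; query="$2"
    find "$lib_dir" -type f -iname "*${query}*.pdf" | sort
}

# Hand the first match to the desktop's default PDF viewer.
open_first_pdf() {
    match=$(find_pdfs "$1" "$2" | head -n 1)
    [ -n "$match" ] || { echo "no match for: $2" >&2; return 1; }
    xdg-open "$match"
}
```

For example, `open_first_pdf ~/Books dune` would open the first PDF under `~/Books` with "dune" in its name.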

ghost commented 3 years ago

+1 for pdf/djvu support. it would be awesome

larrasket commented 3 years ago

+1

VarLad commented 3 years ago

Well, now let's talk about the complexity of being able to support both.

Let's start with DjVu. Any of the devs, any ideas?

itprojects commented 3 years ago

Massive and sustained effort will be required to even get to the level of the main Foliate features.

Complexity in this case: writing two new apps, almost completely unrelated to Foliate.

[com.github.johnfactotum.Foliate.pdf]

[com.github.johnfactotum.Foliate.djvu]

To maintain the three formats PDF, DJVU, and EPUB, at least one dedicated developer for each will be required, all year round.

The Library will have to be re-designed.

Using Foliate in combination with Evince makes more sense. The DJVU/PDF Foliate will look and feel too much the same, in every case, because Foliate (rightly) adheres to Gnome design guidelines.

yozachar commented 3 years ago

csbooks does that, but I don't think it's open source. AUR

VarLad commented 3 years ago

But evince doesn't support djvu

Is there any similar lightweight alternative for djvu?

itprojects commented 3 years ago

@VarLad Evince does support DJVU. Do you have the evince-common package?

A lightweight alternative would be zathura (with zathura-djvu zathura-pdf-poppler).

MuPDF? Opens most file formats. Epub and PDF.

im-n1 commented 2 years ago

Just wanna add +1 to the PDF support. It's 2022 and there's still no great PDF reader on Linux that can remember where I left off.

ghost commented 2 years ago

@im-n1 zathura remembers that

knakamura8 commented 2 years ago

Is there any intention to implement this, or some roadmap that I cannot seem to find on the wiki? I appreciate that it seems complex to implement (read the above discussion). That having been said, some of the posts are dated months/years back, so I am curious as to whether or not the position is maintained that this is still a wontfix. Anyway, really enjoy the utility, regardless, cheers to the contributors and maintainers.

StanczakDominik commented 2 years ago

I'd just like to point out that KDE's Okular is great for PDFs, and it doesn't look like it's been mentioned elsewhere in the thread.

johnfactotum commented 2 years ago

Is there any intention to implement this, or some roadmap that I cannot seem to find on the wiki? I appreciate that it seems complex to implement (read the above discussion). That having been said, some of the posts are dated months/years back, so I am curious as to whether or not the position is maintained that this is still a wontfix. Anyway, really enjoy the utility, regardless, cheers to the contributors and maintainers.

I believe at this moment it is still very unlikely to get fixed in the near future. That being said, I do have some thoughts on how this could be implemented eventually.

I think I didn't really think things through on this. My previous statement that it would reuse close to zero code is probably false. Foliate already has basic support for fixed layout EPUBs, and it needs to support them no matter what.

So the most sensible approach would be essentially converting PDF files to fixed layout EPUBs, except preferably in an on-the-fly, on-demand way.

So the key is probably to improve support for fixed layout EPUBs first. Then it should be relatively straightforward to add support for other fixed layout formats.
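
To illustrate what "on-the-fly, on-demand" means in practice, here is a sketch that renders a single page only when it is first requested and reuses the result afterwards. Note that it swaps in Poppler's `pdftoppm` command-line tool as a stand-in renderer, not PDF.js or anything in Foliate; the function names, the cache layout and the 150 dpi resolution are arbitrary choices for the example.

```shell
#!/bin/sh
# Cache file for one rendered page; the naming scheme is made up for this sketch.
page_cache_path() {
    printf '%s/page-%s.png\n' "$1" "$2"
}

# Render one page on demand and print the path of the resulting image.
# Arguments: PDF file, page number, cache directory.
render_page() {
    pdf="$1"; page="$2"; cache_dir="$3"
    out=$(page_cache_path "$cache_dir" "$page")
    # Already rendered earlier: reuse it without touching the PDF.
    [ -f "$out" ] && { echo "$out"; return 0; }
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 2; }
    mkdir -p "$cache_dir" || return 1
    # -f/-l select the first and last page; -singlefile writes "<prefix>.png".
    pdftoppm -f "$page" -l "$page" -png -r 150 -singlefile "$pdf" "${out%.png}" || return 1
    echo "$out"
}
```

Only the pages a reader actually visits get rendered, which is the property that avoids converting the whole document up front.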

Noobao commented 2 years ago

Adding +1 to the PDF support. Foliate is a really amazing application, and this is currently the only feature it lacks for it to be my main reader for all my books.

martinpescador commented 1 year ago

+1 .pdf, etc.

My use case involves both .epub and .pdf in the same workflow, namely research. I came here because I was somewhat puzzled by the absence of .pdf support. I haven't had the time or inclination to look at the different implementations of readers, and I am only a user of these apps, but I like Foliate, and having to use more than one app for the same search is annoying.

Ideally I'd like to have a fully featured reader with integrated file management that allows complex search across multiple files and can connect to a bibliographic database, like Zotero. A diverse ecology with different apps is cool; there is freedom to create and scratch your particular itch. All good. Though there is certainly scope for software development in this area.

knakamura8 commented 1 year ago

Will this be part of the roadmap within #962?

johnfactotum commented 1 year ago

Yes. One important feature of the new renderer is that it doesn't require all sections and resources to be loaded. Without this, either you have to convert the whole PDF to EPUB at once, or you'd need a totally different renderer that doesn't share any code with the EPUB renderer. Now it would be possible to implement it by using PDF.js for the rendering of individual pages, but using the same programming interface for handling inter-page layout as fixed-layout EPUB/Kindle or CBZ books.

The fixed layout renderer really needs to be improved first, though. Most importantly, it currently lacks zooming and continuous scrolling. Though I guess this need not be a blocker, as some support would be better than no support at all.

dejalavidavolar commented 1 year ago

Yes. One important feature of the new renderer is that it doesn't require all sections and resources to be loaded. Without this, either you have to convert the whole PDF to EPUB at once, or you'd need a totally different renderer that doesn't share any code with the EPUB renderer. Now it would be possible to implement it by using PDF.js for the rendering of individual pages, but using the same programming interface for handling inter-page layout as fixed-layout EPUB/Kindle or CBZ books.

The fixed layout renderer really needs to be improved first, though. Most importantly, it currently lacks zooming and continuous scrolling. Though I guess this need not be a blocker, as some support would be better than no support at all.

thanks!!!!!

pradyumnac commented 1 year ago

Among all readers, Foliate is the best in terms of snappiness and UX. Only PDF support is missing.

Are you guys working on this (especially PDF support)? That's the feature I miss most, as others have mentioned in the thread.

I don't have much experience in JS (renderer), but I will be happy to help in any way possible.

jiiiijiij commented 1 year ago

Add pdf and foliate will rule

payrim commented 1 year ago

Could you please just make it so we can add PDF books to the library and open them with external apps (such as zathura)? I like all my books to be in one place. <3<3

loynoir commented 10 months ago

Would be nice to have .pdf support within foliate, as both calibre and okular support it.


FYI

https://github.com/search?q=repo%3AKDE%2Fokular%20pdftohtml&type=code

https://github.com/kovidgoyal/calibre/blob/master/src/calibre/ebooks/pdf/pdftohtml.py#L28-L36

https://github.com/kovidgoyal/calibre/blob/master/bypy/linux/__main__.py#L46

https://github.com/kovidgoyal/calibre/blob/master/bypy/sources.json#L378


@johnfactotum

I suggest Foliate use pdftohtml as a workaround, the same way calibre does.
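
For reference, a bare-bones invocation of Poppler's `pdftohtml` might look like the sketch below. The wrapper function `pdf_to_html` and its exit codes are invented for this example, and the flags shown are only a minimal choice; the option set calibre actually passes is in the files linked above.

```shell
#!/bin/sh
# Convert one PDF to a single HTML file with Poppler's pdftohtml.
# Arguments: input PDF, output base name (pdftohtml derives the file name from it).
pdf_to_html() {
    pdf="$1"; out_base="$2"
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 2; }
    command -v pdftohtml >/dev/null 2>&1 || { echo "pdftohtml not found" >&2; return 3; }
    # -noframes: one HTML document instead of a frameset;
    # -enc UTF-8: encoding used for the extracted text.
    pdftohtml -noframes -enc UTF-8 "$pdf" "$out_base"
}
```

The result is reflowed HTML rather than a faithful page image, so layout-heavy PDFs will likely look different from the original.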

johnfactotum commented 9 months ago

Some PDF support has been added in 1512c9d31feaa9c9a3cf5a9085c3c0065baf9de9. I must add that it's very basic, a word that here really means "fairly terrible", as in highly buggy and experimental.

The up side is that it was very easy to implement with the new renderer's architecture (only ~100 lines of Foliate's own code).

To make it usable, though, the fixed layout renderer really needs to be improved. It'd probably be rewritten at some point.

Jose-jme commented 3 months ago

Hello, please consider adding support for the pdf and djvu formats to this amazing e-book viewer so it can become a universal e-book viewer. Thanks.

* [x] pdf

* [ ] djvu

Hello, so then let's start the project

Jose-jme commented 3 months ago

Cantaten

aehlke commented 1 month ago

Would love to borrow the pdf.js dark mode extensions from https://github.com/shivaprsd/doqment