hellerbarde / stapler

A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk

Question: changing the dependency from python-pypdf2 to python-mupdf #78

Open m040601 opened 4 years ago

m040601 commented 4 years ago

Considering that python-pypdf2 (https://github.com/mstamy2/PyPDF2) and even python-pypdf3 (https://github.com/mstamy2/PyPDF3), by the same author, haven't seen any update in the last 2 years, would a change to python-mupdf (https://github.com/pymupdf/PyMuPDF) be of any interest for stapler?

Additional info:

Is PyPDF2 dead? Is there an alive fork? · Issue #571 · mstamy2/PyPDF2 https://github.com/mstamy2/PyPDF2/issues/571

The last commit was from 2018, there are 87 open PRs and 263 open issues. It seems as if the project is dead. Is there an alive fork?

python - Maintained alternatives to PyPDF2 - Stack Overflow https://stackoverflow.com/questions/63199763/maintained-alternatives-to-pypdf2

...not only PDF but also XPS, OpenXPS, CBZ, CBR, FB2, and EPUB formats, .... is hosted on GitHub.....also are registered on PyPI....by far the fastest in all aspects ...Its performance stats are also very promising....

Of special note in python-mupdf is also:

Interesting PDF manipulation and generation functions have been added over time, including metadata and bookmark maintenance,....

See also my notes on the importance of the "metadata" editing thing, https://github.com/hellerbarde/stapler/issues/39

corollari commented 4 years ago

Another possible alternative would be PyPDF4, which received its last update about 2 months ago. See also issue #48.

MartinThoma commented 4 years ago

Although it has a lot of stars, this part of the README makes me doubt that the project is properly managed:

While PyPDF4 will continue to be available at no charge, I have strong plans for better ongoing support to start in August 2018.

Homepage (available soon): http://claird.github.io/PyPDF4/.

The link gives a 404.

MartinThoma commented 4 years ago

Also, look at the commit history:

This looks very much like a side project with very little attention.

tjquinn1 commented 4 years ago

PyPDF4 doesn't have documentation of any sort, which should raise a red flag.

corollari commented 4 years ago

PyPDF4 seems to have several flaws. On the other hand, PyMuPDF only provides bindings for another library, meaning that on some systems using stapler would require compiling that library from source (generally not a good user experience). Does anyone know of any other alternatives, or have any input on that?

Frenzie commented 4 years ago

I like the MuPDF codebase quite a bit (and I'm sure these Python bindings must be fine), but it seems like it might be a bit (or a lot) of overkill? See https://ghostscript.com/~robin/mupdf_explored.pdf for some of the things you can do with it. As a user I don't really care as long as it works. :-)

MartinThoma commented 4 years ago

Overview of MuPDF:

Overview of PyMuPDF:

odiebojangles commented 3 years ago

PyMuPDF doesn't allow for working with files in bitstream. That's a killer for a lot of projects right there.

captn3m0 commented 3 years ago

https://github.com/sfneal/PyPDF3 seems to be better maintained than PyPDF4. Three patch releases this year: https://github.com/sfneal/PyPDF3/blob/master/CHANGELOG. No docs though.

I migrated https://github.com/captn3m0/pystitcher from PyPDF2->PyPDF3 and it was seamless.
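For a project weighing a similar swap, one way to keep the dependency choice open is to resolve the PDF module at import time. This is only a minimal sketch, not how pystitcher or stapler does it: the helper name `load_pdf_backend` and the preference order in `CANDIDATES` are hypothetical, and it assumes the candidate packages expose the same class names, which should be checked against the versions actually installed.

```python
import importlib

# Candidate module names in order of preference.
# This ordering is an assumption made for the sketch.
CANDIDATES = ("PyPDF3", "PyPDF2")


def load_pdf_backend(candidates=CANDIDATES):
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This candidate is not installed; try the next one.
            continue
    raise ImportError(
        "none of these PDF libraries is installed: " + ", ".join(candidates)
    )
```

The rest of the code would then refer to the returned module (for example `pdf = load_pdf_backend()`) rather than importing one package by name, so changing the preferred library becomes a one-line edit.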

MartinThoma commented 2 years ago

PyPDF2 is now maintained again. I'm the maintainer :-)