Open riker1 opened 9 years ago
It's probably time to actually put pdftailor out there.
We've been using it under the hood in production for a year now, and while it doesn't replace all of pdftk, it does enough for docsplit to get its job done.
You can skip the pdftk installation process, gem install pdftailor instead and docsplit will work fine.
Oh another note though, it does use iText under the hood, so if you're worried about iText's AGPL license, pdftailor's not going to help you much there.
thanks for the info. i'm not worried about the AGPL license per se.. it seemed that in the threads that may be a redistribution issue? i'm not sure how that works -- I'm a sysadmin not a lawyer (lol).
After some searching we've found http://qpdf.sourceforge.net/ which seems to be a good replacement for pdftk (at least for encryption).
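For anyone landing here for the encryption part, this is roughly what it looks like with qpdf. It is only a sketch: the function names, file names and passwords are placeholders I made up, and the 256-bit key length is just one of the lengths qpdf accepts.

```shell
#!/bin/sh
# Sketch only: function names, file names and passwords are placeholders.

# Encrypt a PDF with a user password and an owner password (256-bit key).
encrypt_pdf() {
    # $1 = input, $2 = output, $3 = user password, $4 = owner password
    qpdf --encrypt "$3" "$4" 256 -- "$1" "$2"
}

# Write an unencrypted copy, given a valid password.
decrypt_pdf() {
    # $1 = input, $2 = output, $3 = password
    qpdf --password="$3" --decrypt "$1" "$2"
}

# Usage (not run here):
#   encrypt_pdf in.pdf locked.pdf userpw ownerpw
#   decrypt_pdf locked.pdf plain.pdf userpw
```

Wrapping the calls in functions is only for readability; the two qpdf command lines are the part that matters.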
We switched to paid PHP packages, since they don't have OS dependencies: http://www.setasign.com/products/
I am a pdftk power user and the dependency problems with GCJ seem to be a big problem for us. We are soon updating our servers to Fedora 22 and CentOS 7.
Can somebody please give us some information about the current plans for the future of pdftk? Is it in active development or not? What’s the progress on moving away from GCJ?
If not, are there any alternatives for filling out pdf forms from the command line? Thanks in advance.
We had to switch to a paid product, but the product is good and reliable
https://www.setasign.com/products/setapdf-formfiller/details/
Thanks for your answer.
- Does SetaPDF have any critical dependencies? Which libraries do they use?
- I cannot imagine that there is no other open source tool like pdftk to fill out pdf forms from the command line. But it seems that this is the case. Have I overlooked something?
I did some research on other open source alternatives but could not find any at the time, which was 8 months ago. Not sure if you will have better luck. As for the dependencies, SetaPDF lists its system requirements on the following page:
https://www.setasign.com/support/faq/setapdf/system-requirements/#p-88
Thanks and Regards, Jeetendra Pujari
just a heads up, we're slowly replacing pdftk's feature set w/ PDFium which we've wrapped up into PDFShaver.
At the moment tho we're just using PDFium + FreeImage to generate snapshots of pages.
@knowtheory Would you recommend using PDFShaver over GraphicsMagick for generating the images that Tesseract performs OCR on?
Given this issue is still open, I would like to point out that there is now a Yum repository at https://copr.fedoraproject.org/coprs/robert/pdftk/ serving a pdftk RPM package for RHEL/CentOS 7 – because I just needed PDFtk myself. For the long term, however, a switch (as already mentioned before) might be wise, rather than depending on retired software projects.
Hello all. pdftk / CentOS 7 compatibility is a big problem for me. Also the copr solution is not supported by Rackspace, my sysadmin. pdftk is clearly the best solution but it is not actively maintained and it has legacy which has gone stale. Of course the solution is simple -- fork it!
My company will contribute a bounty of $1,000 to "fix" this issue, which will of course require a LOT of effort and rewriting. We may increase that further, and I invite others to add to that bounty if you can. I will use Bountysource. I will solicit to others that use pdftk (see https://github.com/search?utf8=%E2%9C%93&q=pdftk). I might even get a GitHub ban / warning for this. Oh well, I break rules sometimes.
Before we can offer a bounty, I need to be sure somebody won't collect the bounty and mess everything up. Would somebody here be willing to help with adding a couple VERY simple test cases to the fork and Travis CI integration?
The fork is at https://github.com/fulldecent/pdftk and I have added this information to the README. I would appreciate your thoughts to help make this a success!
I was able to get pdftk working on CentOS 7 by using these two repos.
- https://copr.fedorainfracloud.org/coprs/robert/gcj/
- https://copr.fedorainfracloud.org/coprs/robert/pdftk/
These commands will get you fully up and running.
wget https://copr.fedorainfracloud.org/coprs/robert/gcj/repo/epel-7/robert-gcj-epel-7.repo -P /etc/yum.repos.d
wget https://copr.fedorainfracloud.org/coprs/robert/pdftk/repo/epel-7/robert-pdftk-epel-7.repo -P /etc/yum.repos.d
yum install pdftk
For extracting/splitting pages, ghostscript works great.
Also good: poppler. It provides pdfseparate and pdfunite.
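In case it helps, here is a rough sketch of splitting and merging with those tools. The function names and file names are placeholders of mine, and the ghostscript options are one common way to pull out a page range, not the only one.

```shell
#!/bin/sh
# Sketch only: function names and file names are placeholders.

# Split every page of a PDF into its own file using poppler's pdfseparate.
# "%d" in the output pattern is replaced by the page number.
split_pages() {
    # $1 = input PDF, $2 = output directory
    mkdir -p "$2" && pdfseparate "$1" "$2/page-%d.pdf"
}

# Merge several PDFs into one using poppler's pdfunite.
# The last argument is the output file.
merge_pdfs() {
    pdfunite "$@"
}

# Extract a page range into a new PDF with ghostscript.
extract_range() {
    # $1 = input, $2 = output, $3 = first page, $4 = last page
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
       -dFirstPage="$3" -dLastPage="$4" \
       -sOutputFile="$2" "$1"
}

# Usage (not run here):
#   split_pages report.pdf pages
#   merge_pdfs pages/page-1.pdf pages/page-2.pdf first-two.pdf
#   extract_range report.pdf middle.pdf 2 4
```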
Coltcox's script works like a charm. All hail to Robert for providing this solution.
I have installed PDFtk using Robert's repo. It's installed correctly, but I am using it for form fill, which doesn't work.
cpdf (Coherent PDF Command Line Tools) does everything that pdftk can do, and a lot more, except for filling PDF form fields. It's freely available (not-for-commercial-use license) from GitHub, and its homepage is at http://community.coherentpdf.com. Due to the issues discussed in this thread I switched over to it around six months ago, in place of pdftk, and have been a very happy user. Check out its user manual at that link for the full list of features.
I think filling out forms is the killer feature why we all use pdftk.
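For readers who have not used that feature, form filling with pdftk looks roughly like this. This is a sketch: the function names and file names are placeholders, and the FDF file is assumed to exist already with field names matching the form.

```shell
#!/bin/sh
# Sketch only: function names and file names are placeholders.

# List the form fields in a PDF so you know which names the FDF must use.
list_fields() {
    # $1 = PDF form
    pdftk "$1" dump_data_fields
}

# Fill a form from an FDF/XFDF file and flatten it so the values are
# no longer editable.
fill_form() {
    # $1 = PDF form, $2 = FDF data file, $3 = output PDF
    pdftk "$1" fill_form "$2" output "$3" flatten
}

# Same, but read the FDF data from stdin ("-" in place of a file name).
fill_form_stdin() {
    # $1 = PDF form, $2 = output PDF
    pdftk "$1" fill_form - output "$2" flatten
}

# Usage (not run here):
#   list_fields form.pdf
#   fill_form form.pdf data.fdf filled.pdf
#   fill_form_stdin form.pdf filled.pdf < data.fdf
```

Leave off flatten if the filled form should stay editable.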
Some comments above mention things like splitting, merging, and encryption, so if those are what someone is looking for, and comes across this thread, I thought a mention of cpdf could help them. True enough, it doesn't fill forms, which others need.
cpdf looks nice, but it's closed :(
Unfortunately these links give Error 500: Internal Server Error. Does anybody have these 2 repos? It's a real pain to get pdftk working on CentOS 7.
Was only a temporary issue as it seems: https://fedorahosted.org/fedora-infrastructure/ticket/5376
Just throwing this out there (for the sake of future-proofing your setups). If you don't absolutely have to stick with CentOS, you can switch to another Linux server operating system, such as Ubuntu, which still supports PDFTK and its dependencies.
For instance, Ubuntu 16.04 was released Apr 21, 2016 and the current PDFTK works fine on it. Here's how to install it: http://installion.co.uk/ubuntu/xenial/universe/p/pdftk/install/index.html
If you're on a cPanel server and must stick with CentOS v6, just to have cPanel, this may be out of the question. But if you're able and willing to migrate, you can set up a VPS with a provider such as DigitalOcean, Vultr, or Linode, and use a control panel such as ServerPilot or Laravel Forge to help you manage your server.
From my point of view, recommending (or trying to push) a random Linux distribution that still ships pdftk is a very bad idea. Pdftk relies on GCJ, which has been in deep maintenance mode since 2013; see also: https://gcc.gnu.org/ml/gcc/2013-11/msg00153.html
Community,
The real issue is that gcc-java, libgcj, and libgcj-devel are essentially EOL’d, dead, buried, over, done, baked, put out to pasture.
Not to mention with the mess Oracle has made of Java (EE mainly) I doubt that the libgcj will ever come out of hibernation mode.
Unless the folks who have developed PDFtk rewrite/rethink their tool to work on modern EL/OS without resorting to using outdated libraries, unsupported distros, and MIA repos of said libraries… I’ll be using a different tool. Mostly PDFShaver and pdftailor do what I need for document cloud.
I’m not an Ubuntu user per se. IIRC one of the main reasons for removing GCJ support was that vulnerabilities weren’t being patched. Perusing Launchpad, none of the library versions are tracking anything upstream. Less and less software uses these libraries (Tomcat for example hasn’t used it since version 7).
I think it’s just time to move along.
Cheers!
Eric
I was able to get pdftk working on CentOS 7 by using these two repos.
Does anybody know what the implications of installing this repo are in terms of the dead dependencies that it presumably brings with it? Is it easy enough to uninstall it and its deps?
The latest version of pdftk has an issue where it won't accept data from stdin when merging forms, so I'm happy to see that "Robert" has included a 1.45 build!
If Robert's repo should disappear, is there a way I can store it locally?
I do not have any plans to let my PDFtk-related repositories die. In case Fedora infrastructure ends the COPR service, this repository will definitely come up somewhere else (unless there are legal reasons against it).
All packages in the repository are made to hopefully create no overlap or conflict with any other package and to hopefully not break any other dependency etc. In theory, no other package should depend on the packages provided in my repositories, thus these few packages can be easily uninstalled. No guarantee for anything though ;-)
In case you see any need to mirror my repositories, you could mirror the relevant subdirectories of
locally. Finally, you need to create your own *.repo files for yum or dnf.
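As a rough illustration of what a local mirror plus a hand-written *.repo file could look like: everything named below (the function names, the repo id robert-pdftk, the mirror directory, the -local suffix) is an assumption for the sketch and not taken from the repository itself. The reposync options are the yum-utils spelling; the dnf plugin spells some of them differently.

```shell
#!/bin/sh
# Sketch only. Function names, repo id, mirror directory and .repo file
# name are assumptions, not taken from the repository itself.

# Download all packages of an already-configured repo into a local directory
# and generate repository metadata for it (needs yum-utils and createrepo).
mirror_repo() {
    # $1 = repo id as shown by "yum repolist", $2 = local mirror directory
    reposync --repoid="$1" --download_path="$2" &&
    createrepo "$2/$1"
}

# Write a .repo file pointing yum/dnf at the local mirror.
# gpgcheck=0 keeps the sketch short; for real use you would want
# gpgcheck=1 together with a gpgkey= line.
write_repo_file() {
    # $1 = repo id, $2 = local mirror directory, $3 = .repo file to write
    cat > "$3" <<EOF
[$1-local]
name=Local mirror of $1
baseurl=file://$2/$1
enabled=1
gpgcheck=0
EOF
}

# Usage (not run here, needs root):
#   mirror_repo robert-pdftk /srv/mirror
#   write_repo_file robert-pdftk /srv/mirror /etc/yum.repos.d/robert-pdftk-local.repo
```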
I am not sure whether it is clever to hijack this docsplit issue, so if somebody would like to follow up on PDFtk on RHEL or CentOS, please send me a message or e-mail directly.
@robert-scheck Robert, I have the same issue: a CentOS 6 server running fine with some self-made scripts calling PDFtk. Now I'm building a new CentOS 7 server for a quite similar purpose and I'm stuck. Could you please help me? I have no idea how to mirror your directories and create a repo file (I have always installed via yum). I'm pretty new to Linux. Thank you.
Folks, if you are looking for help related to my repository, please send me an e-mail rather than adding yet another comment to this issue – please! While I still do not see any need to mirror my repository (and if you don't know how to mirror a repository yourself, you likely shouldn't mirror it, but simply use it), read e.g. http://yum.baseurl.org/wiki/RepoCreate for the basics.
@riker1, We just ran into the same issue. You can certainly use the libgcj repository along with the package @robert-scheck provides. It turns out that libgcj.so.10 from el6 is compatible with el7's shared library bindings for PDFtk. We built some RPMs that include the library from CentOS6, so if you would like a 1-line install then see here depending on your architecture: https://www.globallinuxsecurity.pro/pdftk-works-on-centos-7/
@jamieburchell, Since we packaged the official el6 library, it should be compatible for some time to come.
@bhushangahire, Would you please test and see if this one works with form fill?
@vielhuber, You might try to copy libgcj.so.10* into Fedora 22 and see if it works. I'm not sure if our package is Fedora 22 compatible or not, but it will certainly work with Fedora 19.
Eric Wheeler
@ewheelerinc can you provide spec file?
@ewheelerinc Both RPM files on that page are 404, also I don't feel comfortable installing server software from an alternate source than its creator/distributor.
@jsosic, @bravadomizzou, here is the spec. Also, we fixed the 404.
Really all we are doing is repacking libgcj.so.10* which we pulled out of CentOS 6 libgcj-4.4.7-17.el6. PDFtk was downloaded as an RPM from their site unmodified except that we converted it to a tar and added libgcj. You may need to edit the spec to make it build on your system, but it works in our build environment: https://www.linuxglobal.com/static/blog/pdftk.spec
@ewheelerinc thank you very much! :1st_place_medal:
@coltcox Thank you so much your solution worked for me and saved my day.
I was able to get pdftk working on CentOS 7 by using these two repos.
- https://copr.fedorainfracloud.org/coprs/robert/gcj/
- https://copr.fedorainfracloud.org/coprs/robert/pdftk/
These commands will get you fully up and running.
wget https://copr.fedorainfracloud.org/coprs/robert/gcj/repo/epel-7/robert-gcj-epel-7.repo -P /etc/yum.repos.d
wget https://copr.fedorainfracloud.org/coprs/robert/pdftk/repo/epel-7/robert-pdftk-epel-7.repo -P /etc/yum.repos.d
yum install pdftk
Unfortunately I am getting below issue:
Failed to set locale, defaulting to C
Loaded plugins: priorities, update-motd, upgrade-helper
Resolving Dependencies
--> Running transaction check
---> Package pdftk.x86_64 0:2.02-1.el7 will be installed
--> Processing Dependency: libgcj.so.14()(64bit) for package: pdftk-2.02-1.el7.x86_64
Package libgcj-4.8.5-4.el7.x86_64 is obsoleted by libgcc72-7.2.1-2.59.amzn1.x86_64 which is already installed
--> Finished Dependency Resolution
Error: Package: pdftk-2.02-1.el7.x86_64 (copr:copr.fedorainfracloud.org:robert:pdftk)
    Requires: libgcj.so.14()(64bit)
    Available: libgcj-4.8.5-4.el7.x86_64 (copr:copr.fedorainfracloud.org:robert:gcj)
        libgcj.so.14()(64bit)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
@ewheelerinc thank you so much sir! :D
As this issue is still open 5 years after my initial post, I would like to add that my COPR pdftk repository for RHEL/CentOS 7 has been deprecated in October 2021 in favor of the new pdftk-java port (being GCJ-free) that is available for RHEL/CentOS 7/8 via EPEL using yum install epel-release (if not already done before) and finally yum install pdftk-java. Existing users of my COPR pdftk repository are getting auto-migrated to pdftk-java during the next run of yum update.
As I have your COPR repo and pdftk running in production (CentOS 7) are there any incompatibilities or missing features I should be aware of, or is this a like-for-like replacement?
I was recently looking for pdftk support on CentOS 8, noted that the COPR repo didn't have a CentOS 8 version (I did reach out to ask if that would be a thing) and in the absence of a solution stumbled upon this post which might help someone if the pdftk-java version is no use
As I have your COPR repo and pdftk running in production (CentOS 7) are there any incompatibilities or missing features I should be aware of, or is this a like-for-like replacement?
As pdftk-java is a port of old GCJ-based pdftk to Java, it's intended to be a drop-in ("like-for-like") replacement. However some old pdftk bugs (from the GCJ variant) have been fixed already, and pdftk-java upstream is trying to get rid of further old issues, which should make it superior.
Given that the ancient original GCJ-based pdftk gets more painful with each new distribution release (because it requires unmaintained software and additionally, the GCJ-based pdftk development can be considered dead as well), the pdftk-java package in Fedora 33+ and EPEL 7+ will silently replace existing installed pdftk RPM packages.
Just checked in on the aforementioned production server and see it has been replaced, and thank goodness everything is still working as expected. Thank you for providing your repo to cover this requirement for all these years.
FYI, the globallinuxsecurity.pro link above is old, here is the authoritative link: https://www.linuxglobal.com/pdftk-works-on-centos-7/
Currently, the best way to get a well-maintained pdftk on CentOS/RHEL/Rocky Linux 7, 8 and 9 still is:
yum install -y epel-release
yum install -y pdftk
There is absolutely no need for strange third-party howtos suggesting unverifiable/untrusted packages.
@robert-scheck, is pdftk-java a 100% compatible version with the old pdftk/GCJ version?
(I understand where you are coming from about repo trust, I'm the same way. But FYI, linuxglobal.com is where I work, we released the package, and they use it in production. It's just a re-build of the one from el6 with appropriate el6 deps. At the time I wrote the article, pdftk-java wasn't available for el7.)
@robert-scheck, is pdftk-java a 100% compatible version with the old pdftk/GCJ version?
In all setups in which I was involved, pdftk-java was usable as a drop-in replacement so far. Aside from this, many CentOS setups out there have EPEL enabled, thus they have already been auto-migrated from pdftk to pdftk-java over the last nearly two years – and I actually received zero bug reports. Citing from pdftk-java upstream:
The current goals are to keep functionality as compatible with the original as it is reasonable, to fix any issues present in the original (correctness takes precedence over compatibility, see the differences), and to clean up the code. New functionality may be added, but it is not a priority.
While this still leaves some risks for bugs, pdftk-java is well maintained by an active upstream – and also covers modern non-x86 cloud systems (such as ARM64), so I personally always would give pdftk-java a try.
Building PDFtk on RHEL 7 currently isn't possible due to upstream (Fedora) dropping support for libgcj.
I emailed the authors of PDFtk and they said they're working on it.
I wrote that in August of 2014 and now it's nearly 2015.
There hasn't been any development on libgcj since 2009, and reimplementing that library would most likely be a heavy lift. I'm guessing that Oracle wouldn't be too friendly either, since they hold all the Java patents.
Also the licensing for PDFtk's other component, iText, has changed from GPL 2 to GPL 3. This might also affect redistribution?
More reading on PDFtk's death (Fedora Discussion List)
Discussion on CentOS forum on gcc-java and libgcj-devel missing (needed to compile pdftk)
There’s someone who mentions possibly an alternative PDF toolkit at the bottom of the thread..
What does the community think?