koppor / jabref

Collection of simple issues for JabRef. Please submit PRs to https://github.com/jabRef/jabref/.
MIT License
8 stars 13 forks

Cleanup functionality for PDFs: Remove watermark #524

Open koppor opened 2 years ago

koppor commented 2 years ago

JabRef has cleanup functionality for BibTeX entries. This issue asks for cleaning up PDFs, too. Some of this has already been done with the XMP features.

It would be nice if JabRef could remove a watermark with a single click.

There are solutions outside of JabRef (see https://superuser.com/a/536644/138868 and https://github.com/agarden/remove-pdf-watermark/blob/master/removewatermark). However, I would like to have everything in a single tool.

koppor commented 2 years ago

JabRef could find out which text appears on every page and remove it.
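
A minimal sketch of the detection half of this idea, in plain Java. It assumes the text of each page has already been extracted into a list of lines (for example with PDFBox's `PDFTextStripper`); the class `RepeatedLineFinder` and its method are hypothetical names, not existing JabRef code. It only finds candidate lines: running headers and footers repeat on every page too, and a watermark stored as an image or annotation would not show up as text at all.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of JabRef.
class RepeatedLineFinder {

    /**
     * Returns the trimmed, non-blank lines that occur on every page.
     * Each inner list holds the text lines of one page.
     */
    static Set<String> findLinesOnEveryPage(List<List<String>> pages) {
        Set<String> common = new LinkedHashSet<>();
        // With fewer than two pages nothing can be called "repeated".
        if (pages.size() < 2) {
            return common;
        }
        // Start with the non-blank lines of the first page ...
        for (String line : pages.get(0)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                common.add(trimmed);
            }
        }
        // ... and keep only those that also occur on each later page.
        for (List<String> page : pages.subList(1, pages.size())) {
            Set<String> onThisPage = new LinkedHashSet<>();
            for (String line : page) {
                onThisPage.add(line.trim());
            }
            common.retainAll(onThisPage);
        }
        return common;
    }

    public static void main(String[] args) {
        // Made-up page contents; "Downloaded from example.org" plays the watermark.
        List<List<String>> pages = List.of(
                List.of("Downloaded from example.org", "1 Introduction", "Some text"),
                List.of("More text", "Downloaded from example.org"),
                List.of("Downloaded from example.org", "References"));
        System.out.println(findLinesOnEveryPage(pages)); // prints [Downloaded from example.org]
    }
}
```

Removing the detected text from the PDF itself is a separate step that has to edit the page content streams and is not covered by this sketch.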

dkokkotas commented 1 year ago

Hello @koppor, may I be assigned to this issue?

dkokkotas commented 1 year ago

Both suggested solutions are applicable. Additionally, after a brief search, the open-source iText 7 library seems suitable for this cleanup activity. I'm wondering what kind of tests we should consider. Do we just check that a valid PDF is produced? Or do we need to ensure that the watermark object is removed from the internal structure (raw data)? Thank you.

koppor commented 1 year ago

Please first try to use Apache PDFBox, which is already part of JabRef. Adding a new dependency will break jlink (again). Moreover, we cannot use AGPL software.