Closed programmarchy closed 4 months ago
I'll have a look at it as soon as I have a moment. Thanks very much for driving this here.
@programmarchy Everything seems alright. I'll be able to merge. I'm just not sure about the change from PDFHeader.forVersion(1, 7);
to this.context.header
. I don't understand either the original code nor the new one to be honest :sweat_smile:
@Sharcoux I know that the PDF version dictates which encryption algorithm (represented by V
) is used.
But you ask a good question worth investigating. There may be a reason that version 1.7 was hardcoded. For example, maybe pdf-lib can read version 1.4 PDFs, but only supports writing 1.7-compatible PDFs?
If we determine that pdf-lib will only support writing 1.7 PDFs, then we could eliminate the code that handles encryption for other PDF versions.
Do you have time to run those investigations? Otherwise, we should take the safest approach, which is probably to keep the original code. If that raises problems, we can solve them as they go.
@Sharcoux Agree on taking the safer approach. I'm good only supporting 1.7. So I'll simplify PDFSecurity to only support that version.
Also, this PR has a bug where encrypted PDFs cannot be opened Adobe Acrobat. It's stricter than other readers, so I need to fix that also.
@Sharcoux Agree on taking the safer approach. I'm good only supporting 1.7. So I'll simplify PDFSecurity to only support that version.
Also, this PR has a bug where encrypted PDFs cannot be opened Adobe Acrobat. It's stricter than other readers, so I need to fix that also.
Maybe you should keep the support for other versions and only add a comment above the part: const header = PDFHeader.forVersion(1, 7);
mentioning that investigations are required to know why this is there. This way, we don't lose your work.
@programmarchy
But you ask a good question worth investigating. There may be a reason that version 1.7 was hardcoded. For example, maybe pdf-lib can read version 1.4 PDFs, but only supports writing 1.7-compatible PDFs?
If we determine that pdf-lib will only support writing 1.7 PDFs, then we could eliminate the code that handles encryption for other PDF versions.
pdf-lib does not seem to assume anything other than version 1.7. I can't find any version checks for added functionality or deprecated functionality by PDF version. Maybe that's why version 1.7 is hardcoded because of the simplification.
Add:
I found the comment mentioned by @Hopding
https://github.com/Hopding/pdf-lib/issues/916#issuecomment-925402016
@programmarchy
But you ask a good question worth investigating. There may be a reason that version 1.7 was hardcoded. For example, maybe pdf-lib can read version 1.4 PDFs, but only supports writing 1.7-compatible PDFs? If we determine that pdf-lib will only support writing 1.7 PDFs, then we could eliminate the code that handles encryption for other PDF versions.
pdf-lib does not seem to assume anything other than version 1.7. I can't find any version checks for added functionality or deprecated functionality by PDF version. Maybe that's why version 1.7 is hardcoded because of the simplification.
Add:
I found the comment mentioned by @Hopding
Nice, but then, this means that we should probably keep things the way they were. Can you revert that part and add a comment near the hard codded header linking to Hopding message or this conversation?
I reverted the version header part.
However, there's still the outstanding issue with encrypted PDFs not able to be read by Adobe Acrobat. Still looking into that. Acrobat doesn't seem to output any useful information for troubleshooting. Going to compare with other PDF libs.
Ah, my bad, I saw your comment too late. I merged but I won't release waiting for your fix.
@programmarchy Will you still try to make the pdfs be opened with Adobe Acrobet?
@Sharcoux It's on my todo list but I don't have a timeline of when I can finish it. Partly because of my schedule, and partly because I'm a bit stuck, frankly.
I know that qpdf is able to encrypt the same PDFs I tested with and they open successfully. Their encryption code is here: https://github.com/qpdf/qpdf/blob/main/libqpdf/QPDF_encryption.cc but I haven't finished comparing the implementations.
If we can attract others to help solve the problem I think that'd be ideal.
@PhakornKiong Hope it's okay that I am reaching out to you here. We're working on merging in your PDF encryption implementation into this fork of pdf-lib. The implementation works great for most PDF viewers, however, Adobe Acrobat cannot open the file.
Do you happen to have any insight?
@programmarchy
Try adding types that do not compress in the object stream.
PDFStreamWriter.ts
const shouldNotCompress =
ref === this.context.trailerInfo.Encrypt ||
object instanceof PDFStream ||
object instanceof PDFInvalidObject ||
object instanceof PDFCatalog || //Add
object instanceof PDFPageLeaf || //Add
object instanceof PDFPageTree || //Add
ref.generationNumber !== 0;
and Import.
Because,In PDF Specifications
In linearized files (see Annex F, "Linearized PDF" and Annex G, "Linearized PDF access strategies"), the document catalog dictionary, the linearization dictionary, and page objects shall not appear in an object stream.
Only a simple test, but the error seems to have disappeared.
Ah?
https://github.com/cantoo-scribe/pdf-lib/pull/54#issuecomment-2186396859
Perhaps this should be a separate issue from encryption? There should be an impact on Object Stream output regardless of encryption.
@RippleRurigaki Yep, that fixed the issue with Acrobat! Nice find... I'll open a new PR.
Released in version 2.2.0
Based on PhakornKiong/pdf-lib#dev/DocEncrypt
What?
This PR adds the ability to encrypt a PDF document. It's based on PhakornKiong's PR https://github.com/Hopding/pdf-lib/pull/1015, which sadly did not get merged. I've updated his code to be compatible with the latest changes to pdf-lib. Tried to simplify where possible, minimizing changes to
PDFDocument
andPDFContext
.Why?
I needed PDF encryption on a project. This repo has already merged decryption. Let's have both!
How?
Uses CryptoJS to encrypt streams.
Testing?
I've tested it for my personal use case. I have not implemented unit tests yet. Want some assurance it will get merged before working on unit tests.
New Dependencies?
CryptoJS.
Screenshots
N/A
Suggested Reading?
Yes, also read the Encryption part of the spec.
Anything Else?
Checklist