gettalong / hexapdf

Versatile PDF creation and manipulation for Ruby
https://hexapdf.gettalong.org

Adobe Acrobat Chrome extension doesn't allow read-only opening of HexaPDF encrypted PDFs #243

Closed johnobrien-singletrack closed 1 year ago

johnobrien-singletrack commented 1 year ago

Hi,

We are migrating a system that applies encryption to PDFs to use HexaPDF instead of a previous dependency and we have an issue with users who have the Adobe Acrobat Chrome Extension (https://chrome.google.com/webstore/detail/adobe-acrobat-pdf-edit-co/efaidnbmnnnibpcajpcglclefindmkaj).

Please find details and reproduction steps below. Any help would be appreciated. It's possible that we are misusing HexaPDF in some way, or there may be an issue with HexaPDF. In either case, any guidance would be great.

Please let me know if you have any questions or need clarification on anything here.

Thanks for your time.

Expected behaviour

When a user opens a PDF encrypted by HexaPDF with an owner password - but no user password - in Chrome, the Adobe Acrobat Chrome extension should open the document without prompting for a password.

Actual behaviour

When a user opens such a PDF, they are prompted for a password. In addition, the owner password is not accepted. Cancelling the password prompt opens the document in Chrome itself (i.e. without using the Adobe Acrobat Chrome extension).

Reproduction steps

We have a simple example PDF ("unencrypted.pdf") which is not encrypted and opens in the extension without a prompt.

We ran the following

require 'hexapdf'

doc = HexaPDF::Document.open('unencrypted.pdf')
# Set only an owner password; the user password is left empty.
doc.encrypt(name: :Standard, owner_password: 'password')
doc.write('encrypted_with_hexapdf.pdf')

to generate an encrypted PDF with an owner password, using the default encryption algorithm (AES-128).

Opening this document in the Chrome extension (only possible when the file is online, e.g. having emailed it to yourself) prompts for a password to read the document.

We also created a PDF that was encrypted using Adobe Acrobat (paid, and for Mac, if relevant) for comparison. We made sure to specify the same algorithm, so it should also be AES-128. This PDF opens in the Chrome extension just fine.

Attachments

unencrypted.pdf encrypted_with_hexapdf.pdf encrypted_with_adobe.pdf

gettalong commented 1 year ago

Thanks for reporting!

I have looked at the encrypted PDFs and they indeed use the same procedure for encryption. The Adobe encrypted version is additionally linearized and uses cross-reference and object streams while the HexaPDF version only does the encryption.

The stored encryption object is also basically the same, with the only difference being in the permissions (apart from the two entries that have to be different). But that should not have any influence with respect to showing the password prompt.

Before I investigate further: Does the PDF open fine when opened through Adobe Acrobat?

gettalong commented 1 year ago

Could you please also test the Chrome extension with the following files? minimal.pdf is a minimal, unencrypted PDF file; the others are encrypted with various encryption schemes using HexaPDF and the owner password 'password':

minimal.pdf minimal_aes_128.pdf minimal_aes_256.pdf minimal_arc4_48.pdf

johnobrien-singletrack commented 1 year ago

Hi, thanks for the very quick investigation.

All of my PDFs work fine when opened in Adobe Acrobat Reader itself (i.e. even the HexaPDF encrypted document can be opened and read without a password being requested).

For your PDFs, minimal_aes_128.pdf had the unwanted behaviour in the Chrome extension: it asked for a password. What's more, when I entered the password 'password', I was told that it was invalid. The three other PDFs all behaved as expected, however, with no password prompt.

gettalong commented 1 year ago

Thanks for the information!

So it seems that the Chrome extension is somehow unhappy with 128-bit AES. Since the file works in Adobe Reader itself, my guess is that this is some quirk on the Chrome extension/Adobe web utils side.

As I wrote before, with respect to the encryption, there is not much difference between the HexaPDF-generated file and the Adobe-generated file.

To potentially narrow it down, the following script creates 510 PDF files from the minimal.pdf:

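require 'hexapdf'

# All permission names known to the standard security handler.
perms = HexaPDF::Encryption::StandardSecurityHandler::Permissions::SYMBOL_TO_PERMISSION.keys

# Once without and once with an owner password.
[nil, 'password'].each do |owner_password|
  # Every non-empty combination of permissions.
  1.upto(perms.size) do |n|
    perms.combination(n) do |comb_perms|
      # ARGV[0] is the path to the input file, e.g. minimal.pdf.
      HexaPDF::Document.open(ARGV[0]) do |doc|
        doc.encrypt(permissions: comb_perms, owner_password: owner_password)
        # Name the output after the absolute value of the /P permissions entry.
        doc.write("minimal_#{owner_password ? 'owner_' : ''}#{doc.trailer[:Encrypt][:P].abs}.pdf")
      end
    end
  end
end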

It creates one PDF for each possible combination of permissions, and combines that with either an owner password set or without an owner password.
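The figure of 510 files can be checked with a little arithmetic. The sketch below assumes there are 8 permission flags (a number inferred from the 510 figure, not stated in this thread) and uses placeholder names instead of the real HexaPDF symbols, so it runs without the gem.

```ruby
# Placeholder permission names; only their count (assumed to be 8) matters here.
perms = %i[p1 p2 p3 p4 p5 p6 p7 p8]

# Number of non-empty combinations: sum over n of C(8, n) = 2**8 - 1.
non_empty_subsets = (1..perms.size).sum { |n| perms.combination(n).count }

# Each combination is written once without and once with an owner password.
owner_variants = [nil, 'password'].size
total_files = non_empty_subsets * owner_variants

puts non_empty_subsets
puts total_files
```

Under that assumption there are 255 permission combinations, and doubling them for the two owner-password variants gives the 510 files mentioned above.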

Testing those on a desktop app would just involve selecting a bunch of them in the file manager and opening them all at once. The ones asking for a password would need further investigation. I'm not sure how easily this can be done with the chrome extension.

You could start with minimal_owner_1052.pdf which has the same permissions set as the encrypted_with_adobe.pdf file.

johnobrien-singletrack commented 1 year ago

Hi, thanks for this.

I've generated the 510 files. It's pretty slow to check files with the Chrome extension since I have to do it individually and by making the file online (by emailing; GDrive doesn't seem to play nicely) so I just did a few.

I tested both minimal_owner_1052.pdf and minimal_1052.pdf and they both prompted for a password. This seems especially strange for the non-owner passworded file.

I also tested four other files chosen randomly (specifically minimal_2824.pdf, minimal_1800.pdf, minimal_owner_1548.pdf and minimal_owner_788.pdf). All of these had the same behaviour: the Chrome extension asked for a password.

All six files opened in Adobe Acrobat Reader without issue; the file was readable without a password.

Happy to check more of these if you think it would be relevant, or look into other suggestions you have.

gettalong commented 1 year ago

I'm a bit lost here, to be honest. So it doesn't seem related to the encryption permissions but appears to be specific to when AES 128bit encryption is used.

Could you test the following two files encrypted with different CLI utils: minimal-cpdf.pdf minimal-qpdf.pdf

And for good measure also this one encrypted with HexaPDF: minimal-hexapdf.pdf

johnobrien-singletrack commented 1 year ago

Hi,

cpdf and qpdf both open as readable in the Chrome extension. hexapdf requests a password as before.

In Adobe Acrobat, all three behave correctly.

My instinct is that the Chrome extension is failing to decrypt the HexaPDF file, and mistakenly thinking it needs a user password as a result. This is purely speculation, and may not be useful to you.

Thanks for the continued investigation.

gettalong commented 1 year ago

Hi @johnobrien-singletrack,

Thanks for testing. I have created another set of four files (a baseline plus three variations) for you to test please:

minimal-hexapdf.pdf minimal-hexapdf-V1.pdf minimal-hexapdf-V2.pdf minimal-hexapdf-V3.pdf

The first one is the same as the one before but with permissions and owner password adjusted to match the qpdf one. The next two are variations with each having a small change based on what I found by comparing the qpdf and HexaPDF versions. And the last one incorporates both variations.

I think you are correct with your speculation, however, if testing the above brings no insight I'm not sure what to do next...

johnobrien-singletrack commented 1 year ago

Hi,

It seems like we might be making progress:

V1 and V3 both open as readable in the extension.
V2 and the first file both request a password in the extension.

All four files open fine in Adobe Acrobat.

I read through the issue you mentioned this one in this morning, but I'll confess that a lot of it went over my head. If there's something you'd like me to investigate, let me know though.

Hope this helps

gettalong commented 1 year ago

Thanks for testing and working with me on this!

If V1 works, it is the best result as this is a minimal change with minimal impact in terms of performance and resulting PDF file size.

The change adds the optional /Length entry to the encryption dictionary. Here is the description for this entry from the PDF 2.0 specification:

(Optional; PDF 1.4; only if V is 2 or 3; deprecated in PDF 2.0) The length of the file encryption key, in bits. The value shall be a multiple of 8, in the range 40 to 128. Default value: 40.

So this key should only be used if the /V key is 2 or 3. In our test files, /V is 4, so this entry doesn't need to be set and should actually be ignored as the key length is always 128bit. This means that this is clearly a bug in the Acrobat Chrome extension!
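The rule described above can be written down as a small function. This is only a sketch of how a conforming reader might pick the key length: `effective_key_length_bits` is a hypothetical helper, not part of HexaPDF, and the /V 4 branch follows the statement in this thread that the key is always 128 bits for these AES test files.

```ruby
# Hypothetical helper (not a HexaPDF API): key length in bits that a reader
# should use, given the /V entry and the optional /Length entry.
def effective_key_length_bits(v, length = nil)
  case v
  when 1
    40 # /V 1 always uses a 40-bit key
  when 2, 3
    bits = length || 40 # /Length is honoured here; default value is 40
    unless (40..128).cover?(bits) && (bits % 8).zero?
      raise ArgumentError, "invalid /Length: #{bits}"
    end
    bits
  when 4
    128 # per the thread: /Length is ignored for the AES-128 test files
  when 5
    256 # /V 5 is 256-bit AES
  else
    raise ArgumentError, "unsupported /V: #{v}"
  end
end

puts effective_key_length_bits(4)      # /Length absent
puts effective_key_length_bits(4, 128) # /Length present, same result
puts effective_key_length_bits(2)      # falls back to the default
```

With /V 4 the result is the same whether or not /Length is present, which is why the missing entry should make no difference to a reader that follows the specification.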

However, since this change is minimal and should not be problematic to conforming readers, I will add a fix to HexaPDF.

The issue I referenced concerns the V2 variation: there are two ways to serialize strings, and HexaPDF uses the shorter form to save space. However, the qpdf and cpdf files use the other form, so this could also have been the problem.
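For readers unfamiliar with the two string forms: PDF allows a string to be written either as a literal string in parentheses or as a hexadecimal string in angle brackets. The sketch below uses two hypothetical helpers (not HexaPDF's actual serializer) to show both forms for the same bytes; which tool emits which form is only as described in this thread.

```ruby
# Hypothetical helpers illustrating the two PDF string syntaxes.

# Literal form: raw bytes in parentheses, escaping backslash, parentheses
# and carriage return.
def pdf_literal_string(bytes)
  escaped = bytes.b.gsub(/[\\()\r]/) do |ch|
    ch == "\r" ? '\\r' : "\\#{ch}"
  end
  "(#{escaped})"
end

# Hexadecimal form: two hex digits per byte in angle brackets.
def pdf_hex_string(bytes)
  "<#{bytes.b.unpack1('H*').upcase}>"
end

sample = "a(b)"
puts pdf_literal_string(sample)
puts pdf_hex_string(sample)
```

Both forms decode to the same bytes. The literal form needs roughly one byte per input byte, the hexadecimal form two, which is the space saving mentioned above.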

johnobrien-singletrack commented 1 year ago

Great, glad we have gotten to the bottom of it.

What are next steps? For now, we are using an alternative algorithm (ARC4; would you advise something else for any reason?) but should I expect an update to HexaPDF that adds this Length entry in the near future?

Do you think it's worth raising as a bug report for the Chrome extension itself? I haven't gone so far as to investigate if it's also an issue in extensions for other browsers, etc.

Thanks for your help

gettalong commented 1 year ago

I would use AES 256bit in the meantime and not ARC4 since that is deprecated. However, there will be a release with the change later today.

> Do you think it's worth raising as a bug report for the Chrome extension itself? I haven't gone so far as to investigate if it's also an issue in extensions for other browsers, etc.

Yes, I think it would help if the bug got fixed on their side so that the Chrome extension works correctly.

gettalong commented 1 year ago

I just released 0.32.1 with the fix.