mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Duplicate Form field should have the same property #11310

Closed bksantiago closed 3 years ago

bksantiago commented 4 years ago

Attach (recommended) or Link to PDF file here: result-no-sensitive-duplicate-field.pdf

Configuration:

Steps to reproduce the problem:

  1. Set interactiveFields to true
  2. Use the given PDF and open it via PDFJS viewer or in examples/acroforms/acroforms.html

What is the expected behavior? (add screenshot)

What went wrong? (add screenshot)

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):

bksantiago commented 4 years ago

Hi @timvandermeij, do you know which part of the code is the probable cause? I have a little bit of understanding of the PDF structure, so I'll see if I can pinpoint the cause and create a PR. I just want to have a starting point.

timvandermeij commented 4 years ago

The core annotation layer code lives in https://github.com/mozilla/pdf.js/blob/master/src/core/annotation.js, which is most likely where this needs to be changed. I would start by identifying in a PDF browser, for example https://brendandahl.github.io/pdf.js.utils/browser, what properties the annotations have and then find out what PDF.js is not parsing (correctly).

timvandermeij commented 4 years ago

I looked into this and I cannot help but feel that PDF.js (and Okular, for example) is doing the right thing. A field is read-only if the read-only field flag is set in the Ff property of the annotation. This flag is the first bit of the unsigned 32-bit integer (see https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=441&zoom=auto,-215,356). The annotation on the first page doesn't have an Ff property of its own, but inherits it from its parent annotation, since the Ff property is inheritable. The inherited value is 8388609, which is 100000000000000000000001 in binary, so the read-only flag is set (it is the lowest-order bit, the LSB, i.e. the rightmost bit).

However, the annotation on the second page, while having a parent annotation with Ff = 8388609, defines its own Ff value: 8388608, which is 100000000000000000000000 in binary, so the read-only flag is not set.
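To make the two cases above concrete, here is a minimal sketch of the lookup and the bit test. It is not the PDF.js implementation: the plain objects, `resolveFieldFlags` and `isReadOnly` are hypothetical stand-ins for the field dictionaries, and the only assumptions are the ones stated above (Ff is inheritable through Parent, and read-only is the lowest-order bit), plus the spec's default of 0 when no Ff is found.

```javascript
// Read-only is bit position 1 in the spec's 1-based numbering,
// i.e. the least significant bit of the Ff integer.
const READ_ONLY = 1 << 0;

// Hypothetical plain-object models of the dictionaries in the attached PDF.
const parentField = { Ff: 8388609 };
const page1Widget = { Parent: parentField }; // no Ff of its own
const page2Widget = { Parent: parentField, Ff: 8388608 }; // overrides Ff

// Walk up the Parent chain until a dictionary defines Ff.
// Falls back to 0, the default value for Ff.
function resolveFieldFlags(dict) {
  for (let node = dict; node; node = node.Parent) {
    if (Number.isInteger(node.Ff)) {
      return node.Ff;
    }
  }
  return 0;
}

function isReadOnly(dict) {
  return (resolveFieldFlags(dict) & READ_ONLY) !== 0;
}

console.log(isReadOnly(page1Widget)); // true: inherits 8388609
console.log(isReadOnly(page2Widget)); // false: its own 8388608 wins
```

Under these assumptions the first-page annotation is read-only and the second-page annotation is not, which matches what PDF.js renders.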

I can only imagine that the endianness is different in this PDF file, but I haven't found any resources stating that the PDF format's endianness can differ. Moreover, if the way we interpret the field flags were incorrect, we wouldn't be able to render, for example, radio buttons and multiline text fields. Since we can render those, I think our code is actually correct, also judging by how e.g. PDFBox and PDFium implement it.
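As a sanity check on the bit numbering, the sketch below lists which 1-based bit positions are set in a flags value. `setBitPositions` is a hypothetical helper written for this illustration, not a PDF.js function. It relies only on the spec's convention that position 1 is the lowest-order bit, and on the Multiline (position 13) and Radio (position 16) flags from the same specification.

```javascript
// Return the 1-based positions of all set bits in a 32-bit flags value,
// counting from the least significant bit as the spec does.
function setBitPositions(flags) {
  const positions = [];
  for (let pos = 1; pos <= 32; pos++) {
    if ((flags >>> (pos - 1)) & 1) {
      positions.push(pos);
    }
  }
  return positions;
}

console.log(setBitPositions(8388609)); // [ 1, 24 ]: read-only bit is set
console.log(setBitPositions(8388608)); // [ 24 ]: read-only bit is clear
console.log(setBitPositions(1 << 12)); // [ 13 ]: Multiline for text fields
console.log(setBitPositions(1 << 15)); // [ 16 ]: Radio for button fields
```

The two values differ only in position 1. The Multiline and Radio flags are read with the same numbering, so if the numbering were reversed those fields would presumably be misdetected as well.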

Therefore, I cannot think of any reason why Adobe Reader would behave differently in this situation...

Snuffleupagus commented 3 years ago

Having also looked at this now, I have to agree completely with the analysis done in https://github.com/mozilla/pdf.js/issues/11310#issuecomment-557826890. All in all, let's close this as WONTFIX for the time being since: