Closed seehuhn closed 1 week ago
No - the Length entry is always the length of the data (in bytes) between the stream and endstream keywords, excepting possibly for an extra EOL sequence. It applies to all the filters that are specified (since you can chain/cascade them) and encryption is no different. See Table 5.
PDF 2.0 introduced a new optional entry, DL, to represent the output length of the decoded/decrypted (defiltered) data.
However, a commonly seen extant data error is that many PDF producers get this wrong...
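As a rough illustration of the Length/DL distinction described above, here is a minimal Python sketch. The helper name build_stream_object and the sample content are hypothetical, and FlateDecode is used only as a convenient stand-in for "any filter"; the point is that Length counts the bytes actually stored between stream and endstream, while DL counts the decoded bytes.

```python
import re
import zlib


def build_stream_object(decoded: bytes) -> bytes:
    """Serialize a FlateDecode stream (hypothetical helper for illustration)."""
    encoded = zlib.compress(decoded)
    # Length is the size of the stored (filtered) bytes; DL is the decoded size.
    header = b"<< /Filter /FlateDecode /Length %d /DL %d >>\nstream\n" % (
        len(encoded),
        len(decoded),
    )
    # The EOL before endstream is not counted in Length.
    return header + encoded + b"\nendstream"


decoded = b"BT /F1 12 Tf (Hello) Tj ET\n" * 20
obj = build_stream_object(decoded)

# Recover the stored bytes and compare them with the dictionary entries.
start = obj.index(b"stream\n") + len(b"stream\n")
end = obj.rindex(b"\nendstream")
stored = obj[start:end]
length_entry = int(re.search(rb"/Length (\d+)", obj).group(1))
dl_entry = int(re.search(rb"/DL (\d+)", obj).group(1))

print("Length:", length_entry, "stored bytes:", len(stored))
print("DL:", dl_entry, "decoded bytes:", len(zlib.decompress(stored)))
```

A producer that wrote the decoded size into Length would make a reader stop at the wrong byte offset, which is the kind of error mentioned above.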
Yes, I agree that this is what is meant. But doesn't the claim that Length is "the number of bytes to be encrypted" contradict this? Maybe that statement is even the reason that some PDF producers get this wrong?
Suggestion: Why not replace the quoted sentence near the end of section 7.6.3 with the following:
The value of the Length entry in the stream dictionary shall be the length of the encrypted stream data.
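To see why the length of the cleartext and the length of the encrypted stream data can differ, here is a small Python sketch. It assumes the usual PDF conventions: RC4 is a stream cipher that preserves length, while AES uses CBC mode with a 16-byte initialization vector stored in front of the data and PKCS #5 padding that always adds between 1 and 16 bytes. The function names are made up for this example.

```python
AES_BLOCK = 16  # AES block size in bytes; the IV has the same size


def rc4_encrypted_length(cleartext_len: int) -> int:
    """RC4 is a stream cipher: the ciphertext is as long as the cleartext."""
    return cleartext_len


def aes_cbc_encrypted_length(cleartext_len: int) -> int:
    """Length of IV plus padded ciphertext, assuming padding is always added."""
    padded = (cleartext_len // AES_BLOCK + 1) * AES_BLOCK
    return AES_BLOCK + padded


for n in (0, 15, 16, 100):
    print(n, rc4_encrypted_length(n), aes_cbc_encrypted_length(n))
# 0 0 32
# 15 15 32
# 16 16 48
# 100 100 128
```

Under these assumptions, an AES-encrypted stream is always at least 17 bytes longer than its cleartext, so the two readings of Length give different values.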
- For encrypted documents, section 7.6.3 (General encryption algorithm) states near the end: "The number of bytes to be encrypted or decrypted shall be given by the Length entry in the stream dictionary."
I think this sentence makes some sense if you read it case by case:
So in both cases, Length contains the number of bytes between stream[EOL] and endstream.
Actually, though, the "shall" is indeed wrong: Length contains these numbers by the definition of stream objects, so this is no new requirement.
The Adobe Reference here said "The number of bytes to be encrypted or decrypted is given by the Length entry in the stream dictionary." And here the PDF Reference really meant to state a fact, not imply a new requirement. IMO someone overeagerly added one "shall" too many during ISO-fication.
I agree - "shall" in this case is wrong. Let's change it back....
PDF TWG agree
The PDF-2.0 spec contains the following information about the Length field in a stream dictionary:
- Length is "[t]he number of bytes from the beginning of the line following the keyword stream to the last byte just before the keyword endstream". It does not mention any special rules for files using encryption.
- For encrypted documents, section 7.6.3 (General encryption algorithm) states near the end: "The number of bytes to be encrypted or decrypted shall be given by the Length entry in the stream dictionary."
I believe that "number of bytes to be encrypted" must be the length of the cleartext. In contrast, the "number of bytes to be decrypted" could be read as the length of the ciphertext, and Table 5 also seems to require Length to be the length of the ciphertext.
What is the correct value for Length when encryption is used? It would be nice if the spec were more explicit about this.