py-pdf / pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
https://pypdf.readthedocs.io/en/latest/

PdfWriter.add_named_destination() does not maintain the name tree sort order #1927

Closed robertkearns closed 1 year ago

robertkearns commented 1 year ago

When a named destination is added via writer.add_named_destination(), the resulting (name, destination) pair is always appended to the end of the named destination list, regardless of where the name belongs in sorted order.

This can cause anything (annotations, etc.) that uses those destinations to fail in some PDF viewers, because they rely on the name list being sorted in lexical order (which the spec requires). By contrast, add_named_destination_array() and add_named_destination_object() both insert the new destination at the correct index.
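To illustrate what "insert at the correct index" means for a flat key/value list, here is a minimal sketch. It is not pypdf's implementation: insert_sorted is a hypothetical helper, the list is a plain Python list of alternating keys and values, and replacing the value of an existing key is an assumption made for the example. Keys are plain ASCII strings here, for which Python's string ordering matches byte order.

```python
from bisect import bisect_left


def insert_sorted(names, key, value):
    """Insert (key, value) into a flat [k1, v1, k2, v2, ...] list, keeping keys sorted.

    Hypothetical helper for illustration only; not part of pypdf.
    """
    keys = names[0::2]          # every even slot holds a key
    i = bisect_left(keys, key)  # index of the first key >= the new key
    if i < len(keys) and keys[i] == key:
        names[2 * i + 1] = value           # key already present: replace its value (assumption)
    else:
        names[2 * i:2 * i] = [key, value]  # splice the pair in at its sorted position
    return names


names = []
insert_sorted(names, "b", "dest-b")
insert_sorted(names, "a", "dest-a")
print(names)  # ['a', 'dest-a', 'b', 'dest-b']
```

Appending instead of splicing would leave the list as ['b', 'dest-b', 'a', 'dest-a'], which is the behaviour reported here.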

PDF Specification

To quote the part of "Table 36 – Entries in a name tree node dictionary" that covers the Names key:

The keys shall be sorted in lexical order, as described below. [...]

The Names entries in the leaf (or root) nodes shall contain the tree’s keys and their associated values, arranged in key-value pairs and shall be sorted lexically in ascending order by key. Shorter keys shall appear before longer ones beginning with the same byte sequence. Any encoding of the keys may be used as long as it is self-consistent; keys shall be compared for equality on a simple byte-by-byte basis.
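The ordering rule quoted above (byte-by-byte comparison, shorter keys before longer ones with the same prefix) happens to match how Python compares bytes objects, so it can be sketched with a plain sort. The key values below are made up for the example.

```python
# Made-up keys, encoded as bytes so the comparison is byte-by-byte.
keys = [b"chapter10", b"chapter1", b"Chapter2", b"chapter", b"appendix"]

# Python orders bytes lexicographically by byte value; a key that is a
# prefix of another key sorts before it.
ordered = sorted(keys)
print(ordered)
# [b'Chapter2', b'appendix', b'chapter', b'chapter1', b'chapter10']
```

Note that uppercase "C" (0x43) sorts before lowercase "a" (0x61), so this is not the same as a case-insensitive or locale-aware sort.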

Sample Code

Creating a writer and adding 2 named destinations should be sufficient.

from pypdf import PdfWriter

writer = PdfWriter()
writer.add_blank_page(200, 200)
writer.add_named_destination('b', 0)
writer.add_named_destination('a', 0)
print(writer.get_named_dest_root())  # 'b' will come before 'a'
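A quick way to check the ordering programmatically, rather than by reading the printed output, might look like the sketch below. keys_are_sorted is a hypothetical helper, and it assumes the names list is a flat sequence alternating key, value; the sample lists are stand-ins, not real pypdf output.

```python
def keys_are_sorted(names):
    """Return True if the keys of a flat [k1, v1, k2, v2, ...] list are strictly ascending.

    Hypothetical helper for illustration only; not part of pypdf.
    """
    keys = list(names[0::2])  # keys sit at the even indices
    return all(a < b for a, b in zip(keys, keys[1:]))


# Stand-in data shaped like the reported result ('b' added before 'a').
print(keys_are_sorted(["b", "dest-b", "a", "dest-a"]))  # False
print(keys_are_sorted(["a", "dest-a", "b", "dest-b"]))  # True
```

If get_named_dest_root() does return such a flat list with string-like keys, passing its result to this check should return False for the sample above on an affected version.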

I think the best proof of this, however, is reading the source code for PdfWriter.add_named_destination().

Affected Versions

This issue occurs in pypdf==3.11.1 and likely in many prior versions.

pubpub-zz commented 1 year ago

I agree this is a bug. Please create a PR and become an active contributor 🙂

MartinThoma commented 1 year ago

@pubpub-zz I love to see that you encourage people to get active as well :heart: It's awesome that you're contributing that much, but in the long run we will need other contributors to ensure pypdf keeps improving :-) Well done :clap: