jpakkane / capypdf

A fully color managed PDF generation library
Apache License 2.0

Page Labels #29

Open doctormo opened 3 weeks ago

doctormo commented 3 weeks ago

I've been thinking about the best way to add PageLabels (PDF 32000-1:2008 Table 28, row 5, "PageLabels"). PageLayout too, but that's much less important.

Should capypdf collect page names into an optional string attached to page_props and write out the PageLabels dictionary in PdfWriter::write_pages_root?

I don't want to start adding until I know which direction you'd like to take it.

jpakkane commented 3 weeks ago

PageLayout is easy, that should go in the document metadata.

PageLabels is trickier. I'm currently traveling, so won't be able to go through this deeply, but it seems like the thing to do would be to have a method like

capy_gen_add_page_label(CapyPDF_Generator *gen,
                        int32_t num,
                        StyleEnumType *style,
                        const char *prefix,
                        int32_t *numeric_portion);

And then storing those in a vector and writing them out when needed.
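A minimal sketch of the "store in a vector, write out later" idea, assuming C++17. Every name here (PageLabel, PageLabelStyle, serialize_page_labels, pdf_literal) is hypothetical and not part of capypdf; the optional fields stand in for the nullable pointer arguments in the proposed signature. The dictionary keys /S, /P and /St and the style names D, R, r, A, a come from the PDF page label dictionary. The sketch only handles ASCII prefixes; other text would need a proper PDF text string encoding.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not capypdf's actual API.
enum class PageLabelStyle { Decimal, UpperRoman, LowerRoman, UpperLetter, LowerLetter };

struct PageLabel {
    int32_t start_page;                  // zero-based index of the first page in the range
    std::optional<PageLabelStyle> style; // /S, omitted when the label has no numeric portion
    std::optional<std::string> prefix;   // /P
    std::optional<int32_t> first_number; // /St, treated as 1 by readers when omitted
};

// Map the enum to the style names used in a page label dictionary.
static const char *style_name(PageLabelStyle s) {
    switch(s) {
    case PageLabelStyle::Decimal: return "D";
    case PageLabelStyle::UpperRoman: return "R";
    case PageLabelStyle::LowerRoman: return "r";
    case PageLabelStyle::UpperLetter: return "A";
    case PageLabelStyle::LowerLetter: return "a";
    }
    return "D";
}

// Write an ASCII string as a PDF literal string, escaping backslashes and parentheses.
static std::string pdf_literal(const std::string &s) {
    std::string out = "(";
    for(char c : s) {
        if(c == '\\' || c == '(' || c == ')') {
            out += '\\';
        }
        out += c;
    }
    out += ')';
    return out;
}

// Serialize the stored labels as a /PageLabels entry for the document catalog.
// Number tree keys must be in ascending order, so a sorted copy is written.
std::string serialize_page_labels(std::vector<PageLabel> labels) {
    std::stable_sort(labels.begin(), labels.end(),
                     [](const PageLabel &a, const PageLabel &b) { return a.start_page < b.start_page; });
    std::string out = "/PageLabels <<\n  /Nums [\n";
    for(const auto &l : labels) {
        out += "    " + std::to_string(l.start_page) + " <<";
        if(l.style) {
            out += " /S /";
            out += style_name(*l.style);
        }
        if(l.prefix) {
            out += " /P " + pdf_literal(*l.prefix);
        }
        if(l.first_number) {
            out += " /St " + std::to_string(*l.first_number);
        }
        out += " >>\n";
    }
    out += "  ]\n>>\n";
    return out;
}
```

A caller would push one PageLabel per range (for example lowercase Roman numerals starting at page index 0 and decimal numbers with a prefix starting at page index 4) and emit the resulting string inside the catalog dictionary. The PDF specification requires the tree to contain an entry for page index 0, which this sketch does not enforce.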

But I'm a bit confused about your description. PageLabels is a property of a range of pages, but your description seems to be about names given to individual pages.

(I had to use the non-ISO PDF 1.7 spec for this (table 8.10 in section 8.3.1), as I don't have the 32000 one with me ATM. If that one has an entry for individual page names, then ignore everything I have said above.)

doctormo commented 3 weeks ago

It's possible I'm confused about what page labels are. My intention is to give pages names, so they appear correctly in PDF viewers and so opening those PDFs back in Inkscape would preserve the page names.

jpakkane commented 3 weeks ago

I don't think PDF has that. Maybe you mean Outlines? That is how LibreOffice Draw writes out named pages at least.

doctormo commented 3 weeks ago

Pretty sure it has that.

[Screenshot: screenshot-2024-10-19-22-35-48]

The labels on the left here (backwards because of a Gtk4 bug) are produced by Cairo when you set a page label.

dnoces ehT egaP.png.pdf

jpakkane commented 3 weeks ago

Apparently it does use page labels after all:

1 0 obj
<</Type /Catalog /Pages 3 0 R
/PageLabels<<
/Nums [0 <<
/P (tsrif eht egaP)
>> 1 <<
/P (dnoces ehT egaP)
>>]
>>
/Metadata 11 0 R
>>
endobj

So then that should be implemented with the function as described above.
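One way to map per-page names onto label ranges, as a hedged C++ sketch. A label dictionary with only /P and no /S labels every page in its range with the bare prefix, so each named page needs its own range, and a run of unnamed pages needs a range of its own so it does not inherit the previous name. LabelRange and ranges_from_page_names are hypothetical names, and falling back to decimal numbering for unnamed pages is an assumption, not something decided in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper type for illustration only; not part of capypdf.
struct LabelRange {
    int32_t start_page;   // zero-based page index where the range begins
    std::string prefix;   // /P value; empty means no prefix
    bool numbered;        // true: decimal numbering (/S /D), false: prefix only
    int32_t first_number; // /St value, only meaningful when numbered is true
};

// Convert one name per page into label ranges.
// An empty string in page_names marks a page without a name.
std::vector<LabelRange> ranges_from_page_names(const std::vector<std::string> &page_names) {
    std::vector<LabelRange> ranges;
    bool in_unnamed_run = false;
    for(std::size_t i = 0; i < page_names.size(); ++i) {
        const auto page = static_cast<int32_t>(i);
        if(!page_names[i].empty()) {
            // Named page: a one-page range whose label is just the name.
            ranges.push_back({page, page_names[i], false, 0});
            in_unnamed_run = false;
        } else if(!in_unnamed_run) {
            // First unnamed page of a run: switch to plain numbers that
            // match the physical page position (assumed fallback).
            ranges.push_back({page, "", true, page + 1});
            in_unnamed_run = true;
        }
    }
    return ranges;
}
```

Each resulting range would correspond to one call of the label function proposed earlier in the thread. Because page index 0 always produces a range here, whether named or not, the requirement that the tree contain an entry for the first page is met for any non-empty document.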

doctormo commented 3 weeks ago

OK, I'll put this in my low priority pile. Thanks for confirming what needs to be done for this bit of work.