CollaboraOnline / online

Collabora Online is a collaborative online office suite based on LibreOffice technology. This is also the source for the Collabora Office apps for iOS and Android.
https://collaboraonline.com

Rework the PDF commenting feature to only use pdfium library calls #6278

Open vmiklos opened 1 year ago

vmiklos commented 1 year ago

The current PDF commenting feature works by the following steps:

  1. The pdfium-based PDF import filter creates a Draw document with one Draw page per PDF page; each page has a full-page "pdf image" on it.

  2. The Draw UI is used to add comments.

  3. The exporter is the normal VCL PDF export. It recognizes that the image is a PDF image, so instead of the bitmap we got from pdfium it writes back the original PDF data, wrapped in a form XObject, 2 times. For this to work, it uses the vcl::PDFObjectCopier class (all home-grown code).

While this approach mostly works, the third point is rather fragile. A better way would be to have a dedicated PDF export filter in Draw which doesn't invoke the VCL PDF export. Instead, it would take the original PDF data, load it into pdfium, remove all existing comments / annotations from the PDF and recreate them (using pdfium library calls) from the comments / annotations in the Draw document model.
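
To make the proposed flow concrete, here is a minimal sketch of the "drop all annotations, then recreate them from the Draw model" step. It is only a model: `PdfDocument`, `PdfPage`, `DrawPage`, `Comment`, `Rect` and `syncAnnotations` are hypothetical stand-ins, not LibreOffice or pdfium types. A real filter would presumably do the two marked steps with pdfium's annotation API (e.g. `FPDFPage_RemoveAnnot` and `FPDFPage_CreateAnnot`) and then save the document.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only (not real pdfium / LibreOffice types).
struct Rect { double left, bottom, right, top; };
struct Comment { Rect area; std::string author; std::string text; };
struct PdfPage { double width; double height; std::vector<Comment> annotations; };
struct PdfDocument { std::vector<PdfPage> pages; };
struct DrawPage { std::vector<Comment> comments; };

// Replace the annotations of every PDF page with the comments of the
// matching Draw page. Page content and page size are never touched.
void syncAnnotations(PdfDocument& pdf, const std::vector<DrawPage>& drawPages)
{
    // The import creates one Draw page per PDF page, so the counts must match.
    assert(pdf.pages.size() == drawPages.size());

    for (std::size_t i = 0; i < pdf.pages.size(); ++i)
    {
        // Step 1: remove all annotations currently stored in the PDF page.
        pdf.pages[i].annotations.clear();

        // Step 2: recreate them from the Draw document model.
        for (const Comment& comment : drawPages[i].comments)
            pdf.pages[i].annotations.push_back(comment);
    }
}
```

Because only the annotation list is rewritten, each page keeps its own width and height in this model, which is the property the current Draw-based round-trip cannot guarantee.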

Some benefits of this new approach would be:

  1. No vcl::PDFObjectCopier is invoked during export. That class contains a home-grown PDF tokenizer, which is less battle-tested than the one pdfium has internally.

  2. No new objects are created in the PDF file if there are no changes to comments. Currently we add 2 new PDF objects / page on each save, slowly growing the file size and complexity of the document over time.

  3. Draw does not support different page sizes, so we currently corrupt such documents when commenting. The "only use pdfium" approach would preserve those unusual page sizes.

  4. The new way would match how other PDF readers allow commenting, by using the same PDF tokenizer for rendering and annotation creation.
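
For a rough feel of the file growth mentioned in the second benefit, here is a back-of-the-envelope sketch. It assumes exactly 2 new objects per page on each save (the figure given above) and that nothing else changes between saves; `objectsAfterSaves` is a hypothetical helper, not existing code.

```cpp
#include <cassert>
#include <cstdint>

// Estimated number of PDF objects after `saves` round-trips through the
// current exporter, assuming 2 new objects per page per save.
std::uint64_t objectsAfterSaves(std::uint64_t initialObjects,
                                std::uint64_t pages,
                                std::uint64_t saves)
{
    // A valid PDF has at least one page.
    assert(pages > 0);

    const std::uint64_t addedPerPagePerSave = 2;
    return initialObjects + addedPerPagePerSave * pages * saves;
}
```

Under these assumptions, a 10-page file that starts with 100 objects would hold 100 + 2 * 10 * 50 = 1100 objects after 50 saves, even if no comment was ever changed. With the pdfium-only approach the count would stay at 100 in that case.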

I think doing this sooner or later would be nice, especially before we add more annotation types to the VCL-level PDF export. (Certainly a low priority right now.)

@quikee does this make sense to you?

quikee commented 1 year ago

Sounds fine to me, but I think this would really make sense if in addition we introduce a special PDF (viewing) mode for Draw, where we can make sure that the PDF image is always the background of the page with the correct size, and we can control what can and can't be done with the document.