tsgrp / HPI

OpenContent Management Suite (OCMS)
http://www.tsgrp.com/products

View Annotated PDF Action with PDF.js #533

Open benallenallen opened 9 years ago

benallenallen commented 9 years ago

The "Show annotated version in document viewer" admin action does not seem to display annotations as we would expect.


When this is on, the getAnnotatedPDF call produces a PDF with the annotations burned in, so the viewer should be able to display them. However, it seems that only "Sticky Note" annotations are displayed when this endpoint is hit.

A few leads/details:

1) When a PDF is annotated with Acrobat's own annotation tools and saved directly from Acrobat, it displays fine.

2) The sticky notes work, so is iText burning in the other annotations "incorrectly", in a way that does not properly follow the spec, so that viewer.js cannot display them? My guess is that the problem could be on the viewer.js side, since it only recently started displaying annotations.

3) When I hit the getAnnotatedPdf endpoint and view the PDF directly in Acrobat, everything works as expected.

jrubins commented 9 years ago

After looking into this, it appears to be a limitation of PDF.js's annotation support at the moment. If you look at the annotation data returned by viewer.js's getAnnotations call, you'll see that the proper annotation data is being returned. However, there's a line like the following:

if(!data || !data.isHtml) {
    continue;
}

Only sticky note annotations have isHtml set to true on the returned annotations (the other annotation types don't even have an isHtml property). This leads me to believe that PDF.js currently supports only Sticky Note annotations. https://github.com/mozilla/pdf.js/pull/5065 supports this reading, although it claims Highlight annotations have been fixed, which they don't seem to be.
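To illustrate the effect of that check, here is a minimal sketch that runs the same condition over a few made-up annotation records. The record shapes, the `subtype` values, and the `filterDisplayable` helper are assumptions for illustration only, not the actual data or code from viewer.js.

```javascript
// Hypothetical annotation records; field values are illustrative only.
const annotations = [
  { subtype: 'Text', isHtml: true }, // stands in for a sticky note
  { subtype: 'Highlight' },          // no isHtml property at all
  { subtype: 'Line' },               // no isHtml property at all
  null,                              // missing data is skipped as well
];

// Applies the same condition as the line quoted above:
// any record without a truthy isHtml is skipped.
function filterDisplayable(items) {
  const shown = [];
  for (const data of items) {
    if (!data || !data.isHtml) {
      continue;
    }
    shown.push(data);
  }
  return shown;
}

const displayed = filterDisplayable(annotations).map((a) => a.subtype);
console.log(displayed);
```

Under these assumptions, only the record standing in for the sticky note survives the filter, which matches what we see in the viewer.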

Adobe seems to be doing something special to "burn" the annotations into the actual content of the PDF, which is why those annotations are visible in PDF.js. You can confirm this by opening a burned-in PDF from OpenAnnotate in Adobe, saving that PDF, and then uploading it to the PDF.js viewer (the annotations will then be visible).

We decided to hold off on further work on this until PDF.js finishes its rewrite of the annotations layer.

gsteimer commented 9 years ago

So this is interesting:

  1. Load a PDF with various annotations in OA (sticky note, highlight, lines, boxes, etc.) and then save the burned-in PDF to your local machine.
  2. View it in Acrobat - all annotations will display.
  3. View it in PDF.js (https://mozilla.github.io/pdf.js/web/viewer.html) - only the sticky notes will display; the other annotations will not.
  4. Now, open the PDF in Acrobat and save it to a new file.
  5. View the newly created file in PDF.js - all annotations will display.

So - I think the problem may not necessarily be with PDF.js, but rather with the way we're burning in the annotations.
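One way to narrow this down might be to compare the annotation data that getAnnotations returns for the burned-in file with the data for the Acrobat re-saved file, assuming the difference shows up in that data at all. The sketch below is only a rough comparison helper: `countBySubtype`, `addedProperties`, the sample records, and the `exampleFlag` property are all hypothetical placeholders, not real PDF.js output.

```javascript
// Count annotations per subtype in one getAnnotations-style result.
function countBySubtype(items) {
  const counts = {};
  for (const data of items) {
    if (!data) continue;
    const key = data.subtype || 'unknown';
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// For each subtype, list property names that appear on the re-saved
// annotations but never on the burned-in annotations of that subtype.
function addedProperties(burnedIn, resaved) {
  const seen = {};
  for (const data of burnedIn) {
    if (!data) continue;
    const key = data.subtype || 'unknown';
    seen[key] = seen[key] || new Set();
    Object.keys(data).forEach((name) => seen[key].add(name));
  }
  const added = {};
  for (const data of resaved) {
    if (!data) continue;
    const key = data.subtype || 'unknown';
    const known = seen[key] || new Set();
    for (const name of Object.keys(data)) {
      if (!known.has(name)) {
        added[key] = added[key] || new Set();
        added[key].add(name);
      }
    }
  }
  const result = {};
  for (const key of Object.keys(added)) {
    result[key] = [...added[key]].sort();
  }
  return result;
}

// Placeholder records standing in for the two files' annotation data.
const burnedInSample = [{ subtype: 'Text', isHtml: true }, { subtype: 'Highlight' }];
const resavedSample = [
  { subtype: 'Text', isHtml: true },
  { subtype: 'Highlight', exampleFlag: true }, // exampleFlag is made up
];

console.log(countBySubtype(burnedInSample));
console.log(addedProperties(burnedInSample, resavedSample));
```

If the two files return identical annotation data, the difference is more likely in the page content itself, which would fit the theory that Acrobat writes the annotations into the PDF content when it saves.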

@benallenallen - feel free to reply with any thoughts / ideas...

gsteimer commented 8 years ago

Lowering the priority of this since we now have OA integration in the HPI stage.