mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Hyperlinks not working #6271

Closed apuder closed 9 years ago

apuder commented 9 years ago

I'm using pdf.js via the ViewerJS project and I've noticed that hyperlinks are not working. I see that some issues mention this, and I've also come across this workaround: http://jsfiddle.net/seikichi/RuDvz/2/ (I am surprised that this does not seem to have been addressed). I use PowerPoint to create slides and then create the PDFs via "Save as PDF". Before I invest time integrating the above workaround into ViewerJS I wanted to make sure I'm not missing anything. TIA.

timvandermeij commented 9 years ago

Do you have a PDF file for us that has the issue? Could you also test that file with https://mozilla.github.io/pdf.js/web/viewer.html (Open File button in the toolbar) as I don't know which version of PDF.js ViewerJS uses?

timvandermeij commented 9 years ago

I'm going to assume that you are talking about http://android-tutorials.info/doc/slides.pdf. I have looked into that PDF and while it seems that there is a hyperlink, the PDF actually contains no annotation dictionary required for rendering a clickable hyperlink. Adobe Reader/Acrobat infers that it is a hyperlink, PDF.js does not. I'm not sure if we will be able to solve this, as inferring if we are dealing with a hyperlink is not easy because of the way a PDF file is structured.

Somehow I feel like that PDF has not been exported properly. I have almost never seen such PDFs, as all other PDFs with hyperlinks have corresponding /Annotation dictionaries for them.

Marking as a feature for now; you might want to consider using another way to save the PDF or to look into the export options to make sure that annotations are properly created.

By the way, PDF.js does support hyperlinks if they have proper /Annotation dictionaries, such as http://www.antennahouse.com/XSLsample/pdf/sample-link_1.pdf. All PDFs I have come across have this, except for your PDF somehow.
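
To illustrate what "inferring" a hyperlink from text could look like, here is a minimal sketch. It is not how PDF.js or Adobe Reader works; `inferLinks` and `URL_PATTERN` are hypothetical names, and the `{ str }` item shape is assumed to match the items returned by `page.getTextContent()` in PDF.js. It only finds URLs that sit entirely inside one text item, which is exactly why inference is fragile: a PDF may split one visible URL across several items.

```javascript
// Hypothetical helper: scan text items for URL-like strings.
// Assumption: each item has a `str` property, like PDF.js text content items.
const URL_PATTERN = /\bhttps?:\/\/[^\s<>"')\]]+/g;

function inferLinks(textItems) {
  const links = [];
  textItems.forEach((item, index) => {
    for (const match of item.str.matchAll(URL_PATTERN)) {
      // Trim trailing punctuation that most likely belongs to the sentence.
      const url = match[0].replace(/[.,;:!?]+$/, "");
      links.push({ itemIndex: index, offset: match.index, url });
    }
  });
  return links;
}

// Made-up sample data for illustration only.
const sampleItems = [
  { str: "Slides: http://android-tutorials.info/doc/slides.pdf." },
  { str: "No link here" },
];
console.log(inferLinks(sampleItems));
```

A viewer would still have to map each match back to a rectangle on the page before it could be made clickable, which this sketch does not attempt.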

Snuffleupagus commented 9 years ago

If the assumption in https://github.com/mozilla/pdf.js/issues/6271#issuecomment-124979601 is correct, then this issue should be a duplicate of #3172.

timvandermeij commented 9 years ago

Correct. This is a duplicate of #3172 and the other issues referenced over there. The cause is the same, namely that text styled as a link is not actually a Link annotation. Closing as such and continuing tracking of this issue over there.

apuder commented 9 years ago

Thanks for your feedback, @timvandermeij you found the PDF I was talking about. Just one quick comment: here is another PDF from the same site: http://android-tutorials.info/slides/1549335147a985a9bb291351eb84e94f-slides.pdf

Check out slide 5, there are hyperlinks that cannot be inferred from the text. I am not familiar with the PDF format but I would imagine the link has to be stored in the document. Viewing this PDF in any other PDF viewer I am able to click on the link, but not in pdf.js, so I don't see how this is related to issue #3172. I also tried different ways to create the PDF, including some online converter. None create a PDF where pdf.js recognizes the link.

timvandermeij commented 9 years ago

Clicking those links works for me with the most recent version of PDF.js at https://mozilla.github.io/pdf.js/web/viewer.html (use the Open File button in the toolbar). ViewerJS is apparently using a very old version of PDF.js, so you might want to update that.

apuder commented 9 years ago

Just as a followup to put closure to this topic: I updated ViewerJS as you suggested but it still does not work. I'm not sure what ViewerJS is doing but I was not in the mood to investigate. What I ended up doing instead is to use your fine viewer that is bundled with pdf.js. Thanks for the nice work! Just one minor point you might want to look into. Visit http://android-tutorials.info/ and hover with the mouse over the URL on the bottom. It redirects to the correct URL, however, there seems to be something wrong with the link-highlighting.

timvandermeij commented 9 years ago

@apuder Actually that is exactly what is in the PDF file. You can verify that by opening the file with https://brendandahl.github.io/pdf.js.utils/browser/ and going to Root -> Pages -> Kids -> 0 -> Annots. You will see three link annotations, which are exactly the three parts of the URL you see when hovering the link. In other words, your PDF generator (PowerPoint) has actually split that one link into three parts. This is not a bug in PDF.js, but rather it is exactly what is in the file. You probably do not see this with Adobe Reader because Adobe Reader does not highlight the links, thereby obscuring that fact.
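
As a rough sketch of how the split could be hidden on the application side, the fragments can be grouped by target URL and their rectangles combined. This is not something PDF.js does; `mergeLinkFragments` is a hypothetical helper, and it assumes annotation objects shaped like those returned by `page.getAnnotations()` (`subtype`, `url`, and a `rect` of `[x1, y1, x2, y2]` with `x1 <= x2` and `y1 <= y2`), and that all fragments of one link share the same URL.

```javascript
// Hypothetical helper: merge Link annotations that point to the same URL.
// Assumption: `rect` is [x1, y1, x2, y2] with x1 <= x2 and y1 <= y2.
function mergeLinkFragments(annotations) {
  const byUrl = new Map();
  for (const annotation of annotations) {
    // Skip non-link annotations and links without an external URL.
    if (annotation.subtype !== "Link" || !annotation.url) continue;
    const merged = byUrl.get(annotation.url);
    if (!merged) {
      byUrl.set(annotation.url, {
        url: annotation.url,
        fragments: 1,
        rect: [...annotation.rect],
      });
    } else {
      merged.fragments += 1;
      // Grow the bounding box so it covers every fragment seen so far.
      merged.rect = [
        Math.min(merged.rect[0], annotation.rect[0]),
        Math.min(merged.rect[1], annotation.rect[1]),
        Math.max(merged.rect[2], annotation.rect[2]),
        Math.max(merged.rect[3], annotation.rect[3]),
      ];
    }
  }
  return [...byUrl.values()];
}

// Made-up coordinates: three fragments of one link on the same line.
const fragments = [
  { subtype: "Link", url: "http://android-tutorials.info/", rect: [10, 20, 50, 30] },
  { subtype: "Link", url: "http://android-tutorials.info/", rect: [50, 20, 90, 30] },
  { subtype: "Link", url: "http://android-tutorials.info/", rect: [90, 20, 120, 30] },
];
console.log(mergeLinkFragments(fragments));
```

Grouping purely by URL would also merge two separate links to the same address on one page, so a real implementation would likely need an adjacency check as well.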