mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Link text could not be selected #18266

Closed ydwatibm closed 2 months ago

ydwatibm commented 3 months ago

Attach (recommended) or Link to PDF file here: hyperlink_text_selection.pdf

Configuration:

Steps to reproduce the problem:

  1. Open the attached pdf file named hyperlink_text_selection.pdf
  2. Try to select the text "Microsoft"

What is the expected behavior? The text "Microsoft" should be selected

What went wrong? (add screenshot) [screenshot]

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):

alexcat3 commented 2 months ago

I'm not sure about this, but I have a suspicion about what's going on here. If I'm correct, the root issue goes beyond links. Try selecting any text in a PDF in pdf.js, starting your selection a decent margin away from the letters, so that you are outside the bounds of the <span> that holds the text in the text layer pdf.js generates. Your selection will end up starting at the beginning of the page, regardless of how far down you start! This is already an annoying behavior, but you can normally get around it by starting your selection close enough to the text you want that you land within the bounds of its <span>.

The issue with the links in this PDF when viewed in pdf.js is that the <section> in the annotation layer containing the link completely overlaps the <span> that contains the text, so it's not possible to start a selection from inside the text <span> without triggering the link.
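To make the overlap concrete, here is a minimal sketch of the geometric condition being described: the link's box fully covers the text's box, so every point inside the text is also inside the link. The helper name `rectCovers` and the coordinates are made up for illustration and are not part of pdf.js; in a browser the rectangles would presumably come from `getBoundingClientRect()` on the two elements.

```javascript
// Hypothetical helper, not part of pdf.js: true when `outer` fully covers `inner`.
// Rect objects use the DOMRect-style fields left, top, right, bottom.
function rectCovers(outer, inner) {
  return (
    outer.left <= inner.left &&
    outer.top <= inner.top &&
    outer.right >= inner.right &&
    outer.bottom >= inner.bottom
  );
}

// Made-up coordinates, for illustration only.
const textSpan = { left: 100, top: 200, right: 180, bottom: 216 };
const linkSection = { left: 98, top: 198, right: 182, bottom: 218 };

// The link box covers the text box, so no click can land on the
// text <span> without also landing on the link <section>.
console.log(rectCovers(linkSection, textSpan)); // true
```

If this check holds for a link and its text, there is no spot inside the text where a drag can start without hitting the link first, which would match the behavior reported above.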

alexcat3 commented 2 months ago

Oddly, while the Microsoft link can't be selected by itself in either the version of PDF.js in Firefox 127.0.2 or the latest code in master, the thing that anchors the selection and causes it to start at the top of the page seems to differ between the two. In Firefox 127.0.2 the selection is anchored on the text layer; on the latest code in master it is anchored on the annotation layer.

nicolo-ribaudo commented 2 months ago

I was thinking about how to solve this: adding user-select: none to links in the annotation layer significantly improves the situation. However I believe that the best fix would be to move the links to the .textLayer itself. By doing so, the links are in the right place and we don't need any workaround in the first place.

Any opinion?
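For reference, a rough sketch of the `user-select: none` workaround expressed in script form. The helper `disableSelection` is hypothetical and not a pdf.js API, and the selector in the usage comment is only an assumption about the annotation-layer markup (a <section> wrapping an <a>); in practice the same effect would more likely be achieved with a CSS rule.

```javascript
// Hypothetical helper, not a pdf.js API: marks each given element as
// non-selectable, the script equivalent of the CSS `user-select: none`.
function disableSelection(elements) {
  for (const el of elements) {
    el.style.userSelect = "none";
  }
}

// Possible browser usage (the selector is an assumption, not taken from pdf.js):
// disableSelection(document.querySelectorAll(".annotationLayer section a"));
```

This only keeps the link from becoming the selection anchor; it does not change the fact that the link still sits on top of the text.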

Snuffleupagus commented 2 months ago

> However I believe that the best fix would be to move the links to the .textLayer itself. By doing so, the links are in the right place and we don't need any workaround in the first place.

To me that sounds potentially quite "messy" and thus undesirable, since at a PDF specification level the annotations are completely separate from the text-content. It may also be further complicated by the fact that some PDF documents don't contain any text and consequently there's no textLayer, and it's possible to disable the various layers independently of each other (which is functionality we should keep).