box / box-content-preview

JavaScript library for rendering files stored on Box
https://developer.box.com/docs/box-content-preview

fix(copy-paste-issue): PDFJS copy paste fix #1489

Closed JChan106 closed 1 year ago

JChan106 commented 1 year ago

After upgrading to PDFJS v3.6.172, we noticed that copy/paste no longer worked for text in PDFs. The cause is that the enum we pass as textLayerMode changed in the new PDFJS version from ENABLE_ENHANCE to ENABLE_PERMISSION. In previous PDFJS versions, ENABLE_ENHANCE enabled the text layer and enhanced text selection (improving selection across multiple lines). However, this option was deprecated as of v3.0.279, as noted here:

It was replaced with a new option, ENABLE_PERMISSION, which, when passed as textLayerMode, checks the permission flags set on the PDF document. If the document does not have the COPY flag set and ENABLE_PERMISSION is passed in, the user cannot copy text from the document. This is discussed here:

Because ENABLE_PERMISSION has the same enum value as the deprecated ENABLE_ENHANCE, we were unintentionally passing ENABLE_PERMISSION after the upgrade. To fix this, we can use the ENABLE enum instead, which does not check the document's permissions. We already use this enum on mobile, and it is also PDFJS's default textLayerMode, so I don't believe there's much risk.
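To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is not PDFJS's actual code: `canCopyText` and `PERMISSION_COPY` are hypothetical names, the numeric enum values and the `0x10` flag value are assumptions, and the permissions list is assumed to be an array of numeric flags, or `null` when the document sets no restrictions.

```javascript
// Local stand-ins for the textLayerMode values discussed above.
// Names follow this PR; the numeric values are assumptions.
const TextLayerMode = Object.freeze({
    DISABLE: 0,
    ENABLE: 1,
    ENABLE_PERMISSION: 2, // the value the deprecated ENABLE_ENHANCE used to have
});

// Hypothetical constant for the document's COPY permission flag (assumed 0x10).
const PERMISSION_COPY = 0x10;

/**
 * Decide whether copying text should be allowed (illustrative only).
 * @param {number} mode - one of the TextLayerMode values
 * @param {number[]|null} permissions - flags granted by the document, or null if unrestricted
 * @returns {boolean}
 */
function canCopyText(mode, permissions) {
    if (mode === TextLayerMode.DISABLE) {
        return false; // no text layer, so nothing can be selected
    }
    if (mode === TextLayerMode.ENABLE) {
        return true; // the document's permissions are not consulted
    }
    // ENABLE_PERMISSION: honor the document's permission flags
    return permissions === null || permissions.includes(PERMISSION_COPY);
}

// Code written against the old ENABLE_ENHANCE still passes the value 2,
// which is now interpreted as ENABLE_PERMISSION.
const staleMode = 2;

// A document whose permission flags do not include COPY (0x04 is an arbitrary other flag).
const restrictedPermissions = [0x04];

const copyAllowedBeforeFix = canCopyText(staleMode, restrictedPermissions);
const copyAllowedAfterFix = canCopyText(TextLayerMode.ENABLE, restrictedPermissions);
```

Under these assumptions, the stale value blocks copying for a restricted document, while passing ENABLE allows it regardless of the document's flags, which is the trade-off this PR accepts.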

The drawback of this change is the loss of Enhanced Text Selection. I'm not sure whether the pdfjs team plans to bring it back, but this comment mentioned that "in general it's often too slow to be usable in practice".


jstoffan commented 1 year ago

@JChan106, this seems like a pretty meaningful downgrade in functionality. We may need to retrofit our own solution to avoid this being seen as a regression. What effect does it have on the creation of text highlight annotations?

JChan106 commented 1 year ago

> @JChan106, this seems like a pretty meaningful downgrade in functionality. We may need to retrofit our own solution to avoid this being seen as a regression. What effect does it have on the creation of text highlight annotations?

Good point. I've been testing the differences between enhanced and normal text selection, and I think we may have to retrofit. When selecting multiple elements with large spaces between them, enhanced text selection shows the selection through the space, whereas normal text selection only shows the selection on the actual elements.

Normal: image

Enhanced: image
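The visual difference can be sketched with plain rectangles. This is only an illustration of the two looks, not how PDFJS implemented enhanced selection and not a proposed retrofit: `spanThroughGaps` is a hypothetical helper, and the box coordinates are made up.

```javascript
/**
 * Hypothetical helper: merge the boxes of the selected elements on one line
 * into a single box that also covers the gaps between them.
 * Each box is { left, top, right, bottom } in the same coordinate space.
 * Returns null when there is nothing selected.
 */
function spanThroughGaps(boxes) {
    if (boxes.length === 0) {
        return null;
    }
    return boxes.reduce((acc, box) => ({
        left: Math.min(acc.left, box.left),
        top: Math.min(acc.top, box.top),
        right: Math.max(acc.right, box.right),
        bottom: Math.max(acc.bottom, box.bottom),
    }));
}

// Two selected text elements on the same line, separated by a wide gap.
const elementBoxes = [
    { left: 10, top: 100, right: 60, bottom: 112 },
    { left: 200, top: 100, right: 260, bottom: 112 },
];

// "Normal" look: one highlight per element, the gap stays uncovered.
const normalHighlights = elementBoxes;

// "Enhanced" look: a single highlight that runs through the gap.
const enhancedLikeHighlight = spanThroughGaps(elementBoxes);
```

In both cases the same two elements are selected; only the highlighted area differs.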

The actual content selected seems to be the same. However, if we want the same look and experience that users of box-content-preview are used to, we will have to retrofit.

JChan106 commented 1 year ago

Filed an issue with the pdfjs team regarding the removal of enhanced text selection: https://github.com/mozilla/pdf.js/issues/16684. Will update this PR once we get a response.