hypothesis / client

The Hypothesis web-based annotation client.

PDFs with selectable text in new version of PDF.js that fail in the LMS app and in the Client #3688

Closed mkdir-washington-edu closed 3 years ago

mkdir-washington-edu commented 3 years ago

From: https://hypothes-is.slack.com/archives/C2BLQDKHA/p1629189510271400?thread_ts=1629132387.246500&cid=C2BLQDKHA

These are two PDFs that have selectable text in the newest version of PDF.js, but are missing some text in the version of PDF.js we're using in the client and the LMS app.

PDFs are here: https://hypothes-is.slack.com/archives/C2BLQDKHA/p1629189510271400?thread_ts=1629132387.246500&cid=C2BLQDKHA

Also in Canvas assignments here: https://hypothesis.instructure.com/courses/92/assignments (course is Support 102)

PDF "Alexis the Materia lculture of writing" - see pages 86-87, 92-93

PDF "Ong new" - see pages 26-29

Steps to reproduce

  1. Open the PDF in the browser and activate the client (or open the LMS assignment)
  2. Select all
  3. See pages indicated
  4. Then open the PDFs in Firefox
  5. Select all
  6. See pages indicated
  7. Note that the PDFs also appear fully selectable in Chrome's PDF viewer and in Acrobat
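
The manual comparison above could also be approximated in code. The following is only a rough sketch, not anything from the Hypothesis client: `PageText`, `charCount` and `pagesMissingText` are hypothetical names, and the sample data is made up. It assumes the text strings for each page have already been extracted under each PDF.js version (for example from the `str` fields of the items returned by `page.getTextContent()`), and it flags the pages where the older version yields less text than the newer one.

```typescript
// Hypothetical helper for illustration only; not part of the Hypothesis client.
// Maps a 1-based page number to the text strings extracted for that page.
type PageText = Map<number, string[]>;

// Count non-whitespace characters so that layout differences are ignored.
function charCount(strings: string[] | undefined): number {
  return (strings ?? []).join("").replace(/\s+/g, "").length;
}

// Return the pages where the old version extracted less text than the new one.
function pagesMissingText(
  oldVersion: PageText,
  newVersion: PageText,
  pages: number[]
): number[] {
  return pages.filter(
    (page) => charCount(oldVersion.get(page)) < charCount(newVersion.get(page))
  );
}

// Made-up sample data standing in for real extraction results.
const oldText = new Map<number, string[]>([
  [86, []],
  [87, ["partial"]],
  [92, ["same text"]],
]);
const newText = new Map<number, string[]>([
  [86, ["full page text"]],
  [87, ["partial text"]],
  [92, ["same text"]],
]);

console.log(pagesMissingText(oldText, newText, [86, 87, 92]));
```

A character count is a crude proxy: it shows that text is missing from a page, but not whether the text that is present lines up with the page image well enough to annotate.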

Expected behaviour

Users will use their browsers and Adobe products to test the selectability of text before trying it in Hypothesis. Text that is selectable on those platforms should also be selectable in our environment.

Actual behaviour

Some text on the pages indicated above cannot be selected.

robertknight commented 3 years ago

PDF "Alexis the Materia lculture of writing" - see pages 86-87, 92 - 93

These pages can be annotated when using PDF.js v2.11.106. Page 92 is hard to annotate even in the newest PDF.js release because the page was not flat on the scanner when it was OCR-ed and so is distorted. Page 93 is also partially obscured for the same reason.

PDF "Ong new" - see pages 26-29

These pages look OK to me in PDF.js v2.11.106 as well.

robertknight commented 3 years ago

We've deployed the new version of PDF.js in the extension and Via, so I'm going to close this.