Open mkdir-washington-edu opened 3 years ago
I tried to force OCR this document using the docdrop OCR tool to see if the resulting document had similar issues. It's been 30 minutes and docdrop hasn't finished processing the file.
I will add the file and the result of tests in a comment once I am able.
Seems like no selectable text in Safari 13.0.5.
All text is selectable in Chrome 89.0. FYI we've seen PDFs in the past that had selectable text in Chrome but not in PDF.js (both Firefox and the LMS app), which is why we typically test PDFs in Firefox.
Chrome selectable text:
DocDrop Force OCR option doesn't work on this file.
Exporting the file to image files, recombining them into a PDF, and then OCRing it does work; the selectable text is present in Firefox and the LMS app. However, this isn't a useful solution for users.
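For anyone who wants to reproduce that workaround while investigating, here is a minimal sketch of the three steps in Python. The tools are an assumption and not necessarily what was used above: it assumes `pdftoppm` (poppler-utils), `img2pdf`, and `ocrmypdf` are installed and on the PATH. The helper names (`rasterize_cmd`, `recombine_cmd`, `ocr_cmd`, `reocr`) are hypothetical.

```python
import subprocess
from pathlib import Path


def rasterize_cmd(src, prefix, dpi=300):
    # Step 1: export every page of the PDF to a PNG named <prefix>-<n>.png.
    return ["pdftoppm", "-r", str(dpi), "-png", str(src), str(prefix)]


def recombine_cmd(images, out_pdf):
    # Step 2: combine the page images into an image-only PDF (no text layer).
    return ["img2pdf", *[str(p) for p in images], "-o", str(out_pdf)]


def ocr_cmd(in_pdf, out_pdf):
    # Step 3: OCR the image-only PDF to create a fresh text layer.
    return ["ocrmypdf", str(in_pdf), str(out_pdf)]


def reocr(src, workdir):
    """Run all three steps and return the path of the OCRed PDF.

    Assumes the three external tools are installed; any failing step
    raises subprocess.CalledProcessError.
    """
    work = Path(workdir)
    work.mkdir(parents=True, exist_ok=True)
    prefix = work / "page"
    subprocess.run(rasterize_cmd(src, prefix), check=True)
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    images = sorted(work.glob("page-*.png"))
    images_pdf = work / "images-only.pdf"
    subprocess.run(recombine_cmd(images, images_pdf), check=True)
    ocr_pdf = work / "ocr.pdf"
    subprocess.run(ocr_cmd(images_pdf, ocr_pdf), check=True)
    return ocr_pdf
```

Calling `reocr("problem.pdf", "work")` (with a hypothetical input filename) would leave the re-OCRed file at `work/ocr.pdf`. This only reproduces the manual workaround for testing; it does not address why the original text layer fails in the LMS app.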
Added to our bug & product backlog as a Spike - a good outcome of that Spike would be:
Another problem PDF, according to the same instructor, should you need more examples: Offen and Steinbach combined for hypothesis assignment.pdf
And here's an example with the added "Read here" text in red that does work properly in both Firefox and the LMS app, in case a comparison is needed: Combined pr sources - nuremberg & mass shooting.pdf
Desired outcome of the Spike is:
Note: it is possible to run a version of pdf.js locally which matches the version of pdf.js that we serve with the LMS app. It's beyond the capabilities of the support team, though.
Describe the bug
A user has provided a PDF that is fully selectable in PDF.js in Firefox, but has a broken text layer (very little selectable text, sporadically arranged throughout the page) when viewed in the LMS app.
Select all in Firefox on the first page:
Select all in the LMS app on the first page:
To Reproduce
Steps to reproduce the behavior:
Expected behavior
While there is occasionally a difference in the selectable text available in Firefox and Chrome, in the past Firefox has been a good way for instructors to test a PDF before trying it out in the LMS app.
Screenshots
Firefox console:
LMS app Console in Chrome:
Desktop (please complete the following information):
PDF file
Clare Goll & DH Lawrence combined for Hypothesis.pdf