Closed robertknight closed 5 years ago
Assigning this to myself since I'd already started looking into solutions.
To reproduce the issue with the client and Via locally more easily, apply the following diff to Via:
diff --git a/static/pdfjs/web/viewer.js b/static/pdfjs/web/viewer.js
index ca12733..4d501ab 100644
--- a/static/pdfjs/web/viewer.js
+++ b/static/pdfjs/web/viewer.js
@@ -4304,9 +4304,14 @@ var PDFViewer = (function pdfViewer() {
var firstPagePromise = pdfDocument.getPage(1);
this.firstPagePromise = firstPagePromise;
+ const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
+
// Fetch a single page so we can get a viewport that will be the default
// viewport for all pages
- return firstPagePromise.then(function(pdfPage) {
+ return firstPagePromise.then(async function(pdfPage) {
+ // TESTING
+ await delay(500);
+
var scale = this._currentScale || 1.0;
var viewport = pdfPage.getViewport(scale * CSS_UNITS);
for (var pageNum = 1; pageNum <= pagesCount; ++pageNum) {
Note that this will cause a couple of unrelated errors in the console and the PDF page will not render immediately until you scroll it a bit, but it demonstrates the orphaning issue.
Steps to reproduce
Go to https://via.hypothes.is/https://www.nsqol.org/wp-content/uploads/2019/02/National-Standards-for-Quality-Online-Teaching.pdf and select the "Public" group
Expected result
There is an annotation on the last page (https://hypothes.is/a/LJMxNsjFEem55S9pNtX0Aw). It should always anchor.
Actual result
Testing in Chrome 78.0.3887.7, I see the annotation fail to anchor once every few attempts.
Inspecting the store state in the sidebar showed that no anchoring timeout had occurred but instead the annotation had just failed to anchor:
Adding a breakpoint in the client code in
annotator.bundle.js
where anchoring failures are handled (thecatch
clause in response to a failedanchor
- see code), I got the following exception:Which points to
pdfPage
beingundefined
here: https://github.com/hypothesis/client/blob/4dfebbd6afbd415cd0e7858e0a86ee899f4bed32/src/annotator/anchoring/pdf.js#L76Looking at the implementation of the
getPageView
method in PDF.js which returns the object that is expected to have apdfPage
property, it looks like there are indeed certain states when that method could return undefined. Reading through the implementation, it looks like we should be usingpdfDocument.getPage
whenever we want a reference to a page, which returns aPromise<PDFPageProxy>
on which we can callgetTextContent
directly.