hypothesis / h

Annotate with anyone, anywhere.
https://hypothes.is/
BSD 2-Clause "Simplified" License

Annotation Issues with Large PDFs in Calibre-Web #8171

Open coderZoe opened 1 year ago

coderZoe commented 1 year ago

Hello Hypothesis team,

I've been utilizing hypothes.is for annotations within Calibre-Web, an e-book library project. While it functions effectively for EPUB format, I'm encountering an issue with larger PDFs.

Here's how the annotation appears for EPUB: [two screenshots]

However, for larger PDFs, the experience is different: [two screenshots]

When I select text in a PDF and click the hypothes.is Annotation button, the text box for entering the annotation does not appear, unlike with EPUB. The problem mainly affects larger PDFs (e.g., several hundred pages). For smaller PDFs of around ten pages, the annotation feature works as expected.

For reference, I've been using the Microsoft Edge browser with the hypothes.is plugin sourced from: https://chrome.google.com/webstore/detail/hypothesis-web-pdf-annota/bjfhmglciegochdpefhhlphglcehbmek.

Would appreciate any assistance or insights on this matter. I'm eager to understand whether this is a known issue or if there's a potential solution at hand.

Thank you for your time.

robertknight commented 4 months ago

Hello. Apologies for the late response. I have a few follow-up questions:

One possible cause in large PDFs may be the computation of the position number that is used to sort annotations in the sidebar. This currently involves extracting the text from every page up to the page which you annotated. If this is indeed the issue, then annotating pages near the start of the document should work, while annotating pages much later in the document may encounter a delay.
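
To illustrate why the cost would grow with the page number, here is a minimal sketch of that kind of position computation. It is not the Hypothesis client's actual code: `PageTextLoader`, `computePosition`, and the toy document are hypothetical names made up for this example, and the loader is synchronous for simplicity, whereas text extraction in a real PDF viewer is asynchronous and much slower per page.

```typescript
// Hypothetical stand-in for a PDF text layer: returns the text of one page.
// Assumption: a real viewer would do this asynchronously and at higher cost.
type PageTextLoader = (pageIndex: number) => string;

interface PositionResult {
  position: number;       // character offset of the selection in the whole document
  pagesExtracted: number; // number of pages whose text had to be read
}

// Compute a document-wide character offset for a selection that starts
// `offsetInPage` characters into page `pageIndex` (0-based).
function computePosition(
  loadPageText: PageTextLoader,
  pageIndex: number,
  offsetInPage: number,
): PositionResult {
  let position = 0;
  let pagesExtracted = 0;
  // Every page before the annotated one contributes its full text length,
  // so the work is proportional to the annotated page's index.
  for (let i = 0; i < pageIndex; i++) {
    position += loadPageText(i).length;
    pagesExtracted++;
  }
  return { position: position + offsetInPage, pagesExtracted };
}

// Toy document (illustrative only): 300 pages of 1,000 characters each.
const toyPages: string[] = Array.from({ length: 300 }, () => "x".repeat(1000));
const loadToyPage: PageTextLoader = (i) => toyPages[i] ?? "";

const early = computePosition(loadToyPage, 2, 10);
const late = computePosition(loadToyPage, 250, 10);
console.log(early.pagesExtracted, late.pagesExtracted);
```

In this sketch, a selection on the third page reads 2 pages of text, while one on page 251 reads 250. If the real bottleneck has this shape, that would match the expectation that early pages annotate quickly and later pages show a delay.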