Since the beginning stages of the PDF Component Library, there has been some confusion on whether we reference PDF pages by pageIndex (0-indexed) or pageNumber (1-indexed). PDF.js often seems to prefer pageNumber, but will sometimes accept either pageNumber or pageIndex, which adds to the confusion.
My goal is to reduce ambiguity by moving forward with pageIndex and removing references to pageNumber where possible. This was successful for the following:
HighlightOverlay component, demo, and unit tests
PageWrapper component
scroll utils
However, there were two places where it was not possible to entirely diverge from pageNumber, and in these situations, I attempted to minimize its usage:
DocumentWrapper library component- when the PDF loads successfully, DocumentWrapper grabs the first page of the PDF document by calling getPage(pageNumber). This reference to pageNumber is unavoidable because PDFjs doesn't provide a getPage(pageIndex) function. I pulled this out into a getFirstPage() function to keep it readable and self-contained.
Outline Reader/example component- This is built on top of the Outline component provided by react-pdf. The Outline click handler passes us a pageNumber but not a pageIndex. To solve this, I added a small function to convert pageNumber to pageIndex, then we pass pageIndex on to subsequent functions.
Summary of changes in this PR:
Update HighlightOverlay component to take pageIndex instead of pageNumber. Update HighlightOverlayDemo and relevant unit tests to reflect changes.
Update PageWrapper component to remove references to pageNumber
Update scroll util to use pageIndex instead of pageNumber
Update DocumentWrapper to abstract out use of pageNumber
Update Outline to convert pageNumber to pageIndex before calling scroll function
Removed TODOs from TextHighlightDemo and ScrollToDemo
Testing Plan
Existing functionality works as expected: zoom in/out, rotate CW/CCW, outline (open, scroll to page, close), highlight overlay demo, text highlight demo, scroll to demo
Description
This covers https://github.com/allenai/scholar/issues/29449 which is a subtask of the PDF Loose Ends ticket.
Since the beginning stages of the PDF Component Library, there has been some confusion on whether we reference PDF pages by
pageIndex
(0-indexed) orpageNumber
(1-indexed).PDF.js
often seems to preferpageNumber
, but will sometimes accept eitherpageNumber
orpageIndex
, which adds to the confusion.My goal is to reduce ambiguity by moving forward with
pageIndex
and removing references topageNumber
where possible. This was successful for the following:HighlightOverlay
component, demo, and unit testsPageWrapper
componentscroll
utilsHowever, there were two places where it was not possible to entirely diverge from
pageNumber
, and in these situations, I attempted to minimize its usage:DocumentWrapper
library component- when the PDF loads successfully,DocumentWrapper
grabs the first page of the PDF document by callinggetPage(pageNumber)
. This reference topageNumber
is unavoidable because PDFjs doesn't provide agetPage(pageIndex)
function. I pulled this out into agetFirstPage()
function to keep it readable and self-contained.Outline
Reader/example component- This is built on top of theOutline
component provided byreact-pdf
. TheOutline
click handler passes us apageNumber
but not apageIndex
. To solve this, I added a small function to convertpageNumber
topageIndex
, then we passpageIndex
on to subsequent functions.Summary of changes in this PR:
HighlightOverlay
component to takepageIndex
instead ofpageNumber
. UpdateHighlightOverlayDemo
and relevant unit tests to reflect changes.PageWrapper
component to remove references topageNumber
scroll
util to usepageIndex
instead ofpageNumber
DocumentWrapper
to abstract out use ofpageNumber
Outline
to convertpageNumber
topageIndex
before calling scroll functionTextHighlightDemo
andScrollToDemo
Testing Plan
Screencast for sanity check even though nothing has changed from a UI perspective The GIF I created isn't showing up in my preview so here's a link as an alternative.