Closed strk closed 2 months ago
Thank you for this! You mentioned it took all morning. I would have expected it to take much longer. Could you explain how you did it? Any tricks for getting it done quickly? I could have a pop at one of the other unindexed books.
I'm pretty fast with the keyboard and the "vi" editor :) Here is how I did it:
To help with the iterations, I wrote a short Perl script, "offset.pl", that took a single argument (a signed offset) and shifted both the starting and ending page of each entry by that offset.
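For anyone who wants to try the same trick, here is a rough sketch of the offset idea in Python. It is not the original offset.pl, which isn't shown in this thread. It assumes a hypothetical index format of one `title,start,end` entry per line; the function names `shift_line` and `shift_index` are made up for illustration, so adapt them to whatever format the repo's index files really use.

```python
def shift_line(line: str, offset: int) -> str:
    """Add a signed offset to the start and end page of one index entry.

    Assumes the hypothetical format "title,start,end". Splitting from the
    right keeps any commas inside the title intact.
    """
    title, start, end = line.rstrip("\n").rsplit(",", 2)
    return f"{title},{int(start) + offset},{int(end) + offset}"


def shift_index(lines, offset: int):
    """Shift every non-blank entry in an iterable of index lines."""
    return [shift_line(line, offset) for line in lines if line.strip()]


# Example: the PDF page numbers turned out to be 2 lower than first guessed.
entries = ["All Of Me,12,13", "Blue Bossa,20,20"]
shifted = shift_index(entries, -2)
```

A negative offset moves entries towards the front of the PDF and a positive one towards the back, so a batch of entries that is consistently off can be corrected in one pass.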
Common reasons for page mismatches:
Please include the md5sum of your source PDF when you contribute the updated index.
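If the `md5sum` command isn't available on your system, something like the following should produce the same hex digest using Python's standard `hashlib` module. It is only a sketch: the helper name `md5_of_file` and the chunk size are arbitrary choices, not something taken from the repo.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a large PDF does not have to fit in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `md5_of_file("TheNewRealBookVol1-Bb.pdf")` on your own copy and comparing the result with the checksum quoted in this thread tells you whether you are working from the same PDF.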
Awesome! Great idea about checksums too. We should incorporate checksums into the repo, allowing for the fact that the same real book could have multiple PDFs.
And maybe the number of pages too, as changes in resolution or reflow would change the checksum while keeping the number of pages unchanged.
On July 10, 2024 9:49:55 PM GMT+02:00, Adam Spiers @.***> wrote:
Awesome! Great idea about checksums too. We should incorporate checksums into the repo, allowing for the fact that the same real book could have multiple PDFs.
I'm travelling right now, any chance you could file an issue for those ideas?
It took me the whole morning to extract from:
99f2d4fb6e25fca77a753e86430098a1 TheNewRealBookVol1-Bb.pdf (md5sum)