tnahs / readstor

A CLI for Apple Books annotations
https://tnahs.github.io/readstor/
Apache License 2.0

Highlights in pdfs not being exported #1

Closed sent-hil closed 2 years ago

sent-hil commented 2 years ago

Hi,

Thank you for open sourcing this repo. I created a few different highlights, and so far it seems that highlights in PDF files imported into Apple Books aren't showing up in exports. I opened the document on desktop and can see the highlights there. The two EPUB books seem to work.

Thanks.

tnahs commented 2 years ago

Hey, thanks for the issue. Not sure what the best solution for this is.

Essentially, ReadStor just queries Apple Books's database and processes the data. However, PDF annotations are stored inside the actual PDF and not in the Apple Books database, so there's no quick way to access them. It would be possible to grab the path of the PDF from the database and then extract the annotations from the actual file. There also seems to be a crate, lopdf, that could potentially handle this kind of thing.
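A rough sketch of the first half of that idea (finding PDF paths in a library database) might look like the following. It is written in Python with the standard `sqlite3` module purely for illustration; ReadStor itself does not do this. The table and column names (`ZBKLIBRARYASSET`, `ZTITLE`, `ZPATH`) are assumptions about the schema and may not match the real Apple Books database, and `find_pdf_paths` and `demo_connection` are hypothetical helpers.

```python
from __future__ import annotations

import sqlite3
from pathlib import Path

# Assumed table and column names; the real Apple Books schema may differ.
ASSET_TABLE = "ZBKLIBRARYASSET"
TITLE_COLUMN = "ZTITLE"
PATH_COLUMN = "ZPATH"


def find_pdf_paths(conn: sqlite3.Connection) -> list[tuple[str, Path]]:
    """Return (title, path) pairs for every library row whose path ends in .pdf."""
    query = (
        f"SELECT {TITLE_COLUMN}, {PATH_COLUMN} FROM {ASSET_TABLE} "
        f"WHERE LOWER({PATH_COLUMN}) LIKE '%.pdf' "
        f"ORDER BY {TITLE_COLUMN}"
    )
    return [(title, Path(path)) for title, path in conn.execute(query)]


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory stand-in for the library database (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {ASSET_TABLE} ({TITLE_COLUMN} TEXT, {PATH_COLUMN} TEXT)"
    )
    conn.executemany(
        f"INSERT INTO {ASSET_TABLE} VALUES (?, ?)",
        [
            ("Some Novel", "/tmp/books/novel.epub"),
            ("Some Paper", "/tmp/books/paper.pdf"),
        ],
    )
    return conn


if __name__ == "__main__":
    for title, path in find_pdf_paths(demo_connection()):
        print(title, path)
```

The returned paths would then need to be handed to a PDF library to read the annotations out of each file, which is the part a crate like lopdf might cover.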

Not sure if I've been doing something wrong but this use case never occurred to me because I've never been able to modify (aka highlight) a PDF that was imported into and read from Apple Books... I'll have to do some tests to see if this has been fixed/changed.

sent-hil commented 2 years ago

Ah, that explains it. I poked around a bit and found https://github.com/0xabu/pdfannots, which is able to read the annotations in PDFs. I can combine yours with that and build a spaced repetition system, which was my original reason for looking at your repo, so thanks :)

tnahs commented 2 years ago

Tried highlighting a PDF I imported into Apple Books and got the error: "The original document can't be changed, so a duplicate with your changes has been created."

[Screenshot: screenshot-2022-01-11-084203]

Seems like Apple Books keeps a read-only version of the PDF. The highlighted changes must be saved to a different file.

Closing this for now; will revisit if this behavior changes in the future.