tnahs / readstor

A CLI for Apple Books annotations
https://tnahs.github.io/readstor/
Apache License 2.0

Update by ZANNOTATIONLOCATION instead? #2

Closed: sent-hil closed 2 years ago

sent-hil commented 2 years ago

Hi,

It looks like the exported highlights are randomly ordered. I think exporting them in the order they appear in the book is a better choice.

I believe if you change this line to order by ZANNOTATIONLOCATION it would solve it.

I'm not a Rust programmer, so I figured it would be easier for you than for me :) If this isn't in scope, let me know and I can fork it.

Thanks!

tnahs commented 2 years ago

The annotations are not actually sorted in the SQL query. Also, ZANNOTATIONLOCATION is an epubcfi, which comes in as a string and has to be parsed before it can be used for sorting.

See: here, here and here
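To illustrate why a raw epubcfi string cannot be sorted directly, here is a minimal Python sketch of the parse-then-sort idea. It is not readstor's Rust parser. The function name `cfi_sort_key` is hypothetical, and the sketch assumes that ID assertions contain no commas or nested brackets and that comparing the numbers of the start location in sequence is good enough. A full epubcfi parser has to handle more cases.

```python
import re


def cfi_sort_key(cfi: str) -> tuple:
    """Return the numbers of an epubcfi's start location as a sortable tuple.

    Simplified sketch: steps ("/6") and character offsets (":0") are both
    treated as plain integers, in the order they appear.
    """
    inner = cfi.strip()
    # Drop the "epubcfi(" wrapper and the closing ")".
    if inner.startswith("epubcfi(") and inner.endswith(")"):
        inner = inner[len("epubcfi("):-1]
    # Remove ID assertions such as "[x01-Logic-1]"; they carry no ordering.
    inner = re.sub(r"\[[^\]]*\]", "", inner)
    # A range is written "parent,start,end"; its start location is parent + start.
    parts = inner.split(",")
    start = parts[0] + (parts[1] if len(parts) == 3 else "")
    return tuple(int(n) for n in re.findall(r"\d+", start))


# Two locations in the same chapter: step 104 and step 18.
a = "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/104,/1:0,/3:93)"
b = "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/18,/2/1:0,/3:18)"

print(sorted([a, b]) == [a, b])                    # plain string comparison
print(sorted([a, b], key=cfi_sort_key) == [b, a])  # numeric comparison
```

A plain string comparison puts step 104 before step 18 because it compares character by character, while the parsed key compares 18 and 104 as integers.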

The annotations should be appearing in the correct order. I haven't written an integration test to check for this, but there are a bunch of unit tests in the parser. Could you double-check the order and let me know? If it's wrong, I'll have to investigate.

sent-hil commented 2 years ago

image

Yep, double-checked it. In the screenshot above, the right side is from the Books app and the left side is the output of readstor -t default.txt -o . -f.

tnahs commented 2 years ago

Oh weird. The order should be correct in terms of the epubcfi based on the unit tests but I wanna double check. If not, then there's a bug somewhere else...

Would you mind sending me the epubcfis of all those entries and what order they should be in? You can find them in the annotations.json file in the data directory. Each one is listed under metadata:location for its annotation.

sent-hil commented 2 years ago

Here you go:

[
  "epubcfi(/6/16[x01-Logic-1]!/4[x01-Logic-1]/2[_idContainer008]/20,/1:0,/5:1)",
  "epubcfi(/6/16[x01-Logic-1]!/4[x01-Logic-1]/2[_idContainer008]/52,/1:0,/5:124)",
  "epubcfi(/6/16[x01-Logic-1]!/4[x01-Logic-1]/2[_idContainer008]/56,/1:0,/7:161)",
  "epubcfi(/6/16[x01-Logic-1]!/4[x01-Logic-1]/2[_idContainer008]/64,/1:0,/3:58)",
  "epubcfi(/6/16[x01-Logic-1]!/4[x01-Logic-1]/2[_idContainer008]/68/2,/2/1:0,/3:424)",
  "epubcfi(/6/16[x01-Logic-1]!/4[x01-Logic-1]/2[_idContainer008]/68/4,/2/1:0,/3:335)",
  "epubcfi(/6/16[x01-Logic-1]!/4[x01-Logic-1]/2[_idContainer008]/68/6,/2/1:0,/5:141)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/104,/1:0,/3:93)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/144/2,/2/1:0,/3:144)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/144/4,/2/1:0,/3:143)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/144/6,/2/1:0,/9:127)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/156,/1:0,/3:1)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/166/2,/2/1:0,/5:262)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/166/4,/2/1:0,/3:159)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/166/6,/2/1:0,/3:114)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/172/1,:0,:233)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/18,/2/1:0,/3:18)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/184/1,:0,:146)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014],/194/1:0,/196/1:159)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/36,/1:117,/3:1)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/56/1,:0,:615)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/60/1,:0,:304)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/72,/1:0,/3:1)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014],/8/2/1:0,/12)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014],/84/1:0,/86/2/1:144)",
  "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/90/1,:0,:923)",
  "epubcfi(/6/22[x03a_Binary_Primer-1]!/4[x03a_Binary_Primer-1]/2[_idContainer024]/26,/1:118,/7:44)",
  "epubcfi(/6/22[x03a_Binary_Primer-1]!/4[x03a_Binary_Primer-1]/2[_idContainer024]/56/1,:0,:311)",
  "epubcfi(/6/24[x04-Mechanical_Computers]!/4[x04-Mechanical_Computers]/2[_idContainer030],/8/1:0,/10/1:208)",
  "epubcfi(/6/26[x05-Symbols_and_Circuits_copy]!/4[x05-Symbols_and_Circuits_copy]/2[_idContainer042]/28/1,:153,:393)",
  "epubcfi(/6/26[x05-Symbols_and_Circuits_copy]!/4[x05-Symbols_and_Circuits_copy]/2[_idContainer042],/30/1:0,/32)"
]

This is the order in Apple Books:

image image
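One thing that can be checked mechanically is whether a list like the one above is in numeric step order. The Python sketch below is only an illustration under assumptions: the helper names `start_steps` and `out_of_order_pairs` are hypothetical, the parsing is simplified (range-form epubcfis only, ID assertions without commas), and it says nothing about how readstor itself orders annotations. The four sample entries are copied verbatim from the list.

```python
import re


def start_steps(cfi):
    """Numbers of a range epubcfi's start location (simplified parsing)."""
    body = re.sub(r"\[[^\]]*\]", "", cfi)  # drop ID assertions like [x01-Logic-1]
    parent, start = body.split(",")[:2]    # range form: "parent,start,end"
    return [int(n) for n in re.findall(r"\d+", parent + start)]


def out_of_order_pairs(cfis):
    """Return adjacent (earlier, later) pairs whose numeric keys decrease."""
    return [(a, b) for a, b in zip(cfis, cfis[1:]) if start_steps(a) > start_steps(b)]


# Four consecutive entries from the posted list.
posted = [
    "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/166/6,/2/1:0,/3:114)",
    "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/172/1,:0,:233)",
    "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/18,/2/1:0,/3:18)",
    "epubcfi(/6/18[x02-Boolean_Algebra]!/4[x02-Boolean_Algebra]/2[_idContainer014]/184/1,:0,:146)",
]

for earlier, later in out_of_order_pairs(posted):
    print("numeric order violated between:")
    print("  ", earlier)
    print("  ", later)
```

In this sample the /172 entry directly precedes the /18 entry, so the list as posted follows something closer to string order than numeric step order. Whether that reflects the export order or just how the list was extracted from annotations.json cannot be told from the list alone.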

tnahs commented 2 years ago

Thanks so much for this!

I can use these epubcfis, but I think it'd be much more useful if I could get a sample of each epubcfi's annotation too. Just the first sentence or so, so I can compare the order I get to the original order. Right now I'm not sure which epubcfi corresponds to which annotation.

tnahs commented 2 years ago

Tested all the epubcfis and found no anomalies. I need the original ordering to really run a test. @sent-hil, if it's easier, could you just share what book this is? I can get a copy and write some tests with it.