bandantonio / obsidian-apple-books-highlights-plugin

Import highlights and notes from your Apple Books to Obsidian
https://obsidian.md/plugins?id=apple-books-import-highlights
MIT License
51 stars 2 forks source link

[Bug]: 1.5.0 imports only a subset of highlight files #44

Open banj opened 3 months ago

banj commented 3 months ago

Bug Report Checklist

Plugin version

1.5.0

Obsidian version

1.6.7

macOS version

macOS Sonoma

What happened?

Installed the 1.5.0 update, clicked the button to trigger an import; new popup appears, "will overwrite all books", clicked yes. Import only brought in 18 highlight files, where before there were 100+

The issue occurs when

Importing all books

How many books do you have in your Apple Books library?

100+

Code of Conduct

banj commented 3 months ago

I checked the developer console for error messages and didn't find anything

bandantonio commented 3 months ago

@banj I made some tests on Sonoma and Sequoia with my books (mysteriously, currently, I have 18 books with highlights), and there is a strange glitch with the setting responsible for syncing highlights across devices. As a result, highlights are not synced to the "main" database and the plugin cannot see pull them. I can see that the list of books when importing a single book is not complete as well. So, my assumption is that the problem is not related to the plugin itself. But I will investigate further a bit later.

I managed to find a temporary workaround: go to Books - Settings - General tab - Syncing section - re-enable Collections, bookmarks, and highlights checkbox. This will force Books to re-check your collection and all the “missing” entries appear in the database. As soon as these entries appear, the plugin immediately retrieves them.

banj commented 3 months ago

Thanks for looking into it - funny coincidence about the book count! All the highlights were syncing with 1.4.0 immediately prior to updating, but I did go ahead and uncheck. exit, then open and re-check the sync setting you specified. No change after 15 minutes but I'll give it a bit more time and try again.

The really odd thing is that there's no obvious pattern to the 18 books that do have highlights syncing, they span the alphabet (so not the first 18 matching), many haven't been recently opened, etc.

banj commented 3 months ago

To add on: I went into a book that had no highlights and added one, then re-triggered the import, and it didn't come through. Under the previous version I did this a few times and they came through immediately.

bandantonio commented 3 months ago

To add on: I went into a book that had no highlights and added one, then re-triggered the import, and it didn't come through. Under the previous version I did this a few times and they came through immediately.

That's the same behavior I'm observing on my side, and as long as the bulk import code wasn't affected during the new release, I'm slightly confident that the root cause is not the plugin itself (but it's pretty much strange behavior, indeed).

You can also try to download artifacts from the 1.4.0 release and place them into the plugin's folder of your obsidian vault. In my case, the behavior hasn't changed, and if you confirm that the version 1.4.0 produces the same output, the root cause is not the plugin itself. This issue would probably be a good idea to introduce some kind of a debug mode.

banj commented 3 months ago

Great idea, I downloaded the main.js, manifest.json, and style.css assets and overwrote the 1.5.0 files in the obsidian vault. When I restarted obsidian I could see that the version had rolled back to 1.4.0, and I triggered the import; there was no popup for overwriting that you added in 1.5.0, confirming it rolled back to the previous version of the methods. The import brought all the highlight files (195 total) over successfully.

bandantonio commented 3 months ago

A-ha, got it, thank you for confirming this. I'm further investigating this behavior.

bandantonio commented 3 months ago

Hi @banj. Some updates from my side - I still can't fully reproduce the issue you're struggling with.

I was testing the plugin on both Sonoma and Sequoia, trying versions 1.4.0 and 1.5.0. No issues. There is a short delay when syncing new highlights from iPhone or iPad (that was my concern), but they are still getting added to the database after a while and the plugin can fetch them. From Mac all the new highlights are synced instantly.

Still buried in debugging... 🫠

banj commented 2 months ago

Ugh, let me know if there's anything I can do to support debugging. I had a look at the commit and at first I thought that either the backup logic or the loop (for (const combinedHighlight of highlights)) saving the highlights might be the cause-- then I was having a look at the two new methods that SaveHighlightsToVault method was refactored into, but I haven't come up with anything helpful

spoof commented 2 months ago

I got the same issue - not all highlights has been imported (and some books are missing). I've dug a bit into developer console and I got this error:

TypeError: Cannot read properties of null (reading 'map')
    at highlightLocationToNumber (plugin:apple-books-import-highlights:6470:107)
    at eval (plugin:apple-books-import-highlights:6458:48)

I believe it comes from this function: highlightLocationToNumber. Something wrong with regexp.

Location that causes error: epubcfi(/6/16[chapter03]!/4/2/156/4/1:15)

When I change sorting setting to 'By creation date' I got all my highlights and books back again.

bandantonio commented 2 months ago

@spoof Thank you for the investigation, it shed some light to this case.

You received the error because the regex expects the location to point to a range of characters. However, it seems that the location you provided points to a single character, that's why the whole sequence fails and match returns null.

  1. Could you please confirm that you have a single character highlight somewhere in your books? If no, what is the title of the book this location comes from? Seems like I need to inspect it, because I can't reproduce it with my books collection.
  2. You said that changing the sorting back to 'By creation date' (which one?) fixed the issue. However, 'By creation date (from oldest to newest)' is the default setting that is populated after the plugin installation. What was the previous value set for this setting?

@banj May the p.1 be related to your case as well?

banj commented 2 months ago

I probably have a few typo highlights in my books collections, and I also have a habit of highlighting across paragraph breaks which makes it appear as separate highlights-- but I'm not sure of a good way to figure out where that might be. I notice that prior to v1.5 the full collection imports, so does that mean that the previous version could handle single character highlights?

I leave the sorting selection on the default setting

mbarritt commented 2 months ago

Been away from Obsidian for a couple weeks and just catching up with this now. For me import now stops at 16 books. Nothing useful to add to what's been said above except to provide another confirmation of the error. None of my new/most recently highlighted books are in the 16 which still import, which implies that it is failing on something that had been there previously (and did import previously).

bandantonio commented 2 months ago

I notice that prior to v1.5 the full collection imports, so does that mean that the previous version could handle single character highlights?

@banj Well, not quite. The v1.5 introduced an intermediate mechanism (which is mandatory) to handle the order of rendering highlights. Depending on the selected sorting criterion, this mechanism works with either highlight dates or locations and returns the desired order of highlights. Versions before v1.5 don't have such a mechanism and highlights were rendered "as is" according to how they are stored in the database. There was a related issue (#19) about that.

As @spoof showed the location that potentially caused an error, and that made me think about situations on how and when Apple Books could generate such a location. During my research some of the sources pointed out that single character highlights may generate such a location identifier. So that was my initial assumption backed up by the fact the this corner case is indeed not handled by the plugin. However, testing v1.5 on my collection of books didn't confirm that, even single character highlights have the expected location identifiers. So, I just wanted to check up with you to understand whether or not this is related to your case.

P.S: my sincere apologies for such a long time figuring out the root cause of this bug, my spare time is limited now but besides that I just want to properly identify the root cause and come up with a reliable solution.

@mbarritt Got it, thank you for the information. No errors in the console? Is sorting criterion in its default setting (creation date, from oldest to newest)?