threeplanetssoftware / apple_cloud_notes_parser

Parser for Apple Notes data stored on the Cloud as seen on Apple handsets
MIT License
410 stars 26 forks source link

Crash in AppleBackup.back_up_file #40

Closed huyz closed 2 years ago

huyz commented 2 years ago

I'm running ruby notes_cloud_ripper.rb --mac ~/Library/Group\ Containers/group.com.apple.notes/ for the first time and getting:

❯ /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/3.1.0/pathname.rb:353:in `initialize': no implicit conversion of nil into String (TypeError)
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/3.1.0/pathname.rb:353:in `new'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/3.1.0/pathname.rb:353:in `+'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleBackup.rb:122:in `back_up_file'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNotesEmbeddedPublicJpeg.rb:31:in `initialize'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:211:in `new'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:211:in `block (2 levels) in replace_embedded_objects'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:204:in `block (2 levels) in execute'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/resultset.rb:134:in `each'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:203:in `block in execute'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:156:in `prepare'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:198:in `execute'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:173:in `block in replace_embedded_objects'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:153:in `each'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:153:in `replace_embedded_objects'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:114:in `initialize'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:615:in `new'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:615:in `block in rip_note'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:204:in `block (2 levels) in execute'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/resultset.rb:134:in `each'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:203:in `block in execute'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:156:in `prepare'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:198:in `execute'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:605:in `rip_note'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:507:in `block in rip_notes'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:204:in `block (2 levels) in execute'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/resultset.rb:134:in `each'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:203:in `block in execute'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:156:in `prepare'
    from /Users/huyz/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/sqlite3-1.4.4/lib/sqlite3/database.rb:198:in `execute'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:506:in `rip_notes'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:193:in `rip_all_objects'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleBackup.rb:159:in `block in rip_notes'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleBackup.rb:157:in `each'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleBackup.rb:157:in `rip_notes'
    from notes_cloud_ripper.rb:128:in `<main>'

Looking at the logs, it seems the process was able to work through a bunch of notes and Embedded Object of type public.jpeg but it seems to crash on one of them:

D, [2022-08-01T11:41:21.000382 #54006] DEBUG -- : AppleNote: Note 903 replacing attachment 7513D337-3C2E-4C7F-B6D7-F379BF72171E
D, [2022-08-01T11:41:21.000402 #54006] DEBUG -- : Note 903: Created a new Embedded Object of type public.jpeg

Any idea what could be wrong?

threeplanetssoftware commented 2 years ago

This appears to be roughly the same as #35, thank you for giving another example. That ticket gives a possible workaround, but it doesn't explain why the issue occurs in the first place, so I don't exactly want to silently pass it by.

First off, are you running the latest version? I did a lot of testing before I pushed that, but I did refactor embedded object code, so this could be a new bug if embedded office code is throwing errors. Would you please check out commit 20965a55b8fcfda304cc6c7d2b3a3dd39229eab7 and see if that gets past this note. If so I've introduced a bug and that will certainly bound the debugging.

Secondly, could you look at this note and see if the image still exists (by that I mean, actually open up the notes application and eyeball it, maybe the image was deleted and there's some stale pointer). Along the same lines, is this note in the deleted items folder, or is it a real, live note?

Thirdly, tracing this through it seems like the issue is this line:

https://github.com/threeplanetssoftware/apple_cloud_notes_parser/blob/5d408fcb376fbb843a103ddd4c40f31fe70882dc/lib/AppleBackup.rb#L122

Specifically, the variable that is likely dying is filename_on_phone. Assuming this isn't password protected, that should be coming from:

https://github.com/threeplanetssoftware/apple_cloud_notes_parser/blob/5d408fcb376fbb843a103ddd4c40f31fe70882dc/lib/AppleNotesEmbeddedPublicJpeg.rb#L161

Two lines later there should be a debug being printed:

https://github.com/threeplanetssoftware/apple_cloud_notes_parser/blob/5d408fcb376fbb843a103ddd4c40f31fe70882dc/lib/AppleNotesEmbeddedPublicJpeg.rb#L163

Can you look at the debug_log.txt file and see:

  1. If any of those "Filename is..." lines are printed for AppleNotesEmbeddedPublicJpeg objects.
  2. If you can share all debug_log.txt file lines for this specific note (903)
  3. What specifically is the Type UTI for this object? The SQLite query to identify that would be:
SELECT ZTYPEUTI FROM ZICCLOUDSYNCINGOBJECT WHERE ZIDENTIFIER='7513D337-3C2E-4C7F-B6D7-F379BF72171E'
huyz commented 2 years ago

Thanks for your detailed reply!

First off, are you running the latest version

Yes, I just checked out the latest master

Would you please check out commit https://github.com/threeplanetssoftware/apple_cloud_notes_parser/commit/20965a55b8fcfda304cc6c7d2b3a3dd39229eab7 and see if that gets past this note.

I tried that and it made no difference.

Secondly, could you look at this note

The note no longer exists. More on this below...

Can you look at the debug_log.txt file and see:

  1. If any of those "Filename is..." lines are printed for AppleNotesEmbeddedPublicJpeg objects.
  2. If you can share all debug_log.txt file lines for this specific note (903)
  3. What specifically is the Type UTI for this object? The SQLite query to identify that would be:

Reply:

  1. Yes in lots of places, but for this particular note and attachment, the crash happens before that line I think as these are the last lines printed:

    D, [2022-08-01T11:41:21.000382 #54006] DEBUG -- : AppleNote: Note 903 replacing attachment 7513D337-3C2E-4C7F-B6D7-F379BF72171E
    D, [2022-08-01T11:41:21.000402 #54006] DEBUG -- : Note 903: Created a new Embedded Object of type public.jpeg
  2. D, [2022-08-02T08:38:27.507663 #12620] DEBUG -- : AppleNote: Note 903 replacing attachment B6B28160-A7B6-4668-A058-1F1F1E2553D2
    D, [2022-08-02T08:38:27.507917 #12620] DEBUG -- : Note 903: Created a new Embedded Object of type public.heic
    D, [2022-08-02T08:38:27.508010 #12620] DEBUG -- : Note 903: Created a new Embedded Object of type thumbnail
    D, [2022-08-02T08:38:27.508359 #12620] DEBUG -- : Copying /Users/huyz/Library/Group Containers/group.com.apple.notes/Accounts/[REDACTED]/Previews/B6B28160-A7B6-4668-A058-1F1F1E2553D2-1-192x143-0.png to ./output/2022_08_02-08_38_12/files/Accounts/[REDACTED]/Previews/B6B28160-A7B6-4668-A058-1F1F1E2553D2-1-192x143-0.png
    D, [2022-08-02T08:38:27.509257 #12620] DEBUG -- : Note 903: Created a new Embedded Object of type thumbnail
    D, [2022-08-02T08:38:27.509626 #12620] DEBUG -- : Copying /Users/huyz/Library/Group Containers/group.com.apple.notes/Accounts/[REDACTED]/Previews/B6B28160-A7B6-4668-A058-1F1F1E2553D2-1-384x288-0.png to ./output/2022_08_02-08_38_12/files/Accounts/[REDACTED]/Previews/B6B28160-A7B6-4668-A058-1F1F1E2553D2-1-384x288-0.png
    D, [2022-08-02T08:38:27.510778 #12620] DEBUG -- : Copying /Users/huyz/Library/Group Containers/group.com.apple.notes/Accounts/[REDACTED]/Media/A169406A-E4E2-424A-B979-0AAD8E744303/IMG_0316.heic to ./output/2022_08_02-08_38_12/files/Accounts/[REDACTED]/Media/A169406A-E4E2-424A-B979-0AAD8E744303/IMG_0316.heic
    D, [2022-08-02T08:38:47.893391 #12620] DEBUG -- : AppleNote: Note 903 replacing attachment 7513D337-3C2E-4C7F-B6D7-F379BF72171E
    D, [2022-08-02T08:45:25.492262 #12620] DEBUG -- : Note 903: Created a new Embedded Object of type public.jpeg
  3. ZTYPEUTI = public.jpeg

So the note is partially deleted:

I remember I created that note a few months ago on my iPhone. It is in a shared folder with someone else. And it was synced to my macbook.

Does this mean that Apple somehow corrupted the DB? Could it be that DB isn't properly garbage collected, or maybe there's some sort of soft-delete flag or something that apple_cloud_notes_parser needs to understand, or the note is corrupted in the DB and the Notes app simply skips over it without complaining or offering me a chance to properly nuke the note completely?

Any idea if there's a safe way to nuke that note in SQL without blowing up everything? :)

Thanks!

threeplanetssoftware commented 2 years ago

Fantastic debugging! Thank you!

I have played with deleting notes and, especially when iCloud syncing is concerned, I have found extremely mixed results. I'm sure there is a rhyme and reason but without digging into proprietary source, I don't know what it is (and I intentionally don't use anything proprietary). But I have noticed pictures sticking around on disk after a note was completely deleted before, so it wouldn't surprise me to find an example of the opposite, a note whose picture was already removed. Or, since you say you have a newer version of the note, perhaps a version that was migrated.

I wouldn't say this is "corrupting" the database, as much as Apple's products knowing which notes to ignore and which not to.

The upside to this is I now feel pretty confident that putting the option to skip the attachment if it doesn't exist won't be hiding an edge case I should catch. I will try to work this patch in shortly, once I figure out the best place to catch it.

threeplanetssoftware commented 2 years ago

I believe this is solved correctly in c22a67a. It will behave the same as if the filepath on disk doesn't exist. Please give this a try and see what you think.

huyz commented 2 years ago

Thank you!

It went a lot further. Here's the debug_log on Note 903:

D, [2022-08-05T20:42:48.842049 #33018] DEBUG -- : Rip Note: Ripping note from Note ID 903
D, [2022-08-05T20:42:48.842098 #33018] DEBUG -- : Rip Note: Looking up account for 3
D, [2022-08-05T20:42:48.842105 #33018] DEBUG -- : Rip Note: Looking up folder for 2
D, [2022-08-05T20:42:48.842148 #33018] DEBUG -- : Note 903: Created a new Embedded Object of type public.heic
D, [2022-08-05T20:42:48.842184 #33018] DEBUG -- : Note 903: Created a new Embedded Object of type public.jpeg
D, [2022-08-05T20:42:48.842216 #33018] DEBUG -- : Note 903: Created a new Embedded Object of type com.apple.notes.gallery

the processing went on about 6x longer and then we get a different kind of crash:

Created a new AppleBackup from Mac backup: /Users/huyz/Library/Group Containers/group.com.apple.notes
Guessed Notes Version: 15
Apple Decrypter: Attempting to decrypt objects without a password list set, check the -w option for more success
Updated AppleNoteStore object with 167 AppleNotes in 19 folders belonging to 1 accounts.
Adding the ZICNOTEDATA.ZPLAINTEXT and ZICNOTEDATA.ZDECOMPRESSEDDATA columns, this takes a few seconds
/Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:589:in `+': no implicit conversion of nil into String (TypeError)
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:589:in `block in generate_html_text'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:562:in `each'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:562:in `each_with_index'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:562:in `generate_html_text'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNote.rb:527:in `generate_html'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:742:in `block in generate_html'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:740:in `each'
    from /Users/huyz/git/apple_cloud_notes_parser/lib/AppleNoteStore.rb:740:in `generate_html'
    from notes_cloud_ripper.rb:178:in `block (2 levels) in <main>'
    from notes_cloud_ripper.rb:177:in `open'
    from notes_cloud_ripper.rb:177:in `block in <main>'
    from notes_cloud_ripper.rb:171:in `each'
    from notes_cloud_ripper.rb:171:in `<main>'

Here, for the line

          html += @embedded_objects[embedded_object_index].generate_html# + "\n"

embedded_object_index is 0 and @embedded_objects looks like: screenshot 2022-08-06T035407Z

Strange to have some nils in there. Do you want me to look at something else?

Could also file a separate issue.

threeplanetssoftware commented 2 years ago

Yes, please open a new ticket. That doesn't seem to be the same root issue as this one. Thank you for the detailed feedback!