sballin / alfred-search-notes-app

Use Alfred to quickly open notes in iCloud/Apple Notes.
https://www.alfredforum.com/topic/11716-search-appleicloud-notes/
MIT License
510 stars, 24 forks

Help request: understanding the note data format #26

Closed sballin closed 4 years ago

sballin commented 4 years ago

The note body text data in the database contains some non-text bytes that I don't understand. I'm interested in cleanly extracting the plaintext and raw link URLs, but I've been having trouble coming up with rules general enough to do it right.

If you recognize this format, please let me know. Here's what I know so far:

Note data is stored in the database in gzip DEFLATE format. After decompression, it looks like this:

[short amount of non-text bytes]

[note title plaintext]

[note body plaintext (what the user sees in the Notes app—links appear as the user-set text rather than the raw URL if applicable)]

[bytes 26 16]

[non-text bytes of length roughly proportional to length of note]

[raw URLs of all links in body text, each preceded by a ~9-byte sequence that starts with 42 and followed by some more non-text bytes]

bloatfan commented 4 years ago

@sballin

You can look at this series of articles:

https://ciofecaforensics.com/2020/01/10/apple-notes-revisited/

https://ciofecaforensics.com/2020/01/13/apple-notes-revisited-easy-embedded-objects/

https://ciofecaforensics.com/2020/01/14/apple-notes-revisited-embedded-tables/

https://ciofecaforensics.com/2020/01/20/apple-notes-revisited-galleries/

sballin commented 4 years ago

Just saw those yesterday, very useful! I'll push a commit adding table support soon. I also want to figure out how to process the protobuf in Go, just to get link text in a clean way. If you have any idea, let me know!
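As a small aside on the protobuf wire format: every field starts with a varint tag equal to `(field number << 3) | wire type`. The sketch below decodes such a tag using only the standard library; `decodeTag` is a hypothetical name. If the byte values mentioned earlier in this thread are decimal, 26 decodes to field 3 (length-delimited), 16 to field 2 (varint), and 42 to field 5 (length-delimited), which would be consistent with the data being protobuf. How those field numbers map onto the note schema is not something this sketch checks.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// decodeTag reads one protobuf field tag (a varint) from the start of b and
// returns the field number, the wire type, and the number of bytes consumed.
// (Hypothetical helper name, for illustration only.)
func decodeTag(b []byte) (field uint64, wireType uint8, size int, err error) {
	v, n := binary.Uvarint(b)
	if n <= 0 {
		return 0, 0, 0, errors.New("truncated or overlong varint")
	}
	// The low three bits are the wire type; the remaining bits are the field number.
	return v >> 3, uint8(v & 7), n, nil
}
```

Wire type 0 is a varint and wire type 2 is a length-delimited value (strings, bytes, nested messages).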

bloatfan commented 4 years ago

Have you seen the apple_cloud_notes_parser repo?

You can try parsing noteBytes using its notestore.proto.

Step 1:

protoc --go_out=plugins=grpc:. notestore.proto

# put the generated notestore.pb.go in the notestore folder

Step 2:

Add this code at main.go#L273:

note := notestore.NoteStoreProto{}
err = proto.Unmarshal(noteBytes, &note)
if err != nil {
    panic(err)
}

// Generated getters are nil-safe and dereference pointer-typed fields.
body := note.GetDocument().GetNote()
println(body.GetNoteText())
println(body.String())

for _, item := range body.GetAttributeRun() {
    // Handle each item here to process every AttributeRun.
    println(item.String())
}

Give it a try, @sballin.

sballin commented 4 years ago

Thanks a lot! I've added this in the latest commit, and here is a compiled version. I'll test it out for a few days before releasing.

Search Notes.alfredworkflow.zip

bloatfan commented 4 years ago

OK, I will try it. 👍