alexkolson closed 1 year ago
Do you know if Notes stores hyperlinks in the same way that it does other text formatting? The biggest issue I have dealing with Notes content is that most export methods (including apple-notes-liberator) don't preserve links.
For example, if you add some text to a note, e.g. "Wikipedia", and convert it into a link by selecting the text and pressing cmd-k then setting the destination to "https://www.wikipedia.org/", this works as a normal, clickable hyperlink within Notes. But once this note has been exported, you're left with just the word "Wikipedia" with no link/URL.
The only way I've found to preserve links is to select all the text in the note, copy it to the clipboard, and then paste it into another app that supports rich text. Not ideal for multiple notes, although the process could be scripted.
To sum up, I'm wondering whether "preserve links" is a separate feature request, or if it'd come "free" with text formatting. I spent some time browsing the database and it wasn't obvious to me where Apple is storing this information.
Hi there!
I've added rudimentary support for extracting links in v1.1.0. Each extracted note now has a links property, which contains a list of links, in order, extracted from the note. See Output Format for more information. This is probably only minimally useful, if at all, because the links are taken out of the context of where they appear in the note. It's a start though, and it should be enough to build upon in order to support links when implementing something like #19.
apple_cloud_notes_parser extracts and preserves links in various output formats already, so you might try using that and see if it suits you. It is a really awesome project without which I wouldn't have been able to get even the slightest handle on Notes structure.
Yeah, Notes does store hyperlinks in a similar way to other text formatting. It keeps a list of what it calls attribute runs in the note protobuf. I like to think of these as formatting information objects. You can go through the attribute runs, check what kind each one is, and map it back to its location in the actual note via its length property. Each attribute run has a length property, and the lengths of all attribute runs in the list sum to the length of the note itself. Thus you can figure out where an attribute run (or formatting information object, as I like to think of it) belongs in the note by summing the lengths of all previous attribute runs in the list.
// Pseudocode
// myNoteProto is a NoteStoreProto
note = myNoteProto.document.note
noteText = note.noteText
formattingInformationObjects = note.attributeRunList
formattingObjectNoteTextIndex = 0
for each formattingInfo in formattingInformationObjects
    // the current formattingInfo starts in the note text at formattingObjectNoteTextIndex and continues for formattingInfo.length
    // we could, for example, replace the note text between formattingObjectNoteTextIndex and
    // formattingObjectNoteTextIndex + formattingInfo.length with whatever representation of formattingInfo suits us best
    // then we advance formattingObjectNoteTextIndex in preparation for the next formattingInfo
    formattingObjectNoteTextIndex += formattingInfo.length
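For anyone who wants to try the idea outside the project, here is a minimal Python sketch of the same walk. It is not the code used in apple-notes-liberator. AttributeRun below is a hypothetical stand-in for the generated protobuf class, and I'm assuming a run carries a link value only when it is a hyperlink. It also assumes run lengths are counted in the same units as Python string indices, which may not hold for real Notes data (for example, notes containing emoji), so treat that as something to verify.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class AttributeRun:
    """Hypothetical stand-in for one attribute run (not the real protobuf class)."""
    length: int
    link: Optional[str] = None  # assumption: set only when the run is a hyperlink


def locate_runs(runs: List[AttributeRun]) -> Iterator[Tuple[int, int, AttributeRun]]:
    """Yield (start, end, run); start is the sum of the lengths of all earlier runs."""
    index = 0
    for run in runs:
        yield index, index + run.length, run
        index += run.length


def extract_links(note_text: str, runs: List[AttributeRun]) -> List[Tuple[str, str]]:
    """Return (link text, URL) pairs in the order they appear in the note."""
    return [
        (note_text[start:end], run.link)
        for start, end, run in locate_runs(runs)
        if run.link
    ]


def to_markdown(note_text: str, runs: List[AttributeRun]) -> str:
    """Rebuild the note text, rendering linked runs as Markdown links in place."""
    parts = []
    for start, end, run in locate_runs(runs):
        segment = note_text[start:end]
        parts.append(f"[{segment}]({run.link})" if run.link else segment)
    return "".join(parts)


# Example data mirroring the "Wikipedia" note from the question.
note_text = "See Wikipedia for details."
runs = [
    AttributeRun(4),                                # "See "
    AttributeRun(9, "https://www.wikipedia.org/"),  # "Wikipedia"
    AttributeRun(13),                               # " for details."
]

if __name__ == "__main__":
    print(extract_links(note_text, runs))
    print(to_markdown(note_text, runs))
```

The first function corresponds to pulling links out as a flat list, and the second shows how the same offsets let you keep each link in context.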
Right now I'm not replacing any actual note text, but extracting the link text follows the same procedure. You can see it here:
I hope this helps a little bit!
Useful for #19.