ReagentX / imessage-exporter

Export iMessage data + run iMessage Diagnostics
GNU General Public License v3.0

Support Handwritten messages #31

Closed ReagentX closed 1 month ago

ReagentX commented 1 year ago

Details are here: https://support.apple.com/en-us/HT206894

raleighlittles commented 1 year ago

@ReagentX Is this currently being worked on?

ReagentX commented 1 year ago

Handwritten (com.apple.Handwriting.HandwritingProvider) messages store their data in a payload_data BLOB, and the data inside is pretty difficult to understand.

[image]

raleighlittles commented 1 year ago

Interesting. I don't know anything about iOS development, or that specific library (I tried searching for it and nothing came up).

The handwritten messages on the UI side are handled like images, so maybe there's a Base64 representation there? I see an "=" at byte 0x95, so maybe that's where the sequence ends.

Maybe send several different handwritten messages, and then diff the payload data blob across each of them to try to look for commonalities? @ReagentX
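The diffing idea above can be sketched as a small helper that reports which byte offsets hold the same value in every payload. This is only an illustration: `common_byte_offsets` is a hypothetical name, and the sample blobs are made up rather than taken from real `payload_data` values.

```python
def common_byte_offsets(blobs):
    """Return offsets (within the shortest blob) where every blob has the same byte."""
    if not blobs:
        return []
    shortest = min(len(b) for b in blobs)
    return [i for i in range(shortest) if len({b[i] for b in blobs}) == 1]


# Made-up sample payloads, for illustration only (not real message data).
samples = [
    bytes.fromhex("0a10f284aa"),
    bytes.fromhex("0a11f284bb01"),
    bytes.fromhex("0a12f284cc"),
]
print(common_byte_offsets(samples))  # [0, 2, 3]
```

Offsets that stay constant across many messages are likely framing or tags, while offsets that vary are more likely lengths, IDs, or stroke data.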

jnopnop commented 1 year ago

@ReagentX, there’s an XZ archive magic number at offset 78: FD 37 7A 58 5A 00. I tried stripping off everything before this sequence, and the rest is indeed a valid archive. However, the decompressed binary doesn’t look like anything meaningful. Unfortunately, I didn’t manage to solve it; maybe you’ll have better luck.
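A minimal sketch of the step described above, assuming the blob has already been read from the database as bytes. It searches for the XZ magic number rather than hard-coding offset 78, since the offset may differ between messages. `extract_xz` is a hypothetical helper name; `lzma` is Python's standard-library module.

```python
import lzma

# XZ stream magic number: FD 37 7A 58 5A 00
XZ_MAGIC = bytes.fromhex("FD377A585A00")


def extract_xz(blob):
    """Find the XZ stream inside blob and return (offset, decompressed bytes)."""
    offset = blob.find(XZ_MAGIC)
    if offset < 0:
        raise ValueError("no XZ magic number found")
    # LZMADecompressor stops at the end of the stream, so any bytes that
    # follow the archive end up in unused_data instead of raising an error.
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    return offset, decompressor.decompress(blob[offset:])


# Synthetic blob for illustration: 78 filler bytes, an XZ stream, trailing bytes.
blob = bytes(78) + lzma.compress(b"stroke data", format=lzma.FORMAT_XZ) + b"\x01\x02"
offset, raw = extract_xz(blob)
print(offset, raw)
```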

trymoose commented 1 month ago

I had luck with protoscope. Not sure how to parse HandwritingData from here.

syntax = 'proto3';

message Handwriting {
  // Milliseconds since 2001-01-01 UTC.
  // Duplicated across reused handwritings.
  // Built-in handwritings had negative values.
  sfixed64 CreatedAt = 2;
  // Duplicated across reused handwritings.
  string UUID = 3;
  // Always present
  HandwritingData data = 4;
}

enum DataType {
  // Never seen, required for protobuf
  Unknown = 0;
  Raw = 1;
  // XZ format
  Compressed = 4;
}

message HandwritingData {
  // Purpose unclear.
  // Content varies, always 4 bytes long.
  // Hex follows the pattern [F284??8?].
  // Only tested with a single sender/receiver pair, but each party had a unique ID.
  // Built-in handwritings had their own ID.
  bytes CreatorID = 2;
  // Content varies, always 8 bytes long
  // Hex repeats the pattern [??8???80??8???80]
  bytes unknown_data_2 = 3;
  // Number of times the pen entered and left the screen
  // Also the number of arrays in HandwritingData
  int64 NumStrokes = 4;
  DataType Type = 5;
  // If DataType is Compressed, size of decompressed data
  optional int64 DecompressedHandwritingDataSize = 6;
  // Always 4
  int64 unknown_1 = 7;
  // Either compressed payload or raw data.
  // In raw form it is in this format:
  // [NumStrokes]struct{
  //    Count uint16 // little endian
  //    Elems [Count][8]byte // Always [8]byte{?? ?? ?? ?? ?? 0x80 0xFF 0x7f}
  // }
  bytes HandwritingData = 8;
}
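
The raw layout described in the schema comments can be walked with a short parser. The sketch below assumes the protobuf message has already been decoded and, for compressed payloads, decompressed. The function names are hypothetical, and each 8-byte element is returned as opaque bytes because the schema above does not say what the individual bytes mean.

```python
import struct
from datetime import datetime, timedelta, timezone

# CreatedAt is described above as milliseconds since 2001-01-01 UTC.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def created_at_to_datetime(ms):
    """Convert a CreatedAt value (may be negative) to an aware datetime."""
    return APPLE_EPOCH + timedelta(milliseconds=ms)


def parse_strokes(raw, num_strokes):
    """Split raw handwriting data into strokes, each a list of 8-byte elements."""
    strokes = []
    pos = 0
    for _ in range(num_strokes):
        # Each stroke starts with a little-endian uint16 element count.
        (count,) = struct.unpack_from("<H", raw, pos)
        pos += 2
        end = pos + count * 8
        if end > len(raw):
            raise ValueError("stroke data is truncated")
        strokes.append([raw[i:i + 8] for i in range(pos, end, 8)])
        pos = end
    return strokes


# Synthetic data for illustration: two strokes with 2 and 1 elements.
elem = bytes([1, 2, 3, 4, 5, 0x80, 0xFF, 0x7F])
raw = struct.pack("<H", 2) + elem * 2 + struct.pack("<H", 1) + elem
print([len(s) for s in parse_strokes(raw, 2)])  # [2, 1]
print(created_at_to_datetime(0).isoformat())    # 2001-01-01T00:00:00+00:00
```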

trymoose commented 1 month ago

I created a utility that converts a handwritten payload to SVG. It's written in Go, but it should provide enough of a reference for converting the payload into a usable image.

ReagentX commented 1 month ago

@trymoose, if you have time, would you mind explaining a bit about how you managed to reverse engineer this format? I would be very interested to learn about your process.

trymoose commented 1 month ago

Sure! I uploaded an explanation on how I decoded the data here.

ReagentX commented 1 month ago

Really, really cool work. I appreciate the insight!

ReagentX commented 1 month ago

This is merged and will be in the Prickly Pear release. Cheers @trymoose for the extremely high quality contributions!