bepaald / signalbackup-tools

Tool to work with Signal Backup files.
GNU General Public License v3.0

Reaction Encoding Parser #15

Closed inferrinizzard closed 3 years ago

inferrinizzard commented 4 years ago

Does anyone know what encoding is used for the reaction emojis? It seems to be a 32-character string like (censored)

bepaald commented 4 years ago

The reaction list is a Google Protocol Buffers (protobuf) message (https://developers.google.com/protocol-buffers). The definition is here: https://github.com/signalapp/Signal-Android/blob/master/app/src/main/proto/Database.proto

Off the top of my head (I'm not at home right now and can't check), it's stored as a raw binary blob in the SQLite database. The string you show looks base64-encoded, though, so either I'm wrong about that or that's just how your viewer displays binary data.

The string can be any length, depending on the number of reactions and which emojis are used. If you are seeing 32 bytes consistently, that's a coincidence. My app handles reaction lists (only in a limited way, for specific cases such as merging backups) here: https://github.com/bepaald/signalbackup-tools/tree/master/reactionlist
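On the base64 question: a viewer that renders binary columns as base64 turns n raw bytes into 4*ceil(n/3) characters, so a 32-character string would correspond to 22 to 24 raw bytes. That would fit a list holding a single reaction, but it is not a fixed size. A minimal Python sketch for getting back to the raw bytes, assuming the copied value is either already bytes or standard base64 text (`to_raw_bytes` is a made-up helper name, not part of signalbackup-tools):

```python
import base64
import binascii


def to_raw_bytes(value):
    """Return raw bytes from a value that is either bytes or base64 text.

    Assumption: the viewer used standard base64 (not the URL-safe alphabet).
    """
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    try:
        # validate=True rejects characters outside the base64 alphabet
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("value is neither raw bytes nor valid base64") from exc


# Round trip with an illustrative payload (the UTF-8 bytes of one emoji).
payload = "👍".encode("utf-8")
as_text = base64.b64encode(payload).decode("ascii")
print(to_raw_bytes(as_text) == payload)
```

If the decoded bytes start with 0a, that is at least consistent with a protobuf message whose first field is number 1 and length-delimited.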

If there is anything you specifically want to be able to edit, let me know and I could try to implement it when I have time.

Btw: I edited your post to remove the example string because I thought the reaction list protobuf contained the sender's phone number. I just remembered that it currently uses a sender's id, which is basically just a random number. So the edit to your post was unnecessary, but I did it thoroughly and can't undo it.

inferrinizzard commented 4 years ago

I tried parsing the byte string with Python protobuf, but it didn't work for me. Does your implementation have a way to parse it? I'm only looking to get the list of emojis and discard the rest of the data.

bepaald commented 4 years ago

If you only need this for a few specific reaction lists, I would just get the data from the SQL database, for example with my program (adjust the selection criteria to get your specific message):

$ ./signalbackup-tools --fast backupfile.backup 949456645454874240551258765425 --runprettysqlquery "SELECT reactions FROM sms WHERE thread_id IS 17 AND date IS 1581683436188 AND reactions IS NOT NULL"
 [...]
done!
 * Executing query: SELECT reactions FROM sms WHERE thread_id IS 17 AND date IS 1581683436188 AND reactions IS NOT NULL
----------------------------------------------------------------------------------
| reactions                                                                      |
----------------------------------------------------------------------------------
| (hex:) 0a 16 0a 04 f0 9f 91 8d 10 47 18 c8 ac cf 9d 84 2e 20 c8 ac cf 9d 84 2e |
----------------------------------------------------------------------------------

And then paste that here: https://protogen.marcgravell.com/decode (select 'hexa' as input in this case), and you get:

Results

Field #1: 0A String Length = 22, Hex = 16, UTF8 = " 👍GȬϝ�. Ȭϝ�."

As sub-object :
Field #1: 0A String Length = 4, Hex = 16, UTF8 = "👍"
Field #2: 91 Varint Value = 71, Hex = 8D
Field #3: 10 Varint Value = 1581683824200, Hex = 47-18-C8-AC-CF-9D
Field #4: 84 Varint Value = 1581683824200, Hex = 2E-20-C8-AC-CF-9D
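If you would rather not depend on a website or on generated protobuf classes, the same bytes can be walked by hand. The following is a minimal Python sketch of the protobuf wire format, not the code my program uses. It only handles wire types 0 (varint) and 2 (length-delimited), which are the only ones in this blob. It assumes, based on the decoded output above, that outer field 1 is one embedded reaction and that inner field 1 is the emoji; field 2 looks like the sender id and fields 3 and 4 look like timestamps. The function names are made up for illustration.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry data
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7


def read_fields(buf):
    """Yield (field_number, wire_type, value) for each field in a message."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (string, bytes, sub-message)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unhandled wire type {wire_type}")
        yield field_number, wire_type, value


def emojis(reaction_list):
    """Return the emoji of every reaction in a serialized reaction list."""
    found = []
    for number, wire_type, value in read_fields(reaction_list):
        if number == 1 and wire_type == 2:  # assumed: one embedded reaction
            for sub_number, sub_type, sub_value in read_fields(value):
                if sub_number == 1 and sub_type == 2:  # assumed: the emoji
                    found.append(sub_value.decode("utf-8"))
    return found


blob = bytes.fromhex(
    "0a 16 0a 04 f0 9f 91 8d 10 47 18 c8 ac cf 9d 84 2e 20 c8 ac cf 9d 84 2e"
)
print(emojis(blob))  # ['👍']
```

Walking the bytes this way also shows that the key and Hex columns of the sub-object in the pasted decoder output do not line up with the input (field 2, for example, is the two bytes 10 47), while the decoded values 71 and 1581683824200 do match.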

If you need to automate this and/or do it for a large number of reaction lists, I can implement something for you. The program is capable of parsing them (and does so internally when required); there is just no user-facing function that prints the results. It should be trivial to write one, though. Let me know if you need this done. Would you want the function to be able to select the thread and/or message id? Or should it just dump all reactions in the database?
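In the meantime, a rough Python sketch of what "dump everything, optionally per thread" could look like on your side. It rests on a big assumption: that you already have the message database as a plain, unencrypted SQLite file, which this thread does not cover. The table and column names (`sms`, `thread_id`, `date`, `reactions`) are taken from the query above; `iter_reaction_blobs` is a made-up name, and the in-memory database only stands in for a real file.

```python
import sqlite3


def iter_reaction_blobs(conn, thread_id=None):
    """Yield (thread_id, date, blob) for every message that has reactions."""
    query = ("SELECT thread_id, date, reactions FROM sms "
             "WHERE reactions IS NOT NULL")
    params = ()
    if thread_id is not None:
        query += " AND thread_id = ?"  # optional filter on one thread
        params = (thread_id,)
    for row in conn.execute(query, params):
        yield row


# Stand-in for a real database file (assumption: same table layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sms (thread_id INTEGER, date INTEGER, reactions BLOB)")
sample = bytes.fromhex(
    "0a 16 0a 04 f0 9f 91 8d 10 47 18 c8 ac cf 9d 84 2e 20 c8 ac cf 9d 84 2e"
)
conn.executemany("INSERT INTO sms VALUES (?, ?, ?)", [
    (17, 1581683436188, sample),
    (17, 1581683500000, None),        # message without reactions
    (3, 1581683600000, b"\x0a\x00"),  # another thread
])

for thread, date, blob in iter_reaction_blobs(conn, thread_id=17):
    print(thread, date, blob.hex(" "))
```

Each blob that comes out can then be handed to whatever protobuf parser you end up using.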

bepaald commented 3 years ago

Just to clean up, I am closing this issue assuming my last comment has been helpful or a different solution was found. Feel free to continue posting here (or in a new issue) if you feel the need to.