bepaald / signalbackup-tools

Tool to work with Signal Backup files.
GNU General Public License v3.0

Bad MAC while trying to import from desktop backup #209

Closed. ImJustToNy closed this issue 4 months ago

ImJustToNy commented 4 months ago

First of all, thank you for this awesome application and all the work you've put into it :1st_place_medal: I have a problem when trying to merge my desktop backup with my Android backup:

[root@e2fe51767e77 cntn-signalbackup-tools]# signalbackup-tools signal-2024-04-30-23-40-46.backup "[REDACTED]" --importfromdesktop
 *** Starting log: 2024-05-04 09:06:08 *** 
signalbackup-tools (signalbackup-tools) source version 20240425.195630
BACKUPFILE VERSION: 1
BACKUPFILE SIZE: 3918876071
COUNTER: 923636432
Reading backup file: 100.0%... done!
Database version: 228
[Warning]: Foreign key constraint violated.
--------------------------------------
| table         | parent      | fkid |
--------------------------------------
| msl_recipient | msl_payload | 0    |
| msl_message   | msl_payload | 0    |
--------------------------------------
[Error]: BAD MAC! (pagenumber: 262145)
         MAC in file: (hex:) 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         Calculated : (hex:) 72 84 59 cf df c8 32 a8 52 bf bd 13 b8 0c e2 43 14 cc 9a 23 2f 73 8c 3b 9d 98 a1 61 57 24 c6 92 f6 f4 06 fe 88 51 b2 ce 09 e5 72 a3 85 e0 49 1b 5f 94 6b 4b 12 fd 95 df f6 1e c1 1d ef 7b 5a 52
[Error]: Failed to open Signal Desktop sqlite database

I tried debugging it, but I have little to no clue why this happens. Perhaps a corrupted message? Interestingly enough, the same thing happened when I tried this on my other computer, which runs the same Signal version, so I doubt that it is a problem with this particular DB. Any help appreciated.

Versions:

[root@e2fe51767e77 cntn-signalbackup-tools]# sqlite3 --version
3.45.1 2024-01-30 16:01:20 e876e51a0ed5c5b3126f52e532044363a014bc594cfefa87ffb5b82257ccalt1 (64-bit)

Fedora 40

bepaald commented 4 months ago

Hi! Thanks for your report.

This looks very strange to me. First of all, I see two unrelated problems in the output you posted:

[Warning]: Foreign key constraint violated.
--------------------------------------
| table         | parent      | fkid |
--------------------------------------
| msl_recipient | msl_payload | 0    |
| msl_message   | msl_payload | 0    |
--------------------------------------

This is a problem in the Android backup you are opening, and should not naturally occur. Has anything been done to this backup? As it is, this backup will probably not restore on Signal (though this constraint violation is easily fixed).

Then the stranger one:

[Error]: BAD MAC! (pagenumber: 262145)
         MAC in file: (hex:) 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         Calculated : (hex:) 72 84 59 cf df c8 32 a8 52 bf bd 13 b8 0c e2 43 14 cc 9a 23 2f 73 8c 3b 9d 98 a1 61 57 24 c6 92 f6 f4 06 fe 88 51 b2 ce 09 e5 72 a3 85 e0 49 1b 5f 94 6b 4b 12 fd 95 df f6 1e c1 1d ef 7b 5a 52
[Error]: Failed to open Signal Desktop sqlite database

So, in the Desktop database (which is a sqlcipher database), each page is encrypted and then a keyed checksum of this encrypted data (an HMAC, HMAC-SHA512 by default in current sqlcipher versions) is appended to the page. This checksum is the MAC. A bad MAC would normally indicate corrupted (encrypted) data, so it wouldn't have anything to do with corrupted messages (the program hasn't gotten to those at this point yet).

The MAC read from file being all 0x00's is also an indication that the file might be corrupt.

However, the fact that you have this on two Desktop databases is curious. I'm assuming these are separate instances (not actual copies)? And the error is the same (including the exact pagenumber and calculated MAC)? Could you tell me the size of the database? It's located at ~/.config/Signal/sql/db.sqlite by default. I find the pagenumber at which it fails almost too special to be a coincidence: it is after 262144 pages, each 4096 bytes, which is 1073741824 bytes (exactly 1GiB), which I think is absolutely huge for a sql database. I almost think the program is reading past the end of the file (which could also explain the all 0x00's MAC), but don't see why it would do that.
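The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes page numbers are 1-based, which matches the "after 262144 pages" reading above. Comparing the result with the actual size of db.sqlite would show whether page 262145 lies past the end of the file.

```python
PAGE_SIZE = 4096        # bytes per page, as stated above
FAILING_PAGE = 262145   # page number from the BAD MAC error (assumed 1-based)


def page_offset(page_number: int, page_size: int = PAGE_SIZE) -> int:
    """Byte offset at which a 1-based page starts."""
    return (page_number - 1) * page_size


def page_is_within_file(page_number: int, file_size: int,
                        page_size: int = PAGE_SIZE) -> bool:
    """True if the whole page lies inside a file of file_size bytes."""
    return page_offset(page_number, page_size) + page_size <= file_size


offset = page_offset(FAILING_PAGE)
print(offset)            # 262144 * 4096
print(offset == 2**30)   # is the offset exactly 1 GiB?
```

If the file on disk is exactly 1073741824 bytes, `page_is_within_file(FAILING_PAGE, 1073741824)` is false, which would fit the "reading past the end of the file" idea.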

Do these databases still work with Signal Desktop itself? Does it run without showing any errors, and can messages be sent and received normally? Since the database is a normal sqlcipher database, you could also consider, if you feel up to it, just opening it with another program (as long as the sqlcipher extension is installed). If you want to try this, the key is at ~/.config/Signal/config.json. I think there are a couple of how-tos available on the internet.
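As a rough illustration of the "open it with another program" route, the sketch below reads the key from the text of config.json and builds the raw-key PRAGMA statement that sqlcipher accepts. It rests on assumptions not stated in this thread: that the JSON field is named `key` and that it holds a hex-encoded raw key. The helper name `sqlcipher_key_pragma` is hypothetical.

```python
import json


def sqlcipher_key_pragma(config_text: str) -> str:
    """Build a sqlcipher raw-key PRAGMA from the contents of config.json.

    Assumption: the key is stored under the field name "key" as hex.
    """
    config = json.loads(config_text)
    hex_key = config["key"]
    # Raises ValueError if the value is not valid hex.
    bytes.fromhex(hex_key)
    # Raw keys are passed to sqlcipher as a blob literal: "x'<hex>'"
    return "PRAGMA key = \"x'" + hex_key + "'\";"


if __name__ == "__main__":
    # Dummy key for illustration only, not a real one.
    example = json.dumps({"key": "00" * 32})
    print(sqlcipher_key_pragma(example))
```

The resulting statement would be entered as the first command after opening db.sqlite in a sqlcipher-enabled shell or client. Work on a copy of the database, and treat the key like a password.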

Currently, I must admit, I do not know what is going on, but I'll give it some thought. I hope some of the answers to my questions will also give me a hint.

Thanks again!

bepaald commented 4 months ago

Of course, just as I hit 'comment', another thought comes to me.

Since this program never lets unencrypted bytes touch disk, the entire Desktop database is decrypted in memory. If the database is really that big (> 1GiB), maybe you're running out of RAM? Or there is some OS limitation on how much memory the process can allocate?

I will try to figure out a way to reduce the memory used when opening a Desktop database. While the Android side is quite optimized, I never gave the Desktop side much thought.

ImJustToNy commented 4 months ago

Hi! Thank you very much for your answers. A lot of information here.

The tip about opening the database via sqlcipher will come in handy. I tried to open it in DataGrip and it didn't work because I didn't realize it was encrypted. I will try to open it later. I'll keep you posted.

ImJustToNy commented 4 months ago

I've decrypted the DB and it looks all right to me. I will try to run --importfromdesktop from a Windows machine. Perhaps it is some kind of Linux limitation I'm not aware of.

bepaald commented 4 months ago

Thanks for reporting back. I've made some changes. Could you please update and try again?

If the problem persists, there should be (slightly) different output, which I'd like to see.

If the problem is fixed, and the import works, let me know if you need help fixing that foreign key constraint violation in the Android database.

Thanks!

bepaald commented 4 months ago

A little more information:

Apparently sqlite3 could under some conditions insert an empty page in the database-file. This page is all-zeros. When this happens, sqlcipher skips all encrypting and check-summing and also inserts an entire page of zeros (plus an all-zero MAC).

Currently, this is my hypothesis on what you are experiencing.

Reading the sqlcipher source, these pages can pretty much be skipped (more precisely: the empty page can be used as 'decrypted' data itself). So this is what this program now does.

However, I don't know how to get sqlite3 to actually insert an empty page in a real world scenario, so I am not able to properly test this. I did manually zero out an entire page of data using hexedit, and that seems to work with the new code. Still, even if the problem appears solved for you, I'd love for you to be extra thorough in checking if any data is missing. You could attempt --exporthtml on the merged backup to check before restoring the backup on a phone, though I understand fully checking every message is going to be hard given the size of the database.
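The handling described above can be sketched as follows. This is not the program's actual code: `decrypt_page` is a hypothetical callback standing in for the real MAC check and decryption, and the 4096-byte page size is taken from earlier in this thread. The point is only that an all-zero page is passed through unchanged instead of being verified and decrypted.

```python
from typing import Callable, List

PAGE_SIZE = 4096  # page size mentioned earlier in this thread


def is_empty_page(page: bytes) -> bool:
    """True if the page consists only of 0x00 bytes."""
    return len(page) > 0 and not any(page)


def split_pages(data: bytes, page_size: int = PAGE_SIZE) -> List[bytes]:
    """Cut the raw database file into consecutive pages."""
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]


def decrypt_database(data: bytes,
                     decrypt_page: Callable[[int, bytes], bytes],
                     page_size: int = PAGE_SIZE) -> bytes:
    """Decrypt page by page, using all-zero pages as their own plaintext."""
    out = []
    for number, page in enumerate(split_pages(data, page_size), start=1):
        if is_empty_page(page):
            # No MAC check and no decryption: the zero page is the data.
            out.append(page)
        else:
            # Hypothetical stand-in for MAC verification plus decryption.
            out.append(decrypt_page(number, page))
    return b"".join(out)
```

A test in the spirit of the hexedit experiment would feed this a file in which one page has been zeroed out and confirm that the pages around it still decrypt.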

ImJustToNy commented 4 months ago

Thank you very much! It went through, here is the log, in case you need it :)

 *** Starting log: 2024-05-04 22:01:41 *** 
signalbackup-tools (signalbackup-tools) source version 20240504.211242
BACKUPFILE VERSION: 1
BACKUPFILE SIZE: 3918876071
COUNTER: 923636432
Reading backup file: 100.0%... done!
Database version: 228
[Warning]: Foreign key constraint violated.
--------------------------------------
| table         | parent      | fkid |
--------------------------------------
| msl_recipient | msl_payload | 0    |
| msl_message   | msl_payload | 0    |
--------------------------------------
[Warning]: Found Sqlite-WAL file (write-ahead logging).
           Desktop data may not be fully up-to-date.
           Maybe Signal Desktop has not cleanly shut down?
           (pass `--ignorewal' to disable this warning)
[Error]: Failed to open Signal Desktop sqlite database
[root@49c5e15c384f cntn-signalbackup-tools]# signalbackup-tools signal-2024-04-30-23-40-46.backup "[REDACTED]" --importfromdesktop
 *** Starting log: 2024-05-04 22:02:22 *** 
signalbackup-tools (signalbackup-tools) source version 20240504.211242
BACKUPFILE VERSION: 1
BACKUPFILE SIZE: 3918876071
COUNTER: 923636432
Reading backup file: 100.0%... done!
Database version: 228
[Warning]: Foreign key constraint violated.
--------------------------------------
| table         | parent      | fkid |
--------------------------------------
| msl_recipient | msl_payload | 0    |
| msl_message   | msl_payload | 0    |
--------------------------------------
Trying to match conversation (1/5) (type: private)
 - Importing 114 messages into thread._id 2
[Warning]: Unhandled message type 'delivery-issue'. Skipping message. (this warning will be shown only once)
Trying to match conversation (2/5) (type: private)
 - Importing 68 messages into thread._id 5
Trying to match conversation (3/5) (type: private)
 - Importing 24694 messages into thread._id 1
[Warning]: Failed to get number of attachments in quoted message. Skipping
[Warning]: Failed to get number of attachments in quoted message. Skipping
[Warning]: Failed to get number of attachments in quoted message. Skipping
[Warning]: Failed to get number of attachments in quoted message. Skipping
[Warning]: Failed to get number of attachments in quoted message. Skipping
Trying to match conversation (4/5) (type: private)
 - Importing 942 messages into thread._id 11
Trying to match conversation (5/5) (type: group)
 - Importing 419876 messages into thread._id 13
[Warning]: Unsupported message type 'group-v2-change'. Skipping... (this warning will be shown only once)
reorderMmsSmsIds
updateThreadsEntries
  Dealing with thread id: 2, 4, 1, 6, 5, 8, 12, 7, 9, 10, 11, 13
Checking foreign key constraints...
[Error]: Foreign key constraint violated. This will not end well, aborting.

Please report this error to the program author.
--------------------------------------
| table         | parent      | fkid |
--------------------------------------
| msl_recipient | msl_payload | 0    |
| msl_message   | msl_payload | 0    |
--------------------------------------

My last question is about foreign key constraints violations, you've mentioned that it is easily fixed, can you please point me in the right direction? :smile:

bepaald commented 4 months ago

Excellent! Output looks good as well. That was an actual bug in my sqlcipher implementation that I did not know about. Thanks for reporting; this is how this program gets a little bit better every time.

> My last question is about foreign key constraints violations, you've mentioned that it is easily fixed, can you please point me in the right direction? 😄

So, for some reason, there appear to be entries in the msl_recipient table and the msl_message table which reference a non-existing msl_payload. We need to delete these. I think the following should do the trick:

--runsqlquery "DELETE FROM msl_message WHERE payload_id NOT IN (SELECT _id FROM msl_payload)" --runsqlquery "DELETE FROM msl_recipient WHERE payload_id NOT IN (SELECT _id FROM msl_payload)"

You could just add this to the command you used above, or you could run with just these options on the already-imported backup,* or on the Android backup before importing and then import after fixing as a second step. It shouldn't make much difference.

You will still see the warning in the beginning (right after the Android backup is opened), then somewhere during the process you'll see something like this:

 * Executing query: DELETE FROM msl_message WHERE payload_id NOT IN (SELECT _id FROM msl_payload)
Modified X rows
 * Executing query: DELETE FROM msl_recipient WHERE payload_id NOT IN (SELECT _id FROM msl_payload)
Modified X rows

And then, hopefully, the error is gone. Be sure to also add an output option (-o) to actually save the modified backup.
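To see what the two DELETE queries do without touching a real backup, here is a small self-contained sketch using Python's sqlite3 module on an in-memory database. The table layout is a toy assumption (only `_id` and `payload_id` columns), not the real Signal schema, and the function name is made up. It uses `PRAGMA foreign_key_check`, which reports rows that reference a missing parent, before and after the cleanup.

```python
import sqlite3

# The same two queries as in the --runsqlquery options above.
CLEANUP_QUERIES = {
    "msl_message": "DELETE FROM msl_message WHERE payload_id NOT IN "
                   "(SELECT _id FROM msl_payload)",
    "msl_recipient": "DELETE FROM msl_recipient WHERE payload_id NOT IN "
                     "(SELECT _id FROM msl_payload)",
}


def build_toy_database() -> sqlite3.Connection:
    """Create a toy schema with orphaned rows (assumed layout, not Signal's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE msl_payload (_id INTEGER PRIMARY KEY);
        CREATE TABLE msl_message (
            _id INTEGER PRIMARY KEY,
            payload_id INTEGER REFERENCES msl_payload(_id));
        CREATE TABLE msl_recipient (
            _id INTEGER PRIMARY KEY,
            payload_id INTEGER REFERENCES msl_payload(_id));
        INSERT INTO msl_payload (_id) VALUES (1);
        INSERT INTO msl_message (payload_id) VALUES (1), (99);
        INSERT INTO msl_recipient (payload_id) VALUES (1), (99), (100);
    """)
    return conn


def remove_orphans(conn: sqlite3.Connection) -> dict:
    """Run the cleanup queries and return the deleted row count per table."""
    return {table: conn.execute(query).rowcount
            for table, query in CLEANUP_QUERIES.items()}


conn = build_toy_database()
before = conn.execute("PRAGMA foreign_key_check").fetchall()
deleted = remove_orphans(conn)
after = conn.execute("PRAGMA foreign_key_check").fetchall()
print(len(before), deleted, len(after))
```

Rows whose payload still exists are left alone; only rows pointing at a missing msl_payload entry are removed.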

Let me know how it goes!

Thanks!

* EDIT: I suppose running it on the backup after the import is not possible, as the program will not write an output while the constraint violation is present.

ImJustToNy commented 4 months ago

Thank you very much for all your help. Everything works as expected. I've sent you a small gift via PayPal :)

bepaald commented 4 months ago

Happy to help, glad it all worked out! And thank you again for reporting this bug, and your useful comments.

Your donation is very much appreciated as well, thank you so much!