bepaald / signalbackup-tools

Tool to work with Signal Backup files.
GNU General Public License v3.0
790 stars 38 forks source link

After update of Android Signal to version 6.28.5 the --exporthtml no longer works #135

Closed hugogithubs closed 1 year ago

hugogithubs commented 1 year ago

signalbackup-tools with backup files created with Android version of Signal previous to version 6.28.5 worked perfectly.

After update of Android Signal to version 6.28.5 the --exporthtml no longer works and also --listthreads (see below) no longer works.


log for --exporthtml:

C:\signalbackup>signalbackup-tools_win.exe signal-XXX.backup XXX --exporthtml ./testoutputfolder
signalbackup-tools (signalbackup-tools_win.exe) source version 20230730.120659 (OpenSSL)
IV: (hex:) XXX (size: 16)
SALT: (hex:) XXX (size: 32)
BACKUPKEY: (hex:) XXX (size: 32)
CIPHERKEY: (hex:) XXX (size: 32)
MACKEY: (hex:) XXX (size: 32)
BACKUPFILE VERSION: 1
COUNTER: 4195006925
Reading backup file...
FRAME 30027 (100.0%)... Read entire backup file...

done!

Database version: 201
Error : During sqlite3_prepare_v2(): no such column: uuid
  Query: "SELECT _id FROM recipient WHERE uuid IN (SELECT address FROM identities WHERE identity_key IS ?)"
Error : During sqlite3_prepare_v2(): no such column: uuid
  Query: "SELECT _id FROM recipient WHERE uuid IN (SELECT address FROM identities WHERE identity_key IS ?)"
Dealing with thread 1
Error : During sqlite3_prepare_v2(): no such column: recipient.uuid
  Query: "SELECT COALESCE(NULLIF(recipient.system_display_name, ''), NULLIF(recipient.profile_joined_name, ''),NULLIF(recipient.signal_profile_name, ''), NULLIF(groups.title, ''), NULLIF(recipient.phone, ''), NULLIF(recipient.uuid, ''),  recipient._id) AS 'display_name', recipient.phone, recipient.username, recipient.uuid, NULLIF(recipient.chat_colors, ''),recipient.group_id, recipient.color, recipient.wallpaper FROM recipient LEFT JOIN groups ON recipient.group_id = groups.group_id WHERE recipient._id = ?"
Error : During sqlite3_prepare_v2(): no such column: recipient.uuid
  Query: "SELECT COALESCE(NULLIF(recipient.system_display_name, ''), NULLIF(recipient.profile_joined_name, ''),NULLIF(recipient.signal_profile_name, ''), NULLIF(groups.title, ''), NULLIF(recipient.phone, ''), NULLIF(recipient.uuid, ''),  recipient._id) AS 'display_name', recipient.phone, recipient.username, recipient.uuid, NULLIF(recipient.chat_colors, ''),recipient.group_id, recipient.color, recipient.wallpaper FROM recipient LEFT JOIN groups ON recipient.group_id = groups.group_id WHERE recipient._id = ?"
Dealing with thread 2
Error : During sqlite3_prepare_v2(): no such column: recipient.uuid
  Query: "SELECT COALESCE(NULLIF(recipient.system_display_name, ''), NULLIF(recipient.profile_joined_name, ''),NULLIF(recipient.signal_profile_name, ''), NULLIF(groups.title, ''), NULLIF(recipient.phone, ''), NULLIF(recipient.uuid, ''),  recipient._id) AS 'display_name', recipient.phone, recipient.username, recipient.uuid, NULLIF(recipient.chat_colors, ''),recipient.group_id, recipient.color, recipient.wallpaper FROM recipient LEFT JOIN groups ON recipient.group_id = groups.group_id WHERE recipient._id = ?"
Dealing with thread 3
Error : During sqlite3_prepare_v2(): no such column: recipient.uuid
  Query: "SELECT COALESCE(NULLIF(recipient.system_display_name, ''), NULLIF(recipient.profile_joined_name, ''),NULLIF(recipient.signal_profile_name, ''), NULLIF(groups.title, ''), NULLIF(recipient.phone, ''), NULLIF(recipient.uuid, ''),  recipient._id) AS 'display_name', recipient.phone, recipient.username, recipient.uuid, NULLIF(recipient.chat_colors, ''),recipient.group_id, recipient.color, recipient.wallpaper FROM recipient LEFT JOIN groups ON recipient.group_id = groups.group_id WHERE recipient._id = ?"
Dealing with thread 4
Dealing with thread 5
Error : During sqlite3_prepare_v2(): no such column: recipient.uuid
  Query: "SELECT COALESCE(NULLIF(recipient.system_display_name, ''), NULLIF(recipient.profile_joined_name, ''),NULLIF(recipient.signal_profile_name, ''), NULLIF(groups.title, ''), NULLIF(recipient.phone, ''), NULLIF(recipient.uuid, ''),  recipient._id) AS 'display_name', recipient.phone, recipient.username, recipient.uuid, NULLIF(recipient.chat_colors, ''),recipient.group_id, recipient.color, recipient.wallpaper FROM recipient LEFT JOIN groups ON recipient.group_id = groups.group_id WHERE recipient._id = ?"

log for --listthreaDS C:\signalbackup>signalbackup-tools_win.exe signal-XXX.backup YYY --listthreads

signalbackup-tools (signalbackup-tools_win.exe) source version 20230730.120659 (OpenSSL)
IV: (hex:) XXX (size: 16)
SALT: (hex:) XXX (size: 32)
BACKUPKEY: (hex:) XXX (size: 32)
CIPHERKEY: (hex:) XXX (size: 32)
MACKEY: (hex:) XXX (size: 32)
BACKUPFILE VERSION: 1
COUNTER: 4195006925
Reading backup file...
FRAME 30027 (100.0%)... Read entire backup file...

done!
Database version: 201
Database version: 201
---------------------------------
| Min Date      | Max Date      |
---------------------------------
| 1595892449601 | 1691766161054 |
---------------------------------
Error : During sqlite3_prepare_v2(): no such column: recipient.phone
  Query: "SELECT thread._id, COALESCE(recipient.phone, recipient.group_id) AS 'recipient_ids', thread.snippet, COALESCE(recipient.system_display_name, recipient.profile_joined_name,recipient.signal_profile_name, groups.title) AS 'Conversation partner' FROM thread LEFT JOIN recipient ON thread.recipient_id = recipient._id LEFT JOIN groups ON recipient.group_id = groups.group_id ORDER BY thread._id ASC"
---------------------------------
| Min Date      | Max Date      |
---------------------------------
| 1595892449601 | 1691766161054 |
---------------------------------
bepaald commented 1 year ago

Thanks for reporting! A few columns seem to have been renamed recently, hopefully that's all that has changed. I'll try to get things updated shortly (either today, or tomorrow).

hugogithubs commented 1 year ago

I forgot to mention that I also previously deleted some oldest media attachments in Signal App -> Media Overview of a chat. I don't know if it can also have something to do with that.

bepaald commented 1 year ago

I forgot to mention that I also previously deleted some oldest media attachments in Signal App -> Media Overview of a chat. I don't know if it can also have something to do with that.

I don't think that should have any effect on this programs functionality.

I have just pushed a fix for this issue. It's possible I missed some places, but I think it should be good. Let me know if it works for you!

hugogithubs commented 1 year ago

As far as I've used it now (--exporthtml) it works perfectly with no problems! I am definitely impressed that you responded so quickly and resolved my issue so quickly. Thank you for your quick reaction. I've just started some days ago using your project and haven't even looked at the sources yet. What surprises me is that "--dbinfo" was identical on the old and new files. Was this change to the db definition come from Signal foundation?

kohms commented 1 year ago

I can absolutely second that, quite impressive in which detail you and only a few contributiors keep this project current and working. I was looking for a good solution to archive and offload my signal conversations and found no good native answer from the signal project. As much as I like their idea of privacy, I found that missing export feature quite limiting, as I share family pictures over signal groups and want to keep them long term. I store them not in the cloud and just need a format which is viewable in let's say 10 years from now, potentially longer. I found other projects which were able to decrypt the protobuf file which has a bit of complexity already. But also parsing the sqlite db and rendering a nice HTML page is also not that trivial. I used to do that on my own in the past, but every few month when I dumped a new backup, I needed to adjust the statements. I now switched entirely to using your project and the results look way better than my own, also with features like pagination and the overview pages. Just wanted to say thank you and keep up with your great work 🙂 There is one thing which I would love to see to implement 100% of my use cases, which would be a full text search, but it's not that huge of a problem, as I can grep for text, but not from within a browser, if I have pagination and multiple exports. But potentially that could mean running a backend or implement a search in Javascript... So I can fully understand if that is out of scope here.

bepaald commented 1 year ago

Thank you both for your kind words!

I do always try to keep the program up-to-date. Sometimes I will have things in order before the changes are released since I run the beta on a couple of phones. This time the changes caught me off guard (which honestly is not that rare either). I really can't stand it if I know things aren't working so if I get a report like this and I have some time, I tend to fix it quickly.

But, if this happens in a busier time for me, it could happen that (some of) the functionality is broken for a few days. But today was sunday and I had nothing planned.

What surprises me is that "--dbinfo" was identical on the old and new files.

There were many changes in the recipient table. If you check --showdbinfo, the current version would list recipient|aci where the previous version had recipient|uuid, among others. Signal changes the names of columns in the tables quite regularly, and the way they have written the program it is a small change for them. But because I try to stay backwards compatible with all previous database versions (mostly) and did not think about changing column names when I first started writing this program, it is a bit of a hassle for me when they do. But so far it works. I do often try to think of some design where changing column names would be easier, but haven't quite figured it out yet.

There is one thing which I would love to see to implement 100% of my use cases, which would be a full text search, but it's not that huge of a problem, as I can grep for text, but not from within a browser, if I have pagination and multiple exports. But potentially that could mean running a backend or implement a search in Javascript... So I can fully understand if that is out of scope here.

Obviously, any browser will have proper text search built in. But when paginating, this obviously doesn't work. I'm not even sure how (or if) this would be possible with HTML and javascript, but my knowledge is limited. It is an interesting problem, though, I'll do some research. (but don't hold your breath)

Thanks again!

kohms commented 1 year ago

Cool, thanks :-) Just as a pointer, static site generators sometimes rely on such a search function (e.g.: https://gohugo.io/tools/search/). The project https://lunrjs.com/ seem to offer the right tools for the Javascript based search, but the generation of an index would still be necessary and is dependent on the individual page. https://lunrjs.com/guides/getting_started.html#creating-an-index I guess for bigger archives it would be necessary to serialize the index as JSON upfront, looks a bit cleaner to me anyway: https://lunrjs.com/guides/index_prebuilding.html

As a side note (might be different for anybody), my backup files tend to be around 7 - 10GB before I trim them back to 30 days of history. The size is mostly due to images and videos, but I have daily conversations with multiple groups, on my last 7,6GB backup file the exported sqlite db was 43,3MB vs 7,6GB of media files. The conversation history contained 15 month of data. I tried to find a sweet spot between required pagination to not crash the browser and convenient in-browser search, I don't really now how long lunrjs would perform, but I guess about 50MB could work, but I never tried so far. Probably the most scalable solution would be something like a sqlite or even postgresql DB which can be filled by a backend and display the data dynamically. I could imagine something like a signal message viewer which would offer the backend and "simple" frontend and could import data processed by signalbackup-tools. The great thing would be, that you don't have a number of archives, only one single point to enter and to e.g. select the time frame you are interested in. But yeah... that's probably a way bigger undertaking than dumping HTML, Javascript and JSON files to disk, especially as the design would require that much flexibility to adopt to changes made by signal.

bepaald commented 1 year ago

I've been playing with building a search database and some javascript to search it. Though not lunrjs or any other external js, just something written myself. It will be simpler (no fuzzy matching and such, but regex will work), but I do plan om implementing something. No time frame though, could still be a while. The feature could always be extended and improved if it is working.

I'll post back here when i have something to test.

bepaald commented 1 year ago

@kohms Maybe you want to chime in at #141? I don't think it's exactly what you asked for, but I'm curious if it meets (some of) your needs.

kohms commented 1 year ago

Cool, thanks for trying, I'll have a look into that and directly reply in https://github.com/bepaald/signalbackup-tools/issues/141 🙂