bepaald / signalbackup-tools

Tool to work with Signal Backup files.
GNU General Public License v3.0

Delete unused contacts option #199

Closed Meteor0id closed 5 months ago

Meteor0id commented 5 months ago

I feel bad about opening so many feature requests but I thought of another thing I would use. Currently Signal does not delete contacts, as we discussed previously in the feature request for listing all contacts. Maybe it would be easy to cross check with the Signal Android backup and the desktop database if any of the found contacts are orphaned, and delete those entries from the backup file. Is that an option the tool could support, for example --deleteOrphanedContacts? By clearing all data on Android and importing from the modified backup, you would clean up any old references on your phone. (Desktop would require deleting all data and re-linking to get a cleaned-up client.)

I am not sure why Signal isn't deleting orphaned contacts by default yet but it sure would be nice not to carry years of history around in my pocket.

bepaald commented 5 months ago

> I feel bad about opening so many feature requests but I thought of another thing I would use.

No worries.

> Maybe it would be easy to cross check with the Signal Android backup and the desktop database if any of the found contacts are orphaned

It is very much not easy to do this, unfortunately. Recipients can be referenced in the database in many places, and in many different ways (usually by _id, but also by their phone numbers or by uuid/aci, often in plain text, but sometimes as a binary blob or packed in a protocol buffer or encoded in base64 (or a combination thereof)).
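To illustrate why one identifier can hide in several forms, here is a minimal Python sketch that scans a toy SQLite database for a single UUID stored as plain text, as a raw 16-byte blob (also the form it takes inside a protocol buffer bytes field), or as base64. This is not how --findrecipient is implemented; the schema, the `message` table, and the helper names are assumptions made up for the example.

```python
import base64
import sqlite3
import uuid


def needle_variants(aci):
    """Return several encodings under which one UUID might be stored."""
    raw = uuid.UUID(aci).bytes  # 16-byte binary form
    return [
        aci,                                    # plain text
        raw,                                    # binary blob / protobuf bytes field
        base64.b64encode(raw).decode("ascii"),  # base64 of the binary form
    ]


def contains(value, needle):
    """Substring test that tolerates TEXT/BLOB mismatches."""
    if value is None or isinstance(value, (int, float)):
        return False
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(needle, str):
        needle = needle.encode("utf-8")
    return needle in value


def find_references(conn, aci):
    """Scan every column of every table for any variant of the UUID."""
    needles = needle_variants(aci)
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            for (value,) in conn.execute(f'SELECT "{column}" FROM "{table}"'):
                if any(contains(value, n) for n in needles):
                    hits.append((table, column))
                    break  # one hit per column is enough for this sketch
    return hits


# Toy database; the real Signal schema is larger and changes between versions.
ACI = "123e4567-e89b-12d3-a456-426614174000"
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipient (_id INTEGER PRIMARY KEY, aci TEXT);
    CREATE TABLE identities (address TEXT);
    CREATE TABLE message (_id INTEGER PRIMARY KEY, body TEXT, payload BLOB);
""")
conn.execute("INSERT INTO recipient VALUES (1, ?)", (ACI,))
conn.execute("INSERT INTO identities VALUES (?)", (ACI,))
# Hypothetical protobuf-style payload: field tag, length 16, then the raw UUID.
conn.execute("INSERT INTO message VALUES (1, 'hello', ?)",
             (b"\x0a\x10" + uuid.UUID(ACI).bytes,))

print(find_references(conn, ACI))
```

Even this sketch misses cases: a UUID inside a base64-encoded protocol buffer, for instance, would not match any of the three variants, which is part of why an exhaustive search is hard.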

As an alternative to --deleteOrphanedContacts, the program supports --findrecipient [N], which attempts to list all the places a recipient is found in the database. If this function works correctly and the recipient is not found anywhere, it should be safe to delete them with --runsqlquery "DELETE FROM recipient WHERE _id = [N]".

Generally, I believe all recipients have an entry in the identities table, which should probably then also be deleted (usually: "DELETE FROM identities WHERE address IS (SELECT aci FROM recipient WHERE _id = [N])", but that will differ depending on the age of the backup file). If the recipient is referenced in more places, you could of course start deleting more and more, but, just like the recipients themselves, anything else you delete might be referenced by other database entries, which will then point to deleted data.
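As a rough illustration of how the two DELETE queries quoted above interact, the following Python sketch runs them against a toy in-memory SQLite database. The two-table schema is an assumption reduced to the columns the queries mention, and `delete_recipient` is a hypothetical helper, not part of signalbackup-tools.

```python
import sqlite3


def delete_recipient(conn, recipient_id):
    """Delete one recipient and its identities row in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        # Order matters: the identities query looks up the aci in recipient,
        # so it has to run while the recipient row still exists.
        conn.execute(
            "DELETE FROM identities WHERE address IS "
            "(SELECT aci FROM recipient WHERE _id = ?)",
            (recipient_id,),
        )
        conn.execute("DELETE FROM recipient WHERE _id = ?", (recipient_id,))


# Toy data; real backups have many more tables and version-dependent columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipient (_id INTEGER PRIMARY KEY, aci TEXT);
    CREATE TABLE identities (address TEXT);
    INSERT INTO recipient VALUES (1, 'aci-one'), (2, 'aci-two');
    INSERT INTO identities VALUES ('aci-one'), ('aci-two');
""")

delete_recipient(conn, 2)
remaining_recipients = conn.execute("SELECT _id FROM recipient").fetchall()
remaining_identities = conn.execute("SELECT address FROM identities").fetchall()
print(remaining_recipients, remaining_identities)
```

One detail worth noting if the queries are run by hand: because the identities query reads the aci from the recipient table, running it after the recipient row is already deleted would leave the identities entry behind.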

If Signal encounters a reference to a nonexistent recipient (or any other nonexistent data), it will crash (or the backup will fail to restore at all).
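The simplest kind of dangling reference, a plain _id pointing at a deleted row, can be checked with a LEFT JOIN. The sketch below assumes a hypothetical `message` table with a `from_recipient_id` column; real table and column names depend on the database version, and references stored as phone numbers, UUIDs, or encoded blobs would not be caught this way.

```python
import sqlite3


def dangling_references(conn, table, column):
    """Return (rowid, value) pairs whose value matches no recipient._id."""
    query = (
        f'SELECT t.rowid, t."{column}" FROM "{table}" AS t '
        f'LEFT JOIN recipient AS r ON r._id = t."{column}" '
        f'WHERE t."{column}" IS NOT NULL AND r._id IS NULL'
    )
    return conn.execute(query).fetchall()


# Toy schema: two recipients, one message from each (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipient (_id INTEGER PRIMARY KEY, aci TEXT);
    CREATE TABLE message (_id INTEGER PRIMARY KEY, from_recipient_id INTEGER);
    INSERT INTO recipient VALUES (1, 'aci-one'), (2, 'aci-two');
    INSERT INTO message VALUES (10, 1), (11, 2);
""")

before = dangling_references(conn, "message", "from_recipient_id")
conn.execute("DELETE FROM recipient WHERE _id = 2")  # delete without cleanup
after = dangling_references(conn, "message", "from_recipient_id")
print(before, after)
```

A check like this would have to be repeated for every table and column that can point at a recipient, which is the hard part.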

I personally do not fully trust the --findrecipient function to work flawlessly, and definitely do not recommend attempting this on important data. I believe it is much too error-prone for too little gain (no gain at all?). So I will not be creating an option for this; I'd rather leave it as a manual thing for users who (think they) know what they are doing.

I do occasionally make improvements to the --findrecipient function (I updated it just now, in fact), but I think that's as far as I'll go. I'm going to close this issue, but if you have more questions or remarks, please feel free to post them.

Thanks!

(ref: #114)