GjjvdBurg / signal2html

Export a Signal backup to pretty HTML
MIT License
111 stars 15 forks

Missing Conversations in html export folder #48

Closed soyasis closed 3 years ago

soyasis commented 3 years ago

I am experiencing missing conversations from the backup, and I get several of the following errors when running signal2html:

Couldn't find attachment '/signal-backup/Attachment_978_1552816065840.bin'. Maybe it was deleted or never downloaded.

Followed by this traceback:

Traceback (most recent call last):
  File "/usr/local/bin/signal2html", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/signal2html/__main__.py", line 15, in main
    sys.exit(realmain())
  File "/usr/local/lib/python3.9/site-packages/signal2html/ui.py", line 35, in main
    process_backup(args.input_dir, args.output_dir)
  File "/usr/local/lib/python3.9/site-packages/signal2html/core.py", line 672, in process_backup
    dump_thread(t, output_dir)
  File "/usr/local/lib/python3.9/site-packages/signal2html/html.py", line 268, in dump_thread
    body = format_message(body, thread.mentions.get(msg._id))
  File "/usr/local/lib/python3.9/site-packages/signal2html/html.py", line 72, in format_message
    mention = mentions.get(i)
AttributeError: 'NoneType' object has no attribute 'get'

Could this be an issue with the signalbackup-tools executable? The mentioned files do not seem to have been extracted to the signal_backup folder. Has anybody else experienced similar issues? If so, I will open the issue there (my first-ever issue on GitHub!).

--> This is such a great project for archiving chats, thanks for the awesome work!!!! :)

GjjvdBurg commented 3 years ago

Hi @plasticfruits, thanks for raising this issue and glad to hear you like the project!

The first warning you get is unrelated to the error. Sometimes we can't find an attachment (if it's been deleted for instance) but that only gives a warning and is usually not a problem. I also don't think this is an issue with signalbackup-tools, as they simply decrypt and extract the backup, whereas the error seems to be in what we do with the database.

I've made an attempt at fixing it in #49. Perhaps you can test to see if this solves it for you? You can install the version from #49 using:

pip install -U git+https://github.com/GjjvdBurg/signal2html.git@bugfix/missing_mentions

P.S.: Well done on your first github issue! 👍
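
For readers following along, the AttributeError in the traceback is what Python raises when `.get` is called on `None` instead of a dict. The snippet below is only a minimal sketch of the general guard against that; it is not the code from #49 or from signal2html. The function name `format_message_safe` and the assumption that mentions map a character position to a display name are hypothetical.

```python
from typing import Dict, Optional


def format_message_safe(
    body: Optional[str], mentions: Optional[Dict[int, str]] = None
) -> str:
    """Insert mention names into a message body, tolerating mentions=None.

    Hypothetical helper: assumes `mentions` maps a character position in
    `body` to the display name that should replace that character.
    """
    if body is None:
        return ""
    # Treat a missing mentions mapping as "no mentions" instead of crashing.
    mentions = mentions or {}
    parts = []
    for i, ch in enumerate(body):
        name = mentions.get(i)
        parts.append(f"@{name}" if name is not None else ch)
    return "".join(parts)
```

With this guard, a message whose mentions lookup returned `None` is passed through unchanged rather than stopping the export.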

ericthegrey commented 3 years ago

Hi, did you by any chance also get an error message as follows: Failed to load quote mentions for message (NNN)...

Thanks

soyasis commented 3 years ago

Hi, I only got the above-mentioned warnings and traceback, no other warnings.

soyasis commented 3 years ago

Thanks for the quick response @GjjvdBurg! :) I was checking the fix in #49 and I realised I was probably not clear enough explaining the issue (apologies for that!).

Issue: After running signal2html there are missing contacts (i.e. folder with contact name) inside the signal_html folder. That is, no contact folder, no .html file nor attachments present.

Other unexpected behaviour: When running signal2html on a different backup from the same day, it extracted a few (5 or 6) new contacts that had not been extracted from the other backup, while others were missing.

Let me know if there is anything else I can do to support. I will try to run it in debug mode later to see if I get anything helpful from the logs.

Thanks!!! ☺️

GjjvdBurg commented 3 years ago

Hi @plasticfruits,

You're indeed describing two different issues! :D Presumably the fix in #49 solves the traceback that you're getting?

> After running signal2html there are missing contacts (i.e. folder with contact name) inside the signal_html folder. That is, no contact folder, no .html file nor attachments present.

Do these contacts have messages in Signal? Contacts without any messages are not exported.

If you are able to do some debugging, you could check the database.sqlite file in the raw backup and see if you can find the messages from the contacts that are not exported.
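
A check along these lines can be scripted with Python's built-in sqlite3 module. This is a rough sketch, not part of signal2html. It assumes the backup database has a `thread` table with `_id` and `message_count` columns and an `sms` table with a `thread_id` column; the function name `threads_with_messages` is made up for illustration, and the schema may differ between Signal versions.

```python
import sqlite3


def threads_with_messages(conn, min_count=1):
    """List threads that claim at least `min_count` messages.

    Returns (thread_id, declared_message_count, sms_rows_found) tuples.
    Assumed schema (may not match every backup): thread(_id, message_count)
    and sms(_id, thread_id).
    """
    query = """
        SELECT t._id, t.message_count, COUNT(s._id)
        FROM thread AS t
        LEFT JOIN sms AS s ON s.thread_id = t._id
        WHERE t.message_count >= ?
        GROUP BY t._id
        ORDER BY t._id
    """
    return conn.execute(query, (min_count,)).fetchall()
```

For example, `threads_with_messages(sqlite3.connect("signal_backup/database.sqlite"), min_count=10)` would list the busier threads, which can then be compared against the folders in the HTML output. A zero in the last column does not prove a thread is empty, since this sketch only looks at the `sms` table.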

soyasis commented 3 years ago

Hi @GjjvdBurg ,

> If you are able to do some debugging, you could check the database.sqlite file in the raw backup and see if you can find the messages from the contacts that are not exported.

I checked database.sqlite and there are multiple missing contacts that do have messages (I filtered for threads with message_count > 10 and also hand-checked one contact's messages in the sms table to make sure they are there).

I also realised that some groups are missing, but I could not find a table in the db for group messages where I could check that they are not empty.

> Presumably the fix in #49 solves the trace back that you're getting?

Unfortunately I got the same traceback after downloading the fix. Just to double check, I ran pip install -U git+https://github.com/GjjvdBurg/signal2html.git@bugfix/missing_mentions followed by signal2html -i signal_backup/ -o signal_html/ --> is that correct?

Let me know if there is something else I can support with, good chance for me to learn! ☺️

GjjvdBurg commented 3 years ago

Hi @plasticfruits,

Apologies for the slow response, I was away for a bit.

I'm still not quite sure I understand why you're getting this error. There's essentially nothing in the code that stands out to me as a possible cause: we get the threads, populate them with messages, and then write each one to HTML unless it has no messages.
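
The flow described above (collect threads, fill them with messages, write each non-empty thread to HTML) can be pictured roughly as follows. This is an illustrative sketch only; the `Thread` class and `render_threads` function are hypothetical and do not reflect signal2html's actual code.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Dict, List


@dataclass
class Thread:
    """Hypothetical stand-in for a conversation: a name plus message bodies."""
    name: str
    messages: List[str] = field(default_factory=list)


def render_threads(threads: List[Thread]) -> Dict[str, str]:
    """Return {filename: html} for every thread that has messages."""
    pages = {}
    for thread in threads:
        if not thread.messages:
            # Threads without messages are skipped, so no file is produced.
            continue
        paragraphs = "".join(f"<p>{escape(m)}</p>" for m in thread.messages)
        pages[f"{thread.name}.html"] = f"<html><body>{paragraphs}</body></html>"
    return pages
```

In a loop like this, an unhandled exception while rendering one thread stops the loop, so later threads are never written. Whether that explains the missing conversations here is a guess; the thread does not confirm it.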

Just to double check:

I'm surprised the fix in #49 didn't work for you, even though what you're posting does look correct. I've just updated the package with this fix, so perhaps you can try it again as follows:

pip install -U signal2html                     # update the package
signal2html -V                                 # show the version number (should be v0.2.6)
signal2html -i signal_backup/ -o signal_html/
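
If it is unclear which version pip actually installed, the installed package metadata can also be read from Python. This is a small sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

Calling `installed_version("signal2html")` in the same Python environment that runs the tool shows whether the update took effect; `None` means the package is not installed in that environment.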

soyasis commented 3 years ago

Hi @GjjvdBurg,

I've also been away for a bit, hope you had a nice time out! :-)

Great news, after updating this time it worked! I must not have updated correctly before. Thanks for following up, it's really a fantastic and very valuable tool that you have here! 🥇 😃

GjjvdBurg commented 3 years ago

Glad to hear it works now @plasticfruits!