carderne / signal-export

Export your Signal chats to markdown files with attachments

No index.md for some contacts #13

Closed MiningTalent closed 2 years ago

MiningTalent commented 3 years ago

I export my Signal conversations once a week. However, I noticed that for 'new' contacts no index.md file is generated, although the media folder contains media from the respective conversations.

How can that be? I do not 'overwrite' the old export. I just switch to the respective folder in an Anaconda virtual environment terminal and then export the conversations with your command './sigexport.py Outputdir'.

Thanks in advance :)

carderne commented 3 years ago

Hi @MiningTalent, I've just added some additional logging to the script.

Could you try running with the verbose flag, e.g.:

./sigexport.py outputdir --verbose

You will get a huge output, but you should be able to tell something from that (and share a redacted version if useful). The output is indented as follows:

main task
    doing that task for a person/conversation
        doing an individual message/attachment

You want to see what gets printed after Creating markdown files to see what's going wrong. (It might be easier to pipe the output to a text file, e.g. ./sigexport.py outputdir --verbose > log.txt, so you can scan it more easily!)

MiningTalent commented 3 years ago

Hi @carderne, thanks for your support.

Unfortunately, I couldn't tell from the log.txt file why no index.md is created for some contacts. I just get:

Fetching data from /Users/"user"/Library/Application Support/Signal/sql/db.sqlite
    Loading SQL results for: "Respective Contact*s"
    ...
Copying and renaming attachments
    Copying attachments for: "RespectiveContacts"
    ...

(In this section, spaces and other special characters are "deleted" from the names, as shown above.)

Creating markdown files
    -> and now the "respective_contact" is not listed in this section...

These 3 sections are present in the log.txt file. So, all in all, your script fetches the data from the respective folder and copies and renames the attachments, but the last step of creating markdown files is skipped for some contacts.
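For anyone scanning a large log.txt by hand, the comparison described above can be sketched in Python. This is only a rough sketch under assumptions: it relies on the indentation scheme described earlier (unindented main task, 4-space per-conversation lines), it assumes the name is whatever follows the last ": " on a per-conversation line, and it strips spaces and special characters so that names compare across sections. The function names are made up for illustration and are not part of sigexport.py.

```python
import re
from collections import defaultdict


def names_by_section(log_text):
    """Group per-conversation names under their unindented section heading."""
    sections = defaultdict(set)
    current = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent == 0:
            # Unindented line: a main task such as "Creating markdown files".
            current = line.strip()
        elif indent == 4 and current is not None and ": " in line:
            # Assumption: the name is the text after the last ": ".
            name = line.rsplit(": ", 1)[1]
            # Remove spaces/special characters so names match across sections.
            sections[current].add(re.sub(r"\W+", "", name).lower())
        # Deeper-indented lines (individual messages/attachments) are ignored.
    return sections


def names_under(sections, heading_prefix):
    """Collect names from every section whose heading starts with the prefix."""
    found = set()
    for heading, names in sections.items():
        if heading.startswith(heading_prefix):
            found |= names
    return found


def missing_from_markdown(log_text):
    """Names seen while copying attachments but not while creating markdown."""
    sections = names_by_section(log_text)
    copied = names_under(sections, "Copying and renaming attachments")
    written = names_under(sections, "Creating markdown files")
    return sorted(copied - written)
```

Calling missing_from_markdown(open("log.txt").read()) would then list the normalized names that show up in the copying section but not in the markdown section, if the log lines really follow the assumed shape.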

Hope this helps you solve my problem. Thanks again in advance for your effort! :)

carderne commented 3 years ago

I'm a bit baffled... So the conversation appears in the Fetching data and Copying and renaming sections, but not in the Creating markdown bit. I wouldn't have thought that possible, as there is no modification to the contacts/conversations objects between those, and they loop in the exact same way (see below).

I've just made a small change, would you mind downloading the latest code and trying again?

If that doesn't work, I might ask you to edit the script locally and add some more print statements so we can see what's up...

https://github.com/carderne/signal-export/blob/a474b9f6233b84040bd60611ffb864ac7084237d/sigexport.py#L84-L91

https://github.com/carderne/signal-export/blob/a474b9f6233b84040bd60611ffb864ac7084237d/sigexport.py#L37-L46

MiningTalent commented 3 years ago

Thanks again. Unfortunately, it didn't work out again... Where should I put some print statements? Any suggestions?

carderne commented 3 years ago

Ok. We want to rule out some weirdness happening between these functions. Go to line 563:

https://github.com/carderne/signal-export/blob/a474b9f6233b84040bd60611ffb864ac7084237d/sigexport.py#L560-L565

And add print statements around it as follows:

diff --git a/sigexport.py b/sigexport.py
index 3cf1151..4e37d88 100755
--- a/sigexport.py
+++ b/sigexport.py
@@ -560,7 +560,12 @@ def main(
     convos, contacts = fetch_data(db_file, key, manual=manual, chat=chat)
     contacts = fix_names(contacts)
     print("\nCopying and renaming attachments")
+    from pprint import pprint
+    with open("contacts-before.txt", "w") as f: pprint(contacts, f)
+    with open("convos-before.txt", "w") as f: pprint(convos, f)
     copy_attachments(src, dest, convos, contacts)
+    with open("contacts-after.txt", "w") as f: pprint(contacts, f)
+    with open("convos-after.txt", "w") as f: pprint(convos, f)
     print("\nCreating markdown files")
     make_simple(dest, convos, contacts)

Then run it and you'll get before and after files for convos and contacts, which you can compare in a diffing tool, e.g. vimdiff convos-before.txt convos-after.txt, and the same for contacts (if you're on Linux just install vim and vimdiff comes along for free, otherwise use an online tool I guess...).

When I do this, the only difference in convos is that the attachment filenames have changed, and contacts is identical before and after. Let me know what you see!
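If no diffing tool is at hand, the same comparison can be approximated with Python's standard difflib module. This is just a sketch: the helper name changed_lines is made up here, and on very large dumps SequenceMatcher may be slow.

```python
import difflib
from pathlib import Path


def changed_lines(before_path, after_path):
    """Return (tag, before_lines, after_lines) for every non-identical block."""
    before = Path(before_path).read_text().splitlines()
    after = Path(after_path).read_text().splitlines()
    # autojunk=False so frequently repeated lines are not treated as junk.
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is "replace", "delete" or "insert".
            changes.append((tag, before[i1:i2], after[j1:j2]))
    return changes
```

For example, changed_lines("contacts-before.txt", "contacts-after.txt") returning an empty list would mean the two dumps are identical line for line.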

MiningTalent commented 3 years ago

OK, I see the same with my export. I get these stats for contacts:

Description between Files 1 and 2
Text        Blocks  Lines
Unchanged   1       2086
Changed     0       0
Inserted    0       0
Removed     0       0

and these stats for attachment filenames:

Description between Files 1 and 2
Text        Blocks  Lines
Unchanged   164     157564
Changed     163     383
Inserted    0       0
Removed     0       0

I think it's the same as what you see when you perform the export... :-/

carderne commented 3 years ago

I don't quite understand your message, but I think you're saying that you get the same result as me, i.e. the before and after are basically the same?

So the conversations and contacts are essentially unchanged, but nonetheless there are some contacts/conversations that do appear in the ...-after.txt files, but don't appear in the Creating markdown portion of the log output...

One last thing to try. Please delete the print statements you added above and add the following in the make_simple() function.

diff --git a/sigexport.py b/sigexport.py
index 3cf1151..0ac50a5 100755
--- a/sigexport.py
+++ b/sigexport.py
@@ -84,6 +84,9 @@ def copy_attachments(src, dest, conversations, contacts):
 def make_simple(dest, conversations, contacts):
     """Output each conversation into a simple text file."""

+    from pprint import pprint
+    with open("contacts-ms.txt", "w") as f: pprint(contacts, f)
+    with open("convos-ms.txt", "w") as f: pprint(conversations, f)
     dest = Path(dest)
     for key, messages in conversations.items():
         name = contacts[key]["name"]

Presumably the excluded contact will appear in contacts-ms.txt but not in the log output under Creating markdown? Please find the contact name in contacts-ms.txt and match its id/label (something like 36c34e82-692a-2a8a-ab56-238cd942ab3d) with the conversation in convos-ms.txt. If you find it there but not in the Creating markdown log output, then I'm afraid your computer is being bombarded with neutrinos or something, because the statement to print the log output is just three lines later...
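The matching step can also be done in code rather than by eye. The sketch below assumes only what the loop in make_simple() suggests: contacts maps an id to a dict with a "name" entry, and conversations maps the same ids to a list of messages. The helper locate_contact is hypothetical, not part of sigexport.py.

```python
def locate_contact(contacts, conversations, name_fragment):
    """Report whether contacts matching a name fragment have a conversation."""
    fragment = name_fragment.lower()
    report = []
    for key, contact in contacts.items():
        # Names may be missing or None, so fall back to an empty string.
        name = str(contact.get("name") or "")
        if fragment in name.lower():
            messages = conversations.get(key)
            report.append(
                {
                    "id": key,
                    "name": name,
                    "in_conversations": key in conversations,
                    "message_count": len(messages) if messages is not None else 0,
                }
            )
    return report
```

Placed temporarily at the top of make_simple(), something like pprint(locate_contact(contacts, conversations, "part of the name")) would show whether the contact's id is present in conversations at all, and with how many messages.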

(While we're here please let me know your OS and Python version.)