carderne / signal-export

Export your Signal chats to markdown files with attachments

--old flag is blowing up on duplicate attachments #74

Closed elizwillinger closed 2 years ago

elizwillinger commented 2 years ago


Describe the bug: This may be related to issue #64. The use case is that some of my conversations are set to disappear messages after an interval, and sometimes I want to save important bits that shouldn't disappear. So the goal is to merge the new, soon-to-disappear messages with the old ones that were exported and saved but have already disappeared from the Signal DB, along with all the chats that don't use disappearing messages, which may have attachments that need to be deduplicated. It may be that this use case is impossible as currently implemented; I don't know.

When I run sigexport with the appropriate flags (see reproduction steps), it blows up with this error message:

'signal-chats/brochat/media/2022-01-14T21-03-59.741_00_None.jpeg' are the same file

The stack trace is attached below. Please note that it looks like it dumped an entire copy of the Signal DB as JSON to the console and completely overflowed my scrollback buffer. I can't go back in the output to see if anything preceded it, and I'm not sure how to redirect the output to a log file. sigexport foo >>./logfile.txt didn't work.

sigexport stack.txt

To reproduce

Steps to reproduce the behavior. Please include the exact commands tried.

  1. sigexport ./signal-chats-new --source /mnt/c/Users/Eli/AppData/Roaming/Signal/ --old ./signal-chats --paginate=0 --overwrite


carderne commented 2 years ago

So it looks like this is simply caused by duplicate attachments. The filenames are specific enough that it should be safe (??) to ignore duplicates... what do you think?

I could just add a try...except around the copy code and warn when there are duplicates.

(Regarding the json being dumped to your console: currently if you use the default install instructions with a docker backend, it... uses the console to communicate between the docker container and the host process... on the agenda to do something better sooner or later, but no one has complained yet.)
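On the log-file question: a plain >> only redirects stdout, while Python writes tracebacks to stderr, so that may be why the file didn't capture what you wanted. That is a guess, since I haven't checked which stream each part of the output lands on. In a shell, appending 2>&1 after the redirect captures both streams. The same idea as a rough Python sketch, where run_logged is a hypothetical helper name and not part of sigexport:

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def run_logged(cmd: List[str], log_path: Path) -> int:
    """Run cmd, appending both stdout and stderr to log_path.

    Hypothetical helper for illustration; not part of sigexport.
    """
    with log_path.open("ab") as log:
        # stderr=subprocess.STDOUT sends stderr to the same file as stdout
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example call (arguments elided on purpose; use your own):
# run_logged(["sigexport", ...], Path("logfile.txt"))
```

Note that with the Docker backend the host process reads the container's console output, so this only helps with whatever the host process itself prints.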

If you're happy editing a file, find sigexport/main.py and make the following change and see if it works for you:

@@ -379,7 +379,14 @@ def lines_to_msgs(lines: List[str]) -> List[List[str]]:
 def merge_attachments(media_new: Path, media_old: Path) -> None:
     for f in media_old.iterdir():
         if f.is_file():
-            shutil.copy2(f, media_new)
+            try:
+                shutil.copy2(f, media_new)
+            except shutil.SameFileError:
+                print(f"Skipped {f}")

Or just wait a bit and I'll push up the change.
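If you'd rather try the idea outside the repo first, here is a minimal standalone sketch of the same change. It assumes nothing about the rest of sigexport/main.py; returning the list of skipped files is my own addition for easier checking and is not in the patch above. One caveat: shutil.copy2 raises SameFileError only when source and destination resolve to the same file on disk. A different file that merely has the same name is overwritten silently.

```python
import shutil
from pathlib import Path
from typing import List


def merge_attachments(media_new: Path, media_old: Path) -> List[Path]:
    """Copy files from media_old into media_new; return the files skipped.

    Sketch only: the return value is an illustrative addition.
    """
    skipped: List[Path] = []
    for f in media_old.iterdir():
        if f.is_file():
            try:
                shutil.copy2(f, media_new)
            except shutil.SameFileError:
                # Source and destination are the same file on disk,
                # so there is nothing to copy: warn and move on.
                print(f"Skipped {f}")
                skipped.append(f)
    return skipped
```

Calling it with two distinct directories copies everything and returns an empty list; calling it with the same directory for both arguments skips every file instead of raising.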

elizwillinger commented 2 years ago

Made the change and it doesn't blow up anymore. But now I'm hitting the #64 problem, I think. I'll talk about it over there.

Thanks!