chrrel / whatsapp-exporter

A python script for extracting WhatsApp conversations from the app's SQLite database and exporting them as HTML or txt files.
GNU General Public License v3.0
90 stars, 18 forks

Include Media #12

Closed txanetxarra closed 10 months ago

txanetxarra commented 10 months ago

Hi,

First of all, congratulations on your work on this project.

I've been testing it today and it's just what I've been searching for for years.

It would be great if you could include media in the backup. I love the format and interface of whatsapp-exporter.

As a reference, I'm also testing "WhatsApp-Chat-Exporter" (https://github.com/KnugiHK/WhatsApp-Chat-Exporter), but I prefer whatsapp-exporter, although "WhatsApp-Chat-Exporter" is able to include the media if the WhatsApp folder is contained in the database conversion folder.

Thank you again for all your work.

💪

chrrel commented 10 months ago

Hi @txanetxarra, thanks a lot for your kind words. You are not the first one to ask for this. I finally found some time to look into it and will keep you updated.

chrrel commented 10 months ago

Hi @txanetxarra, I have now included support for different media types:

1: "Image", 2: "Audio", 3: "Video", 4: "Contact", 5: "Location", 7: "System Message", 9: "Document", 10: "Missed Call", 13: "Animated GIF", 14: "Multiple contacts", 15: "Deleted", 16: "Live Location", 20: "Sticker"

There might always be some incompatibilities with less frequently used media types. As this is not an exact re-implementation of WhatsApp, this is negligible in my opinion.
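
As a rough illustration, the codes listed above could be kept in a lookup table with a fallback for unknown values. This is only a sketch: the names `MEDIA_TYPES` and `describe_media_type` are hypothetical and not taken from the project's code.

```python
# Media type codes and labels exactly as listed in the comment above.
# The dictionary and function names are illustrative assumptions.
MEDIA_TYPES = {
    1: "Image", 2: "Audio", 3: "Video", 4: "Contact", 5: "Location",
    7: "System Message", 9: "Document", 10: "Missed Call",
    13: "Animated GIF", 14: "Multiple contacts", 15: "Deleted",
    16: "Live Location", 20: "Sticker",
}


def describe_media_type(code: int) -> str:
    """Return the label for a media type code, or a fallback for unlisted codes."""
    return MEDIA_TYPES.get(code, f"Unsupported media type ({code})")
```

A fallback like this would keep an export readable when a less frequently used type shows up.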

You can test the new version on the following branch:

https://github.com/chrrel/whatsapp-exporter/tree/feat_media_support

@karanrajpal14 You might be interested as well. This solves #9, too.

txanetxarra commented 10 months ago

Hey @chrrel,

Wow ... wow ... wow.... Great work !!

I've been testing it a little bit this morning and it's exactly what I needed.

I've compared around 15 different projects on GitHub, and this is the best WhatsApp backup tool!

I'm impressed by how smoothly it works. Loading only the media that is being viewed, rather than all media in the chat, is a great idea!

From what I've been able to see so far, I have a couple of comments:

1.- Usage: (in case it's useful for other users)

It is probably worth mentioning that, in order to access the media files, the HTML backup output must be placed inside the WhatsApp folder. That way, the relative paths inside the HTML backup can reach the media files.

My file locations:

I've downloaded the full "WhatsApp\" directory from the accessible storage of my phone (non-rooted Android 9, with Crypt14 databases and key). I've placed my decrypted databases (obtained using the script from https://github.com/YuvrajRaghuvanshiS/WhatsApp-Key-Database-Extractor) inside the "WhatsApp\Databases\" directory on my PC.

[Image: CapturaX]

With this, I've configured my "config.cfg" as:


[input]
# Path to the file msgstore.db
msgstore_path=./Whatsapp/Databases/msgstore.db
# Use external contacts database wa.db?
use_wa_db = True
# Path to the file wa.db
wa_path=./Whatsapp/Databases/wa.db

[output]
# Create an HTML export?
export_html=True
# Path to the HTML file to export
html_output_path=./Whatsapp/index.html
# Create an export to txt files?
export_txt=False
# Path to the directory for exporting txt files
txt_output_directory_path=./Whatsapp

With this config file, I run the main.py script, and I obtain the result file in:

[Image: CapturaXX]

What I've noticed is that I can place the input files wherever I want, but for the media to be shown properly in the chats, this is the only valid output location so far.
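
For other users, a small check along these lines might help confirm that the HTML output sits next to the media before running the export. It is a sketch under assumptions: the helper name is made up, and the media folder name "Media" is assumed rather than taken from the tool.

```python
import configparser
from pathlib import Path


def media_dir_next_to_html(config_text: str, media_dir_name: str = "Media") -> bool:
    """Check whether a media folder exists beside the configured HTML output file.

    Hypothetical helper: the folder name "Media" is an assumption and may
    differ depending on how the WhatsApp directory was copied.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    html_path = Path(parser.get("output", "html_output_path"))
    return (html_path.parent / media_dir_name).is_dir()
```

Called with the contents of "config.cfg", it returns True only when the HTML file would be written into a directory that also contains the media folder.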

2.- Issues:

2.1.- Business chats:

I'm subscribed to LIDL WhatsApp newsletters. I've noticed that the media is not recognized, although they seem to be regular jpg images. I don't need to back up these chats, but since I noticed this behavior, I thought it was worth letting you know.

2.2.- Images shown in the wrong place.

In one conversation, I've noticed that some images are not placed in the right position. In this case it is clear, as the first two images I sent in the conversation (one year ago) are shown as the most recent images in the chat, although with the right date.

The images of the original conversation:

[Images: Screenshot_20231031-093216, Screenshot_20231031-093233]

Are being shown in the backup as:

[Image: Captura_]

At the end of the chat, but showing the correct date.

3.- Feature requests

3.1.- It would be nice to have the option to copy / move the media files to the output directory.

3.2.- I would find it interesting to be able to choose a particular chat and back up only that chat. Combined with the option above, it would be easy to have backups that include media.

3.3.- Another improvement could be to generate an HTML file with the contact names and telephone numbers, so that if you open a backup on your PC some time later, you can retrieve people's contact details. This is an interesting feature in case you've lost your contacts (phone lost / stolen, for example). Local backups are always useful, and this would be a nice complementary feature for "whatsapp-exporter".

3.4.- Also, it would be nice to have the option to choose a start and end date for the desired backup.

4.- Thanks

Thank you so much for this development. I've been looking for alternatives for some time, and this solution is smart, light, platform independent and nice looking. Congratulations!!

chrrel commented 10 months ago

Thanks for the compliments. :)

1.) Media paths: You are right, the media files need to be in the output directory for the tool to work correctly. I decided against making this configurable as the corresponding variable does not always contain path information. Nevertheless, I added a note to the README now (https://github.com/chrrel/whatsapp-exporter/commit/c1387ab1ae352390d9106da803ac2a80de41515e).

2.1) Business chats: Thanks for the hint. I do not have this in my test data and probably never will. ;)

2.2) Wrong message order: Good catch! Fortunately, I was able to reproduce this with my test data. Different timestamps were used for sorting and for displaying the messages. This is now fixed with one of the latest commits on this branch (https://github.com/chrrel/whatsapp-exporter/commit/146bd993c2b8377c6cf682d750c09c03027425bb). Thanks for the detailed bug report.
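
To illustrate this kind of bug in isolation, here is a minimal sketch. The `Message` class and both timestamp field names are hypothetical and do not reflect the actual database columns or the actual fix in the commit.

```python
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    sent_timestamp: int      # hypothetical: the value shown to the reader
    received_timestamp: int  # hypothetical: a second timestamp on the same record


messages = [
    Message("old image", sent_timestamp=100, received_timestamp=900),
    Message("recent text", sent_timestamp=500, received_timestamp=500),
]

# Sorting by one field while displaying another moves "old image" to the end,
# even though the date shown next to it is still the old one.
mismatched = sorted(messages, key=lambda m: m.received_timestamp)

# Sorting by the same field that is displayed keeps order and dates consistent.
consistent = sorted(messages, key=lambda m: m.sent_timestamp)
```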

3.1) Copy files: This is out of scope for this project.

3.2) Limit to chat: I will not implement such a filter. Having this as an optional value in the configuration would cause more complexity in the code than needed for most use cases. If you need this, feel free to simply modify the SQL query. Depending on the WhatsApp version you use, simply add something like `and key_remote_jid = '12345678@s.whatsapp.net'` in line 21 or `and cv.raw_string_jid = '12345678@s.whatsapp.net'` in line 57 of models.py.

3.3) Include phone numbers: This should already be included but is rather hidden. It should be displayed when hovering over the contact name. Alternatively, use your browser's "Copy link" context menu. The phone number is used as an ID here.

3.4) Limit to time frame: See 3.2)
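
The query modifications suggested in 3.2) and 3.4) can be sketched against a throwaway in-memory database. Everything here is an assumption for illustration: the table layout, the `timestamp` column holding milliseconds since the epoch, and the function names are hypothetical and do not reproduce the queries in models.py.

```python
import sqlite3
from datetime import datetime, timezone


def to_millis(moment: datetime) -> int:
    """Convert a datetime to milliseconds since the epoch (assumed storage format)."""
    return int(moment.timestamp() * 1000)


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in table; the real msgstore.db schema is more complex."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (key_remote_jid TEXT, data TEXT, timestamp INTEGER)"
    )
    utc = timezone.utc
    rows = [
        ("12345678@s.whatsapp.net", "happy new year", to_millis(datetime(2023, 1, 1, tzinfo=utc))),
        ("12345678@s.whatsapp.net", "see you in june", to_millis(datetime(2023, 6, 1, tzinfo=utc))),
        ("87654321@s.whatsapp.net", "other chat", to_millis(datetime(2023, 6, 2, tzinfo=utc))),
    ]
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?)", rows)
    return conn


def select_messages(conn: sqlite3.Connection, jid: str, start: datetime, end: datetime) -> list:
    """Return message texts for one chat within [start, end), oldest first."""
    query = (
        "SELECT data FROM messages "
        "WHERE key_remote_jid = :jid "
        "AND timestamp >= :start AND timestamp < :end "
        "ORDER BY timestamp"
    )
    params = {"jid": jid, "start": to_millis(start), "end": to_millis(end)}
    return [row[0] for row in conn.execute(query, params)]
```

Named parameters keep the chat ID and dates out of the SQL string itself, which is a design choice of this sketch; the comment above suggests simply editing the literal value into the query.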

txanetxarra commented 10 months ago

@chrrel,

Thank you so much for your detailed answer.

1) Thank you for the clarification.

2.1) Yes, I just wanted to let you know 😉👍.

2.2) Great, I tested it again and it's working like a charm.

3.1) Yes, I understand.

3.2) Great hint, I've been playing around and I finally have it working. Just a couple of comments in case they're useful for other users:

I've made the changes in main.py instead of models.py. In main.py, I've modified:

      Line 21

      from:
      `WHERE key_remote_jid =:key_remote_jid`
      to:
      `WHERE key_remote_jid =:key_remote_jid and key_remote_jid = '1234257890@s.whatsapp.net'`

      Line 57

      from:
      `WHERE cv.raw_string_jid =:key_remote_jid`
      to:
      `WHERE cv.raw_string_jid =:key_remote_jid and cv.raw_string_jid = '1234257890@s.whatsapp.net'`

Where 1234257890 is the target cell phone number including country code.

3.3) My mistake, I haven't explored the interface enough.

3.4) Yes, I understand these suggestions have many side effects.

Best regards.