KnugiHK / WhatsApp-Chat-Exporter

A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your WhatsApp conversations in HTML and JSON. Android Backup Crypt12, Crypt14, Crypt15, and new schema supported.
https://wts.knugi.dev/
MIT License

No such table: messages (Edit: Support for WhatsApp databases with "message" table) #9

Closed Takepy closed 1 year ago

Takepy commented 2 years ago

When running the script, I get the following error message:

Gathering contacts...(97)
Traceback (most recent call last):
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\Scripts\wtsexporter.exe\__main__.py", line 7, in <module>
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\site-packages\Whatsapp_Chat_Exporter\__main__.py", line 147, in main
    messages(db, data)
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\site-packages\Whatsapp_Chat_Exporter\extract.py", line 88, in messages
    c.execute("""SELECT count() FROM messages""")
sqlite3.OperationalError: no such table: messages

My database doesn't contain the table messages. I only have one called message, which contains the data, but its schema looks different from what it's supposed to look like. (Screenshot: 2022-02-14_22-28-04) What could be the issue here?
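A quick way to see which of the two tables a given database actually uses is to ask SQLite's own catalog. The following is only a sketch: the helper name `pick_message_table` is made up for illustration and is not part of the exporter, and it assumes the database has already been decrypted.

```python
import sqlite3
from typing import Optional

def pick_message_table(db: sqlite3.Connection) -> Optional[str]:
    """Return 'message' or 'messages', preferring whichever holds rows."""
    rows = db.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name IN ('message', 'messages')"
    ).fetchall()
    existing = {row[0] for row in rows}
    candidates = [name for name in ("message", "messages") if name in existing]
    # Both tables can exist at once, so prefer the one that is populated.
    for name in candidates:
        # `name` comes from the fixed tuple above, never from user input.
        if db.execute(f"SELECT count(*) FROM {name}").fetchone()[0] > 0:
            return name
    # Nothing populated: fall back to whichever table exists, if any.
    return candidates[0] if candidates else None

# Demo on an in-memory stand-in for msgstore.db.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE message (_id INTEGER PRIMARY KEY, text_data TEXT)")
db.execute("CREATE TABLE messages (_id INTEGER PRIMARY KEY, data TEXT)")
db.execute("INSERT INTO messages (data) VALUES ('hello')")
print(pick_message_table(db))
```

Checking row counts rather than mere existence is a design choice for databases that contain both tables with only one of them populated.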

KnugiHK commented 2 years ago

What is the WhatsApp version and which platform (Android/iOS) are you using?

Takepy commented 2 years ago

I'm using Android. The WhatsApp versions I have used are 2.22.3.77 and 2.22.5.7 (Beta). 2.22.3.77 was running on an emulator so that I could pull the /com.whatsapp/ directory. On the 2.22.5.7 version I used the previously pulled key to decrypt the db.

KnugiHK commented 2 years ago

Do both versions and environments result in the same database schema?

Takepy commented 2 years ago

Yes, they both resulted in the same database schema

KnugiHK commented 2 years ago

Interesting. I will try to reproduce it with 2.22.3.77 on emulator.

KnugiHK commented 2 years ago

I tried logging in to WhatsApp 2.22.3.77 on an emulator. However, the resulting database includes the "messages" table. Do you remember when you installed WhatsApp on the emulator for the first time (i.e. the creation date of the database)?

Takepy commented 2 years ago

I installed WhatsApp on the emulator on the 9th of this month and then imported my chat history using the chat backup in WhatsApp. On my actual phone, WhatsApp was installed about 6 months ago.

KnugiHK commented 2 years ago

I checked some similar projects and suspect that the "message" table is a legacy table for storing messages. On your phone, was the WhatsApp conversation restored from an old backup after installation?

Takepy commented 2 years ago

I restored the WhatsApp conversations from Google Drive (doing weekly backups). Never deleted my WhatsApp chat history though so the messages go all the way back to 2013

KnugiHK commented 2 years ago

As far as I have observed, it is possible that "message" was used in the early years of WhatsApp. Could you check whether a new installation without a backup results in the same situation?

Takepy commented 2 years ago

I'll give it a try later and report back.

Takepy commented 2 years ago

Sorry, it took a little bit longer until I had time to try it out. I set up a new emulator, installed WhatsApp and didn't import my backup. The table messages exists now.

So I'm guessing my backup is just too old for it to work then? :(

KnugiHK commented 2 years ago

Sorry, it took a little bit longer until I had time to try it out.

No problem!

So I'm guessing my backup is just too old for it to work then? :(

I think so. I would like to add support for older versions. However, I don't have such a database, so I am not able to do it. If you have time and want the script to work with your database, you can take a look inside the database and fine-tune the script. In any case, I will keep an eye out for opportunities to support old databases. Thanks for your report!

Unbrick commented 2 years ago

Hey everyone! I just stumbled upon the same issue with WhatsApp version 2.22.7.73 on Android on two different devices using two different phone numbers without restoring any backup.

I'll try to reinstall WhatsApp and try again, but it seems like the database schema is changing. The new message table has the following schema:

CREATE TABLE message (
  _id INTEGER PRIMARY KEY AUTOINCREMENT, 
  chat_row_id INTEGER NOT NULL, from_me INTEGER NOT NULL, 
  key_id TEXT NOT NULL, sender_jid_row_id INTEGER, 
  status INTEGER, broadcast INTEGER, 
  recipient_count INTEGER, participant_hash TEXT, 
  origination_flags INTEGER, origin INTEGER, 
  timestamp INTEGER, received_timestamp INTEGER, 
  receipt_server_timestamp INTEGER, 
  message_type INTEGER, text_data TEXT, 
  starred INTEGER, lookup_tables INTEGER, 
  message_add_on_flags INTEGER, sort_id INTEGER NOT NULL DEFAULT 0
);

Edit: Tried a second time, the database schema is still the same.

Edit2: In WhatsApp version 2.21.21.17 both tables messages and message do seem to exist.

KnugiHK commented 2 years ago

I'll try to reinstall WhatsApp and try again, but it seems like the database schema is changing. The new message table has the following schema:

I can reproduce the same result on 2.22.7.73: the "messages" table no longer exists.

@Takepy As I can reproduce your situation now, I will work on supporting "message" table.

KnugiHK commented 2 years ago

I am curious about how WhatsApp decides which table to store messages in 🤨. Is "message" a legacy table? Or is "messages" the legacy one?

Unbrick commented 2 years ago

I am curious about how WhatsApp decides which table to store messages in 🤨. Is "message" a legacy table? Or is "messages" the legacy one?

I'm not certain. With the outdated WhatsApp version both tables exist; in the more recent version (downloaded from whatsapp.com yesterday) only the message table exists.

Edit: Just checked, in my current setup (outdated version) messages are only stored in the messages table (which works awesome with your script!) and the message table appears to be empty.

KnugiHK commented 2 years ago

With the "message" table, WhatsApp spreads the different types of message across many other tables. Good chance for me to rewrite the project :)

karanrajpal14 commented 2 years ago

@Takepy were you able to find a way around this issue? Looks like I'm in the same boat, unfortunately and my whatsapp database is using the "legacy" format as well. I've got messages dating back all the way to 2014.

josh-shaw-dev commented 2 years ago

@karanrajpal14 I came across the same issue as well: I got a new phone and couldn't export the messages. Did you have any luck getting it to work?

I've started on a C# project to extract the messages. Extracting and dumping to text already works; I'm just working on the HTML part now.

ulno commented 1 year ago

Has anyone made some progress here - it now seems as if it has to do with new versions of whatsapp? Or does it only affect people with a very long message history? I am very ready to back everything up and try moving to a full puppeteered whatsapp with mautrix-whatsapp, but things seem to still be shifting way too much.

@KnugiHK any way we can help?

KnugiHK commented 1 year ago

Has anyone made some progress here - it now seems as if it has to do with new versions of whatsapp? Or does it only affect people with a very long message history? I am very ready to back everything up and try moving to a full puppeteered whatsapp with mautrix-whatsapp, but things seem to still be shifting way too much.

@KnugiHK any way we can help?

The hardest part of this project is to understand the structure of the WhatsApp database. Any clues on the structure and pull requests are welcome.

ulno commented 1 year ago

I played with it a bit yesterday and replaced the main query in extract.py with this:

SELECT message.sender_jid_row_id as key_remote_jid,
                        message._id,
                        message.from_me as key_from_me,
                        message.timestamp,
                        message.text_data as data,
                        message.status,
                        message_future.version as edit_version,
                        message_thumbnail.thumbnail as thumb_image,
                        message_media.file_path as remote_resource,
                        message_media.mime_type as media_wa_type,
                        message_location.latitude,
                        message_location.longitude,
                        message_quoted.key_id as quoted,
                        message.key_id,
                        message_quoted.text_data as quoted_data,
                        message_media.media_caption
                 FROM message
                    LEFT JOIN message_quoted
                        ON message_quoted.message_row_id = message._id
                    LEFT JOIN message_location
                        ON message_location.message_row_id = message._id
                    LEFT JOIN message_media
                        ON message_media.message_row_id = message._id
                    LEFT JOIN message_thumbnail
                        ON message_thumbnail.message_row_id = message._id
                    LEFT JOIN message_future
                        ON message_future.message_row_id = message._id
                    WHERE key_remote_jid <> '-1';

That brings me past the missing messages table, but it will clash in extract.py:240 when trying to match the remote id to something in the big data dictionary. So I assume that replacing key_remote_jid with sender_jid_row_id goes in the right direction, but it might need some more joins to actually turn things into a phone number.

There are also a messages_view and a chats_view now that seem to match some more tables with each other. Did anybody else figure some dependencies based on their connections out?
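One way to work out how those views tie the tables together is to read their definitions straight from `sqlite_master`, where SQLite keeps the original `CREATE VIEW` text. This is a rough sketch on a toy database: the helper `view_definitions` and the toy view body are invented for illustration, and a real msgstore.db defines its own, much larger views.

```python
import sqlite3

def view_definitions(db: sqlite3.Connection) -> dict:
    """Map each view name to the CREATE VIEW statement stored by SQLite."""
    return dict(db.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'view' ORDER BY name"
    ))

# Toy stand-in database; table and column names follow the thread,
# the view body itself is an assumption made up for this demo.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE jid (_id INTEGER PRIMARY KEY, raw_string TEXT);
CREATE TABLE message (_id INTEGER PRIMARY KEY,
                      sender_jid_row_id INTEGER, text_data TEXT);
CREATE VIEW messages_view AS
    SELECT message._id, jid.raw_string, message.text_data
    FROM message LEFT JOIN jid ON jid._id = message.sender_jid_row_id;
""")

for name, sql in view_definitions(db).items():
    print(name)
    print(sql)
```

The JOIN conditions in the printed statements show which row-id columns point at which tables.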

KnugiHK commented 1 year ago

I will have a look at the clash later.

stevanuscolonne commented 1 year ago

I am curious about how WhatsApp decides which table to store messages in 🤨. Is "message" a legacy table? Or is "messages" the legacy one?

"messages" is the legacy one, please read https://thebinaryhick.blog/2022/06/09/new-msgstore-who-dis-a-look-at-an-updated-whatsapp-on-android/

stevanuscolonne commented 1 year ago

key_remote_jid is the same as raw_string in the jid table.

syscrypt commented 1 year ago

Are there any updates on this?

KnugiHK commented 1 year ago

Are there any updates on this?

Not yet. Hopefully there will be updates in December.

newdrkhckr commented 1 year ago

Are there any updates on ?

newdrkhckr commented 1 year ago

Found a solution, but I am not sure every aspect of the project is working properly.

First, I applied this solution: https://github.com/andreas-mausch/whatsapp-viewer/issues/151#issuecomment-1229244368. It modified the old msgstore.db.

After that I still received errors, so with SQLite I added new tables to overcome the error "no such table: ...". After adding a couple of tables, the code works now and is able to output a result.

But as I said, I cannot confirm that every aspect of the project is working properly.

KnugiHK commented 1 year ago

Hi everyone, I am working on porting the code to the WhatsApp database with the new schema. The main part is complete, but some bugs still need to be fixed.

KnugiHK commented 1 year ago

I created a new branch called "message_table", which supports the message table directly from the wtsexporter command. Feel free to give it a try, but please expect bugs in the branch. Bug reports are welcome and should be reported in this issue.

Also, big thanks to @ulno for the SQL statement. As for the replacement of key_remote_jid: since sender_jid_row_id could be 0, instead of treating it as the key_remote_jid, I joined the chat and jid tables together to obtain the accurate key_remote_jid.
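The join described above can be sketched on a toy database. The `message` columns and `jid.raw_string` come from this thread; the `chat.jid_row_id` column and the sample rows are assumptions made for the demo, and the query is not necessarily the one used in the branch.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE jid (_id INTEGER PRIMARY KEY, raw_string TEXT);
-- chat.jid_row_id is an assumed column name linking a chat to its jid.
CREATE TABLE chat (_id INTEGER PRIMARY KEY, jid_row_id INTEGER);
CREATE TABLE message (_id INTEGER PRIMARY KEY, chat_row_id INTEGER NOT NULL,
                      from_me INTEGER NOT NULL, sender_jid_row_id INTEGER,
                      text_data TEXT);
INSERT INTO jid VALUES (1, '111@s.whatsapp.net'), (2, '222-333@g.us'),
                       (3, '444@s.whatsapp.net');
INSERT INTO chat VALUES (10, 1), (20, 2);
-- In the one-to-one chat sender_jid_row_id is 0, so it cannot serve as the key.
INSERT INTO message VALUES (100, 10, 0, 0, 'hi'),
                           (101, 20, 0, 3, 'hello group');
""")

QUERY = """
SELECT chat_jid.raw_string   AS key_remote_jid,
       sender_jid.raw_string AS sender,
       message.text_data     AS data
FROM message
JOIN chat ON chat._id = message.chat_row_id
JOIN jid AS chat_jid ON chat_jid._id = chat.jid_row_id
LEFT JOIN jid AS sender_jid ON sender_jid._id = message.sender_jid_row_id
ORDER BY message._id
"""

rows = db.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Going through `chat` yields a conversation identifier for every message, while the separate LEFT JOIN keeps the individual sender available where one is recorded.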

Takepy commented 1 year ago

First of all, thank you for working on this. I just downloaded the branch and gave it a try. Unfortunately it didn't work for me.

╰─ wtsexporter -a
Gathering contacts...(97)
Traceback (most recent call last):
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\Scripts\wtsexporter.exe\__main__.py", line 7, in <module>
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\site-packages\Whatsapp_Chat_Exporter\__main__.py", line 174, in main
    messages(db, data)
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\site-packages\Whatsapp_Chat_Exporter\extract.py", line 195, in messages
    c.execute("""SELECT count() FROM messages""")
sqlite3.OperationalError: no such table: messages

I made sure to uninstall the old version I had with pip uninstall whatsapp-chat-exporter and then installed the new one. Is there anything I'm missing?

KnugiHK commented 1 year ago

Can you verify that the __main__.py file imports extract_new (from Whatsapp_Chat_Exporter import extract_new as extract) instead of extract (from Whatsapp_Chat_Exporter import extract, extract_iphone)?

Takepy commented 1 year ago

__main__.py did not include from Whatsapp_Chat_Exporter import extract_new as extract. I have now manually replaced the files in \site-packages\Whatsapp_Chat_Exporter with the ones from the branch and ran the script again. Now I'm faced with this:

╰─ wtsexporter -a
Gathering contacts...(97)
Traceback (most recent call last):
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\Scripts\wtsexporter.exe\__main__.py", line 7, in <module>
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\site-packages\Whatsapp_Chat_Exporter\__main__.py", line 184, in main
    messages(db, data)
  File "C:\Users\******\AppData\Local\Programs\Python\Python310\lib\site-packages\Whatsapp_Chat_Exporter\extract_new.py", line 206, in messages
    c.execute("""SELECT jid.raw_string as key_remote_jid,
sqlite3.OperationalError: no such column: message_media.media_caption

KnugiHK commented 1 year ago

It should be fixed in the latest commit: https://github.com/KnugiHK/Whatsapp-Chat-Exporter/commit/d3892a4e4f4f0b533107f871a23de22153cc879a.
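Since columns such as `message_media.media_caption` are present in some databases and absent in others, one defensive option is to probe the table before building the query. This sketch does not reproduce what the linked commit changed; the helper `has_column` is hypothetical and the table below is a cut-down stand-in.

```python
import sqlite3

def has_column(db: sqlite3.Connection, table: str, column: str) -> bool:
    """Report whether `table` has a column named `column`."""
    # `table` is interpolated into the statement, so only pass trusted names.
    rows = db.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Each row describes one column; index 1 holds the column name.
    return any(row[1] == column for row in rows)

# Stand-in for a database whose message_media table lacks media_caption.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE message_media "
    "(message_row_id INTEGER PRIMARY KEY, file_path TEXT, mime_type TEXT)"
)

# Select the real column when it exists, otherwise a NULL placeholder.
if has_column(db, "message_media", "media_caption"):
    caption_expr = "message_media.media_caption"
else:
    caption_expr = "NULL"
query = (
    f"SELECT message_media.file_path, {caption_expr} AS media_caption "
    "FROM message_media"
)
rows = db.execute(query).fetchall()
```

The rest of the code can then read `media_caption` from every row without caring which schema variant it came from.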

cbzittoun commented 1 year ago

The new branch worked on my side, thanks a lot! However, all media are missing from the HTML files (I tried both with and without the -e flag). Thanks a lot for your work, your tool is great.

velecto1 commented 1 year ago

The new branch worked on my side, thanks a lot! However, all media are missing from the HTML files (I tried both with and without the -e flag). Thanks a lot for your work, your tool is great.

Hello, @cbzittoun , I had the same problem with the media. Try to pass the parent directory of "Media" instead of the address to "Media" itself.

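I still have three issues, though:

* All exported conversations start with "This message is not supported".
* The flag "-s" seems not to be working.
* I've realized that it would also be helpful for me to split the result into multiple HTML files 🙈.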

cbzittoun commented 1 year ago

I think I was pointing to no Media folder actually, as I was running the cmd for the 2nd time. I was expecting the script to copy the files, not move them (although slower, it would be safer).

https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/580eaddb24ee359433593776f1b6643e01d91979/Whatsapp_Chat_Exporter/extract.py#L531

Maybe it is worth an explicit warning that input files are being moved around?
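The copy-versus-move trade-off raised here can be illustrated with the standard library. This is a sketch, not the exporter's code: `stage_media` and its `copy` parameter are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path

def stage_media(source: Path, destination: Path, copy: bool = True) -> Path:
    """Place the media folder at `destination`, copying by default."""
    if copy:
        # Leaves `source` untouched; slower and needs extra disk space.
        shutil.copytree(source, destination, dirs_exist_ok=True)
    else:
        # Fast on the same filesystem, but `source` is gone afterwards.
        # If `destination` already exists as a directory, shutil.move
        # places `source` inside it instead of replacing it.
        print(f"Warning: moving {source} to {destination}")
        shutil.move(str(source), str(destination))
    return destination

# Demo on throwaway directories.
workdir = Path(tempfile.mkdtemp())
source = workdir / "WhatsApp"
(source / "Media").mkdir(parents=True)
(source / "Media" / "photo.jpg").write_bytes(b"fake image")

copied = stage_media(source, workdir / "result" / "WhatsApp")
```

With copying as the default, running the command a second time still finds the original folder in place; the move branch at least prints a warning first.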

KnugiHK commented 1 year ago

@cbzittoun

The new branch worked on my side, thanks a lot! However, all media are missing from the HTML files (I tried both with and without the -e flag). Thanks a lot for your work, your tool is great.

Actually, embedding media in the HTML file (the -e option) existed only in very early versions and has since been removed. Embedding is planned to return. Media should be located under the WhatsApp folder (i.e., WhatsApp/Media). If your current working directory has a folder named WhatsApp, the script will automatically move that folder. If you want to specify another directory, use -m.

I think I was pointing to no Media folder actually, as I was running the cmd for the 2nd time. I was expecting the script to copy the files, not move them (although slower, it would be safer).

https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/580eaddb24ee359433593776f1b6643e01d91979/Whatsapp_Chat_Exporter/extract.py#L531

Maybe it is worth an explicit warning that input files are being moved around?

Let's move the discussion of this to https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/25.

KnugiHK commented 1 year ago

@velecto1

I still have three issues, though:

* All exported conversations start with "This message is not supported".
* The flag "-s" seems not to be working.
* I've realized that it would also be helpful for me to split the result into multiple HTML files 🙈.

Can you confirm if the first two issues happen only in the message_table branch? If they happen in the main branch, please open another issue for that.

velecto1 commented 1 year ago

Hi, sorry for the delay; I didn't have the old backup available immediately.

1) Nope, the "This message is not supported" thing does not display in the resulting HTML when converting an old database. The chat is actually complete, so maybe it is the "message" saying: "Messages and calls are end-to-end encrypted..."

2) The 'main' branch doesn't have the '-s' flag available at all.

KnugiHK commented 1 year ago

Hi, sorry for the delay; I didn't have the old backup available immediately.

1. Nope, the "This message is not supported" thing does not display in the resulting HTML when converting an old database. The chat is actually complete, so maybe it is the "message" saying: "Messages and calls are end-to-end encrypted..."

2. The 'main' branch doesn't have the '-s' flag available at all.

1. I will look into it.
2. Oh, yes. -s is not released yet.

Ahmnonymous commented 1 year ago

@velecto1

I still have three issues, though:

* All exported conversations start with "This message is not supported".
* The flag "-s" seems not to be working.
* I've realized that it would also be helpful for me to split the result into multiple HTML files 🙈.

Can you confirm if the first two issues happen only in the message_table branch? If they happen in the main branch, please open another issue for that.

Hello there u can contact me I can help u out but in private !

KnugiHK commented 1 year ago

Hello there u can contact me I can help u out but in private !

Appreciated! Feel free to reach out to me on the Matrix network: https://matrix.to/#/!KtGjfSytJvzxewsXVd:matrix.org

lordfeck commented 1 year ago

I've had the same problem. I was able to extract a Whatsapp DB from Android using the extractor tool and noticed it only has the 'message' table. Will give the test branch a try and see how it goes.

KnugiHK commented 1 year ago

I've had the same problem. I was able to extract a Whatsapp DB from Android using the extractor tool and noticed it only has the 'message' table. Will give the test branch a try and see how it goes.

This issue will likely happen to more and more people in the future as WhatsApp updates more people's database schema.

flansch commented 1 year ago

I just used the code from the message_table branch and was able to successfully export my Whatsapp database. I really appreciate your effort for this useful tool. The only thing I noticed in the resulting files is, that in group chats the name of the group is shown instead of the name of the person writing the message.

KnugiHK commented 1 year ago

I just used the code from the message_table branch and was able to successfully export my Whatsapp database. I really appreciate your effort for this useful tool. The only thing I noticed in the resulting files is, that in group chats the name of the group is shown instead of the name of the person writing the message.

Thanks for reporting that, I can reproduce the problem.

Edit: @flansch The problem should be fixed with https://github.com/KnugiHK/Whatsapp-Chat-Exporter/commit/4cb4ac3e7b96595248a71d0232284b86be0cde96. The commit is in the message_table branch; please check whether it solves the problem.

flansch commented 1 year ago

I updated to the latest version of the message_table branch and can confirm that the error is gone, thank you for that. In the meantime I ran into another issue: when opening the files in Safari on Mac, the encoding is not detected properly, and emojis and German umlauts are not displayed correctly. By adding <meta charset="UTF-8"> to the head section of one of the files, I was able to fix that. Could you please add this tag automatically?
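Until the tag is part of the generated output, already exported files could be patched after the fact. The following is a minimal sketch: `ensure_utf8_meta` is a made-up helper, and it assumes each page has an ordinary `<head>` element.

```python
import re

META_TAG = '<meta charset="UTF-8">'

def ensure_utf8_meta(html: str) -> str:
    """Insert a UTF-8 charset declaration right after <head> if none exists."""
    if re.search(r"<meta[^>]+charset", html, flags=re.IGNORECASE):
        return html  # a charset is already declared, leave the page alone
    # Match <head> or <head ...attributes>, but not <header>.
    return re.sub(
        r"<head(\s[^>]*)?>",
        lambda match: match.group(0) + "\n" + META_TAG,
        html,
        count=1,
        flags=re.IGNORECASE,
    )

page = "<html><head><title>Chat</title></head><body>Grüße 😀</body></html>"
fixed = ensure_utf8_meta(page)
print(fixed)
```

To patch a file on disk, read it and write it back with `encoding="utf-8"` so that the declared charset matches the bytes actually stored.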