KnugiHK / WhatsApp-Chat-Exporter

A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your WhatsApp conversations in HTML and JSON. Android Backup Crypt12, Crypt14, Crypt15, and new schema supported.
https://wts.knugi.dev/
MIT License

Crash when content id is not found #61

Closed andrp92 closed 4 weeks ago

andrp92 commented 10 months ago

When you migrate from WhatsApp to WhatsApp Business or the other way around, some records lose their IDs, and Python doesn't like that here.

    Traceback (most recent call last):
      File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/runpy.py", line 198, in _run_module_as_main
        return _run_code(code, main_globals, None,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/runpy.py", line 88, in _run_code
        exec(code, run_globals)
      File "/.vscode/extensions/ms-python.python-2023.14.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
        cli.main()
      File "/.vscode/extensions/ms-python.python-2023.14.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
        run()
      File "/.vscode/extensions/ms-python.python-2023.14.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
        runpy.run_path(target, run_name="__main__")
      File "/.vscode/extensions/ms-python.python-2023.14.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
        return _run_module_code(code, init_globals, run_name,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/.vscode/extensions/ms-python.python-2023.14.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
        _run_code(code, mod_globals, init_globals,
      File "/.vscode/extensions/ms-python.python-2023.14.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
        exec(code, run_globals)
      File "/WhatsApp-Chat-Exporter/Whatsapp_Chat_Exporter/__main__.py", line 382, in <module>
        main()
      File "/Whatsapp_Chat_Exporter/__main__.py", line 305, in main
        messages(db, data, args.media)
      File "/dev/itunes-backup/.venv/lib/python3.11/site-packages/Whatsapp_Chat_Exporter/extract_iphone.py", line 93, in messages
        path = f'{media_folder}/Media/Profile/{_id.split("@")[0]}'
                                               ^^^^^^^^^
    AttributeError: 'NoneType' object has no attribute 'split'

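
The failing expression calls `.split` on `_id`, which is `None` for the affected rows. A minimal sketch of a guard around that expression follows; the helper name `profile_media_path` is hypothetical and not part of the project, and returning `None` is only one possible way to handle such rows.

```python
from typing import Optional


def profile_media_path(media_folder: str, jid: Optional[str]) -> Optional[str]:
    """Build the profile media path, or return None when the chat ID is missing."""
    if jid is None:
        # A row without a JID cannot be mapped to a profile folder,
        # so signal that to the caller instead of raising AttributeError.
        return None
    # Same expression as in the traceback: keep the part before the "@".
    return f'{media_folder}/Media/Profile/{jid.split("@")[0]}'


print(profile_media_path("backup", "85200000000@s.whatsapp.net"))
print(profile_media_path("backup", None))
```

The caller would then decide whether to skip the row or log it.
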
KnugiHK commented 9 months ago

Hi. Could you post the value of _id right before the program crashes?

LoSunny commented 7 months ago

Sorry for hijacking this issue, but I just tried it and ran into the same bug.

Device: iOS
WhatsApp / Business: WhatsApp

The _id is indeed NULL for some reason. I opened an SQL editor and ran the query with a WHERE _id IS NULL clause added, to check whether the problem lies in the Python SQLite reader or in the query itself. Here is a screenshot for reference: https://imgur.com/a/tYDeXLm

The issue also seems to affect the media function, not just the messages function that andrp92 hit above.

As a temporary fix, I have added an _id IS NOT NULL condition to both statements, i.e. the messages query:

    c.execute("""SELECT COALESCE(ZFROMJID, ZTOJID) as _id,
                        ZWAMESSAGE.Z_PK,
                        ZISFROMME,
                        ZMESSAGEDATE,
                        ZTEXT,
                        ZMESSAGETYPE,
                        ZWAGROUPMEMBER.ZMEMBERJID,
                        ZMETADATA,
                        ZSTANZAID
                 FROM ZWAMESSAGE
                    LEFT JOIN ZWAGROUPMEMBER
                        ON ZWAMESSAGE.ZGROUPMEMBER = ZWAGROUPMEMBER.Z_PK
                    LEFT JOIN ZWAMEDIAITEM
                        ON ZWAMESSAGE.Z_PK = ZWAMEDIAITEM.ZMESSAGE
                 WHERE _id is NOT NULL;""")

Media query

    c.execute("""SELECT COALESCE(ZWAMESSAGE.ZFROMJID, ZWAMESSAGE.ZTOJID) as _id,
                        ZMESSAGE,
                        ZMEDIALOCALPATH,
                        ZMEDIAURL,
                        ZVCARDSTRING,
                        ZMEDIAKEY,
                        ZTITLE
                 FROM ZWAMEDIAITEM
                    INNER JOIN ZWAMESSAGE
                        ON ZWAMEDIAITEM.ZMESSAGE = ZWAMESSAGE.Z_PK
                 WHERE ZMEDIALOCALPATH IS NOT NULL AND _id is NOT NULL
                 ORDER BY _id ASC""")
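
The effect of the added filter can be reproduced on a toy table. This is only a sketch: the in-memory `ZWAMESSAGE` table below is a cut-down stand-in with made-up rows, and `fetch_messages` is a hypothetical helper, not a function from the exporter. It relies on SQLite accepting a result-column alias (`_id`) in the `WHERE` clause, as the queries above do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in for ZWAMESSAGE with only the columns needed here (assumption).
conn.execute(
    "CREATE TABLE ZWAMESSAGE ("
    "Z_PK INTEGER PRIMARY KEY, ZFROMJID TEXT, ZTOJID TEXT, ZTEXT TEXT)"
)
conn.executemany(
    "INSERT INTO ZWAMESSAGE (ZFROMJID, ZTOJID, ZTEXT) VALUES (?, ?, ?)",
    [
        ("85200000000@s.whatsapp.net", None, "incoming"),
        (None, "85200000000@s.whatsapp.net", "outgoing"),
        (None, None, "row that lost both JIDs"),
    ],
)


def fetch_messages(conn, skip_null_ids):
    """Run a simplified messages query, optionally dropping rows with a NULL _id."""
    sql = "SELECT COALESCE(ZFROMJID, ZTOJID) AS _id, ZTEXT FROM ZWAMESSAGE"
    if skip_null_ids:
        # The workaround: filter on the alias produced by COALESCE.
        sql += " WHERE _id IS NOT NULL"
    return conn.execute(sql + " ORDER BY Z_PK").fetchall()


unfiltered = fetch_messages(conn, skip_null_ids=False)
filtered = fetch_messages(conn, skip_null_ids=True)
print(len(unfiltered), len(filtered))
```

Without the filter, the third row comes back with `_id` set to `None`, which is the value that triggers the `AttributeError` further up.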

However, I could not verify whether it causes any corruption, because my message history is insanely huge:
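
One rough way to gauge how much the workaround drops is to count the rows whose sender and recipient JIDs are both NULL. The sketch below assumes the `ZWAMESSAGE` column names from the queries above; `count_rows_without_chat_id` is a hypothetical helper, and the demo runs on a made-up in-memory table rather than a real backup.

```python
import sqlite3


def count_rows_without_chat_id(conn: sqlite3.Connection) -> tuple[int, int]:
    """Return (total messages, messages whose two JID columns are both NULL)."""
    total = conn.execute("SELECT COUNT(*) FROM ZWAMESSAGE").fetchone()[0]
    missing = conn.execute(
        "SELECT COUNT(*) FROM ZWAMESSAGE "
        "WHERE ZFROMJID IS NULL AND ZTOJID IS NULL"
    ).fetchone()[0]
    return total, missing


# Demo on a toy table; with a real backup, pass a connection to that database.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE ZWAMESSAGE (Z_PK INTEGER PRIMARY KEY, ZFROMJID TEXT, ZTOJID TEXT)"
)
demo.executemany(
    "INSERT INTO ZWAMESSAGE (ZFROMJID, ZTOJID) VALUES (?, ?)",
    [
        ("a@s.whatsapp.net", None),
        (None, None),
        (None, "b@s.whatsapp.net"),
        (None, None),
    ],
)
total, missing = count_rows_without_chat_id(demo)
print(f"{missing} of {total} messages would be skipped by the filter")
```
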

    (venv) sunnylo@SunnyLos-Laptop % ./venv/bin/wtsexporter -i -b ~/Library/Application\ Support/MobileSync/Backup/asdf
    WhatsApp directory already exists, skipping WhatsApp file extraction.
    Processing contacts...(548)
    Processing messages...(245452/245452)
    Processing media...(128019/128019)
    Processing vCards...(51/51)
    Generating chats...(564/564)
    Copying media directory...

    Everything is done!

Hope it will be fixed soon. Thanks for the great library, love it!

KnugiHK commented 7 months ago

Hi @LoSunny. Thanks for your workaround. Since _id serves as the identifier for each chat, it is crucial to the whole operation. Therefore, your workaround will most likely be accepted as a fix unless a better solution pops up. I will work on it soon.

KnugiHK commented 7 months ago

Good news! A more reliable way to determine which chat a message belongs to is shipped with the latest fix, 3847836. @LoSunny @andrp92 Can you try the fix in the dev branch and see if it works as expected?

LoSunny commented 7 months ago

Yep it "works"

    sunnylo@SunnyLos-Laptop WhatsApp-Chat-Exporter % wtsexporter -i -o result2 -b ~/Library/Application\ Support/MobileSync/Backup/asdf
    WhatsApp directory already exists, skipping WhatsApp file extraction.
    Processing contacts...(548)
    Processing messages...(245452/245452)
    Processing media...(128019/128019)
    Processing vCards...(51/51)
    Generating chats...(548/548)
    Copying media directory...

    Everything is done!

However, it seems to have generated fewer chats than before, from 564 to 548? I ran a diff to see which files differ:

(part of the result)
    Only in result: 1545670533.html
    Only in result: 1620178621.html
    Only in result: 1628765705.html
    Only in result: 1630636817.html
    Only in result: 1630920124.html
    Only in result: 1634260290.html
    Only in result: 1635758267.html
    Only in result: 1639734958.html
    Only in result: 1641020304.html
    Only in result: 1648629655.html
    Only in result: 1661274036.html
    Only in result: 1661601743.html
    Only in result: 1662791013.html
    Only in result: 1671960498.html
    Only in result: 1675172235.html
    Only in result: status-.html
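
The same "only in" listing can be produced from Python with the standard library's `filecmp.dircmp`. This is a sketch with throwaway directories and made-up file names; `compare_exports` is a hypothetical helper, and it compares only the top level of the two export folders.

```python
import filecmp
import tempfile
from pathlib import Path


def compare_exports(old_dir, new_dir):
    """Return (names only in old_dir, names only in new_dir), each sorted."""
    cmp = filecmp.dircmp(old_dir, new_dir)
    return sorted(cmp.left_only), sorted(cmp.right_only)


# Demo: build two small export folders, then compare them.
with tempfile.TemporaryDirectory() as tmp:
    old, new = Path(tmp, "result"), Path(tmp, "result2")
    old.mkdir()
    new.mkdir()
    for name in ("a.html", "b.html", "status-.html"):
        (old / name).write_text("x")
    for name in ("a.html", "b.html"):
        (new / name).write_text("x")
    only_old, only_new = compare_exports(old, new)

print("Only in result:", only_old)
print("Only in result2:", only_new)
```
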

On the other hand, the dev branch seems to have solved some of the errors that were displayed before, as seen when I ran a diff -r command:

    $ diff -r result result2
    161c195
    <                                   <p>Not supported WhatsApp internal message</p>
    ---
    >                                   <p>The group name changed to asdf</p>

KnugiHK commented 7 months ago

> However, it seems to have generated fewer chats than before, from 564 to 548?

I believe 548 is the correct number. AFAIK, each message should be mapped to the table ZWACHATSESSION by the column ZWAMESSAGE.ZCHATSESSION. Hence, it should be accurate. You can see if messages in chats that only appear before the fix are exported into another HTML file. (sor for 1999)
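
A minimal sketch of that mapping on a toy database follows. It is an illustration only and not necessarily the query used in the fix: the `ZCONTACTJID` column on `ZWACHATSESSION` is an assumption, the tables are reduced to a few columns, and the JIDs and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced stand-ins for the real tables (column set is an assumption).
    CREATE TABLE ZWACHATSESSION (Z_PK INTEGER PRIMARY KEY, ZCONTACTJID TEXT);
    CREATE TABLE ZWAMESSAGE (
        Z_PK INTEGER PRIMARY KEY, ZCHATSESSION INTEGER,
        ZFROMJID TEXT, ZTOJID TEXT, ZTEXT TEXT
    );
    INSERT INTO ZWACHATSESSION VALUES (1, '1111111111@broadcast');
    INSERT INTO ZWAMESSAGE VALUES
        (1, 1, '85200000000@s.whatsapp.net', NULL, 'hello everyone'),
        (2, 1, NULL, NULL, 'row that lost its JIDs');
""")

# chat_id comes from the chat session; old_id is the previous COALESCE-based key.
rows = conn.execute("""
    SELECT ZWACHATSESSION.ZCONTACTJID AS chat_id,
           COALESCE(ZWAMESSAGE.ZFROMJID, ZWAMESSAGE.ZTOJID) AS old_id,
           ZWAMESSAGE.ZTEXT
    FROM ZWAMESSAGE
        INNER JOIN ZWACHATSESSION
            ON ZWAMESSAGE.ZCHATSESSION = ZWACHATSESSION.Z_PK
    ORDER BY ZWAMESSAGE.Z_PK
""").fetchall()

for chat_id, old_id, text in rows:
    print(chat_id, old_id, text)
```

In this toy data both messages resolve to the same chat through the session, while the old key yields two different values, one of them `None`. That would be consistent with the chat count going down once messages are grouped by session.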

> On the other hand, the dev branch seems to have solved some of the errors that were displayed before, as seen when I ran a diff -r command

Well 😄 That's unintended.

LoSunny commented 7 months ago

> I believe 548 is the correct number. AFAIK, each message should be mapped to the table ZWACHATSESSION by the column ZWAMESSAGE.ZCHATSESSION. Hence, it should be accurate. You can see if messages in chats that only appear before the fix are exported into another HTML file. (sor for 1999)

After double-checking the messages, they do indeed exist in other group chats, so yeah, it should be correct now. Hooray 🤩

KnugiHK commented 7 months ago

For example:

Assume the user 85200000000@s.whatsapp.net created a broadcast channel 1111111111@broadcast and added you to it, and you do not have any private messages (PM) with 85200000000@s.whatsapp.net.

Before the fix:

After the fix:

LoSunny commented 7 months ago

Okie, learned a lot of new stuff today. Btw, in both versions the timestamps shown in the HTML seem to be mostly the same within the same date, i.e. the hour:minute shown in both PM and group chats is the same for a given date. Is it a bug?

KnugiHK commented 7 months ago

> Okie, learned a lot of new stuff today. Btw, in both versions the timestamps shown in the HTML seem to be mostly the same within the same date, i.e. the hour:minute shown in both PM and group chats is the same for a given date. Is it a bug?

If it is related to #64, could you add some more details and screenshots to that issue? Also, as I mentioned in another issue, I don't currently have access to the latest iOS WhatsApp, so it's difficult for me to reproduce and debug the problem.

KnugiHK commented 7 months ago

I will leave this open until the next release.

KnugiHK commented 4 weeks ago

Released in 0.10.0.