MasterScrat / Chatistics

💎 Python scripts to parse Messenger, Hangouts, WhatsApp and Telegram chat logs into DataFrames.
https://masterscrat.github.io/Chatistics/
MIT License
928 stars, 100 forks

Wrong distribution on breakdown plot #40

Open mustafababil opened 4 years ago

mustafababil commented 4 years ago

I exported a single WhatsApp chat and went on to plot its breakdown.

The first messages are from December 2017 (30 in total), but the plot shows no column for that month.

Although a total of 783 messages were exchanged in January 2020, the plot doesn't reflect that number and instead shows messages as if they were exchanged in the upcoming months of 2020.

My steps:

  1. Export the WhatsApp chat from iOS 13 via email.

  2. Put x_chat.txt into /raw_data

  3. Run commands:

     virtualenv chatistics
     source chatistics/bin/activate
     pip install -r requirements.txt
     python parse.py whatsapp --own-name "My Name"
     python visualize.py breakdown

  4. Result: https://i.imgur.com/TVgHaDX.png

Python 3.7.4

What could be the reason? How can I investigate further and fix it? Thanks.

MasterScrat commented 4 years ago

You can try printing out the messages to standard output:

python export.py -f stdout

Do you see all the messages you'd expect in there?

MasterScrat commented 4 years ago

Might be related to https://github.com/MasterScrat/Chatistics/pull/42

MasterScrat commented 4 years ago

#42 has been fixed and merged into master. Can you check whether it works now, @mustafababil?

mustafababil commented 4 years ago

@MasterScrat I will check tomorrow and report the results.

mustafababil commented 4 years ago

I'm sorry, it didn't help. I repeated the steps above. The result is as follows: https://i.imgur.com/KyWNIyz.png

The output of python export.py -f stdout is below (sorry, I deleted the chat content, and the lines are a bit messed up).

The loaded message count is correct, I think; previously it was around 10k. My first message is from December 2017, but my last message, from 17.01.2020, is not shown in stdout.

I am using iOS with Turkish as the language and the Netherlands as the region.

2020-01-23 10:28:11,524 [INFO ] [utils       ]: Could not find any data for platform telegram
2020-01-23 10:28:11,524 [INFO ] [utils       ]: Reading data for platform whatsapp
2020-01-23 10:28:11,549 [INFO ] [utils       ]: Could not find any data for platform messenger
2020-01-23 10:28:11,549 [INFO ] [utils       ]: Could not find any data for platform hangouts
2020-01-23 10:28:11,551 [INFO ] [utils       ]: Loaded a total of 41,618 messages (0 removed by filters)
          timestamp conversationWithName     senderName  outgoing                                                                                                 text language  platform
2017-12-12 21:41:14                 XXXX           XXXX     False  ‎       tr  whatsapp
2017-12-12 21:41:14                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:10:16                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:10:19                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:10:24                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:10:38                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:10:42                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:10:48                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:10:52                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:10:59                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:11:02                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:11:08                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:11:24                 XXXX  Me      True                                                               😂       tr  whatsapp
2017-12-12 23:11:27                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:11:37                 XXXX  Me      True                                                               😀       tr  whatsapp
2017-12-12 23:11:41                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:11:45                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:11:47                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:11:54                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:11:59                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:12:22                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:12:30                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:12:35                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:12:38                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:12:39                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:12:39                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:12:43                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-12 23:12:48                 XXXX  Me      True                                                                       tr  whatsapp
2017-12-12 23:12:48                 XXXX           XXXX     False                                                                       tr  whatsapp
2017-12-19 20:08:27                 XXXX  Me      True                                                               tr  whatsapp
2018-02-27 20:42:28                 XXXX  Me      True                                                               🖕       tr  whatsapp
2018-02-27 20:43:25                 XXXX  Me      True                                                               🖕       tr  whatsapp
2018-08-03 12:26:42                 XXXX  Me      True                                                                       tr  whatsapp
2018-08-03 12:34:15                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:34:23                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:34:27                 XXXX           XXXX     False                                                               🤪       tr  whatsapp
2018-08-03 12:43:59                 XXXX  Me      True                                                               😃       tr  whatsapp
2018-08-03 12:44:16                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:44:20                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:44:25                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:44:31                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:44:35                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:44:43                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:44:46                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:44:52                 XXXX           XXXX     False                                                               😥       tr  whatsapp
2018-08-03 12:45:46                 XXXX  Me      True                                                                       tr  whatsapp
2018-08-03 12:45:51                 XXXX  Me      True                                                                       tr  whatsapp
2018-08-03 12:45:55                 XXXX  Me      True                                                                       tr  whatsapp
2018-08-03 12:46:06                 XXXX           XXXX     False                                                                       tr  whatsapp
2018-08-03 12:46:09                 XXXX           XXXX     False                                                                       tr  whatsapp

And here are the first 5 lines of the WhatsApp export:

[12.12.2017 21:41:14] XXXX: Chat message
[12.12.2017 21:41:14] XXXX: Chat message
[12.12.2017 23:10:16] Me: Chat message
[12.12.2017 23:10:19] Me: Chat message
[12.12.2017 23:10:24] Me: Chat message
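
One way to cross-check the plot is to count messages per month straight from the raw export and compare the totals with the breakdown. The sketch below assumes every message line looks like the sample above (`[DD.MM.YYYY HH:MM:SS] Sender: text`); it is not the Chatistics parser, and `LINE_RE`, `parse_timestamp` and `messages_per_month` are hypothetical names. It also shows what happens if the same timestamps are read month-first: a day/month swap would move messages into later months, which is one possible explanation for the symptom, not one confirmed in this thread.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line format, taken from the sample above:
# "[DD.MM.YYYY HH:MM:SS] Sender: text"
LINE_RE = re.compile(r"^\[(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2})\] ([^:]+): (.*)$")


def parse_timestamp(raw, day_first=True):
    """Parse a timestamp, reading the first field as day or as month."""
    fmt = "%d.%m.%Y %H:%M:%S" if day_first else "%m.%d.%Y %H:%M:%S"
    return datetime.strptime(raw, fmt)


def messages_per_month(lines, day_first=True):
    """Count messages per "YYYY-MM" bucket, skipping lines that don't match."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # e.g. continuation lines of a multi-line message
        ts = parse_timestamp(match.group(1), day_first)
        counts[ts.strftime("%Y-%m")] += 1
    return counts


sample = [
    "[12.12.2017 21:41:14] XXXX: Chat message",
    "[12.12.2017 23:10:16] Me: Chat message",
    "[05.01.2020 09:00:00] Me: Chat message",  # hypothetical line: 5 January 2020
]

print(messages_per_month(sample))                   # day-first reading
print(messages_per_month(sample, day_first=False))  # month-first reading
```

With the day-first reading the hypothetical third line lands in 2020-01; with the month-first reading it lands in 2020-05. To run this on a real export, pass the lines of the file in /raw_data instead of `sample`.
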
mar-muel commented 4 years ago

Hi @mustafababil! By default stdout only shows the first 50 messages. You can use

python export.py -f stdout -n -1

which prints all messages up to the second-to-last one. We limit the output to 50 by default so that your computer doesn't crash if you have millions of messages :)

Please let us know whether this indeed shows the messages (up to the second-to-last one).
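
The "up to the second-to-last one" behaviour is what a negative count gives when taking the head of a sequence: it drops that many items from the end. Below is a minimal sketch of that convention with a plain Python list. Whether export.py implements `-n` this way internally is an assumption, and `limit_rows` is a hypothetical helper, not part of Chatistics. pandas `DataFrame.head(n)` follows the same convention for negative `n`.

```python
def limit_rows(rows, n=50):
    """Return the first n rows; a negative n drops the last |n| rows instead."""
    return rows[:n]


messages = [f"message {i}" for i in range(1, 101)]  # 100 dummy messages

print(len(limit_rows(messages)))      # default cap of 50
print(len(limit_rows(messages, -1)))  # everything except the last message
print(limit_rows(messages, -1)[-1])   # the second-to-last message
```

So with `-n -1` the very last message of the chat would be missing from the output even when parsing went fine.
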