gtadiparthi / whatsapp-parser-lite

Parses WhatsApp Chat logs
MIT License
15 stars 11 forks

errors while executing #1

Open 5ll opened 8 years ago

5ll commented 8 years ago

Hi,

I get the following error when trying to parse my chatlog:

Traceback (most recent call last):
  File "parse_whatsapp.py", line 25, in <module>
    parse_whatsapp()
  File "parse_whatsapp.py", line 16, in parse_whatsapp
    c.feed_lists()
  File "/home/5ll/whatsappparser/whatsapp-parser-lite-master/transcript.py", line 73, in feed_lists
    self.datelist.append(prevRawDate)
UnboundLocalError: local variable 'prevRawDate' referenced before assignment

I don't really understand it. Can you give me a hint? Thanks!
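As a general hint: Python raises UnboundLocalError when a function reads a local variable that was never assigned on the code path actually taken. The real transcript.py is not shown in this thread, so the following is only a hypothetical reproduction. The function names and the "/" check are assumptions made for illustration, not the parser's actual logic.

```python
def collect_dates_fragile(lines):
    """Hypothetical stand-in for a parser loop; NOT the real transcript.py code."""
    dates = []
    for line in lines:
        first_field = line.split(",")[0]
        if "/" in first_field:           # assumed check: only slash-separated dates count
            prev_raw_date = first_field  # assigned only when a date is recognised
        dates.append(prev_raw_date)      # unassigned if no date has been seen yet
    return dates


def collect_dates_guarded(lines):
    """Same loop, but the variable is initialised before it is read."""
    dates = []
    prev_raw_date = None
    for line in lines:
        first_field = line.split(",")[0]
        if "/" in first_field:
            prev_raw_date = first_field
        if prev_raw_date is not None:    # skip lines seen before any recognised date
            dates.append(prev_raw_date)
    return dates


# A line whose date the assumed check does not recognise (dd.mm.yyyy)
german_style = ["09.06.2016, 14:02 - Alice: Hello"]

try:
    collect_dates_fragile(german_style)
except UnboundLocalError as exc:
    print("fragile version failed:", exc)

print(collect_dates_guarded(german_style))
```

If the parser follows a similar pattern, the error would suggest that the first line of the chat log was not recognised as the start of a message.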

gtadiparthi commented 8 years ago

Can you send me a sample of your input transcript by anonymizing and masking the real text with some garbled text so that I can reproduce your error?

The code seems to work with all the combinations of whatsapp chat transcript that I tested.

5ll commented 8 years ago

Hi,

Yes, I will provide you with a sample file. What I have found out so far:

The parser (transcript.py) breaks on date formats that do not follow mm/dd/yyyy, e.g. the German format dd.mm.yyyy. It also does not seem to detect messages containing line breaks correctly: it treats them as two messages (if I change the date format, it sometimes works and then shows this behaviour).

The following script seems to get the date format right: https://github.com/ravikiranj/whatsapp-chat-analysis/

Sadly, my Python skills are not good enough to really change or improve your script, but I will try to provide as much input as possible.

short2.txt

5ll commented 8 years ago

This regexp should match both our cases and capture the username in its own capture group. Maybe you can use it to tell messages apart and to capture \n inside messages correctly:

[0-9]{1,2}(.|\/)[0-9]{1,2}(.|\/)[0-9]{1,2},\s[0-9]{1,2}:[0-9]{2}(:[0-9]{2}\s(PM|AM):)?\s(-\s)?(\w+):\s
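To illustrate the idea, here is a minimal sketch of using a "message start" pattern to group lines, so that a line without a date prefix is treated as a continuation of the previous message. The pattern below is a simplified adaptation, not the regexp quoted above, and the two sample line layouts ("dd.mm.yy, HH:MM - Name: text" and "m/d/yy, h:mm:ss AM/PM: Name: text") are assumptions about what the exports look like. The names MESSAGE_START and group_messages are made up for this example.

```python
import re

# Assumed layouts: "09.06.16, 14:02 - Alice: text" and "6/9/16, 2:02:11 PM: Bob: text"
MESSAGE_START = re.compile(
    r"^(?P<date>\d{1,2}[./]\d{1,2}[./]\d{2,4}),\s"           # date with . or / separators
    r"(?P<time>\d{1,2}:\d{2}(?::\d{2})?(?:\s(?:AM|PM))?)"   # time, optional seconds and AM/PM
    r"(?::\s|\s-\s)"                                        # separator before the username
    r"(?P<user>[^:]+):\s"                                   # username up to the next colon
    r"(?P<text>.*)$"                                        # first line of the message
)


def group_messages(lines):
    """Group raw lines into messages; unmatched lines extend the previous message."""
    messages = []
    for line in lines:
        line = line.rstrip("\n")
        match = MESSAGE_START.match(line)
        if match:
            messages.append(match.groupdict())
        elif messages:
            # No date prefix: treat as a continuation of the last message
            messages[-1]["text"] += "\n" + line
    return messages


sample = [
    "09.06.16, 14:02 - Alice: Hello",
    "second line",
    "6/9/16, 2:02:11 PM: Bob: Hi there",
]
for message in group_messages(sample):
    print(message["user"], repr(message["text"]))
```

Lines that appear before the first recognised message are silently dropped in this sketch; a real parser might prefer to report them.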

On 2016-06-09 02:52, gtadiparthi wrote:

> Can you send me a sample of your input transcript by anonymizing and masking the real text with some garbled text so that I can reproduce your error?
>
> The code seems to work with all the combinations of whatsapp chat transcript that I tested.


gtadiparthi commented 8 years ago

Ah, I didn't realize that WhatsApp uses different formats in different countries. Thanks for the regex idea.

If you implement it, feel free to push the changes and I will accept them.

5ll commented 8 years ago

No, I didn't implement it, because my Python skills are not sufficient. And I don't have time to dig into your code, sorry!

I thought about writing a short converter script that translates my time format to "your" time format. I think it should be feasible with something like this:

datetime.datetime.strptime(date_string, format1).strftime(format2) (from: http://stackoverflow.com/questions/2265357/parse-date-string-and-change-format)

If I get it to work, I will mail it to you!
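A minimal sketch of that conversion, assuming the source dates look like dd.mm.yyyy and the parser expects mm/dd/yyyy (both format strings are assumptions based on this thread, and convert_date is a made-up helper name):

```python
from datetime import datetime


def convert_date(date_string, source_format="%d.%m.%Y", target_format="%m/%d/%Y"):
    """Re-format a date string; raises ValueError if it does not match source_format."""
    return datetime.strptime(date_string, source_format).strftime(target_format)


print(convert_date("09.06.2016"))  # 06/09/2016
```

For exports with two-digit years, "%y" would replace "%Y" in the format strings. Only the date prefix of each line should be converted, not the message text.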
