[BUG] Word Use Difference - does not load up (russian text) parser from VK

BryceStevenWilley / visioning_texts

A D3 project that locally visualizes your messages from Signal or Whatsapp

GNU General Public License v3.0

[BUG] Word Use Difference - does not load up (russian text) parser from VK #36

Open sukhinsergey opened 4 years ago

sukhinsergey commented 4 years ago

Hi Bryce!

Thank you for making such a tool. I saw it on Reddit and am trying to use it.

But my text parsed from VK messenger (converted from HTML to txt) can't be processed by several parts of your program, e.g. Word Use Difference.

When I try the same thing with a WhatsApp conversation, it works just fine. The format is the same for both texts.

Example of text below:

[25.03.2018, 10:53:08] Вы: экивоки знаешь?) [25.03.2018, 10:52:58] Ирен: я очень круто играю в активити

The text is not recognised, but emojis are fine.

Kindly asking for your help!
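
For readers hitting the same symptom: one possible cause (not confirmed anywhere in this thread) is an ASCII-only word pattern. In JavaScript, `\w` matches only `[A-Za-z0-9_]`, so `"экивоки".match(/\w+/g)` returns `null`, while emojis may be handled by a separate path. The sketch below parses a line in the format shown above and tokenizes it with a Unicode-aware pattern. `parseLine`, `tokenize`, and `LINE_RE` are hypothetical names for illustration, not functions from visioning_texts.

```javascript
// Hypothetical helpers; not taken from the visioning_texts source.

// Matches lines like "[25.03.2018, 10:53:08] Вы: экивоки знаешь?)".
// Assumes the sender name itself contains no colon.
const LINE_RE = /^\[(\d{2}\.\d{2}\.\d{4}), (\d{2}:\d{2}:\d{2})\] ([^:]+): (.*)$/u;

// Split one exported line into its parts, or return null if it does not match.
function parseLine(line) {
  const m = LINE_RE.exec(line);
  if (!m) return null;
  const [, date, time, sender, text] = m;
  return { date, time, sender, text };
}

// Unicode-aware tokenizer: \p{L} matches a letter in any script,
// so Cyrillic words are kept instead of being dropped.
function tokenize(text) {
  return text.toLowerCase().match(/\p{L}+/gu) || [];
}

const parsed = parseLine("[25.03.2018, 10:53:08] Вы: экивоки знаешь?)");
console.log(parsed.sender, tokenize(parsed.text));
```

The `u` flag is required for `\p{L}` to work; without it the pattern is treated as a literal `p{L}`.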

sukhinsergey commented 4 years ago

I made it work (almost) with additional volume per file. But now it won't load, since the file (2 years' worth of conversation) is larger than 20 MB. Have you had a similar problem, and how did you solve it?

I run the script via the web or via MAMP, and it crashes...

sukhinsergey commented 4 years ago

Update:

The parser has been updated, and now Word Use shows all the words, without any filtering.

Is there any option to limit the words displayed? Perhaps by setting a minimum number of uses?

BryceStevenWilley commented 3 years ago

Hey @sukhinsergey! Sorry for not responding for a literal year; I think I missed this notification originally.

The minimum word count is defined at https://github.com/BryceStevenWilley/visioning_texts/blob/253d2ffabdb3872850e40c2c6ffc1178c0f2dcf6/src/math.js#L387 and is currently set to 10. It was definitely chosen based on my own conversations, which were pretty small in hindsight. Raising that should help you.
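
As a rough illustration of what a minimum word count does, here is a minimal sketch of a count-then-filter step. It is not the code at the linked line in `src/math.js`; `countWords`, `filterByMinCount`, and the `minCount` parameter are hypothetical names chosen for this example.

```javascript
// Hypothetical helpers; the real threshold lives in src/math.js of the project.

// Count how often each token occurs.
function countWords(tokens) {
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  return counts;
}

// Keep only words used at least minCount times.
function filterByMinCount(counts, minCount) {
  return new Map([...counts].filter(([, n]) => n >= minCount));
}

const counts = countWords(["да", "нет", "да", "привет", "да"]);
console.log(filterByMinCount(counts, 2));
```

With a two-year conversation, a larger threshold (for example 50 or 100 instead of 10) would leave far fewer words to display; the right value depends on the size of the chat.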