BryceStevenWilley / visioning_texts

A D3 project that locally visualizes your messages from Signal or Whatsapp
GNU General Public License v3.0

Smileys not shown correctly [BUG] #27

Open biejay opened 4 years ago

biejay commented 4 years ago

The smileys at the end render fine so far, but I guess when I upload a huge file they get messed up. I'll attach a demo picture at the end.

  1. When I hover over the smileys and add up their counts, the total doesn't match the actual count in my file (e.g. it shows about 300 hearts (❤), but I can find ~3000 in the .txt).

  2. Also, they don't seem to be centered (IMO the most used should be in the center, with the less used ones pushed toward the outside?).

No logs/errors in the console. I loaded a 14 MB file with 200,000 messages (takes roughly 10 min to compute).

[Attached screenshot: "Unbenannt" (Untitled)]

BryceStevenWilley commented 4 years ago

Great report! The size of the emojis might be an issue with the grapheme library that I'm using, or the regex that I use to check for emojis in texts.

The largest aren't centered right now because under the hood it's a d3 force-layout with collisions. I might be able to change where the smilies are attracted to, so I'll take a look at that.
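For illustration, one way the "attracted to" idea could look is a radial force whose target radius shrinks as the count grows. This is only a sketch, not the project's actual code: `targetRadius`, `buildSimulation`, and the node fields `count` and `r` are hypothetical names, and `d3` is passed in as an argument so the snippet makes no assumption about how the library is loaded. It relies on `d3.forceSimulation`, `d3.forceCollide`, and `d3.forceRadial` from d3-force.

```javascript
// Hypothetical helper: map an emoji's usage count to a target distance from
// the center. The most-used emoji gets radius 0, unused ones get maxRadius.
function targetRadius(count, maxCount, maxRadius) {
  if (maxCount <= 0) return maxRadius;
  const share = Math.min(Math.max(count / maxCount, 0), 1);
  return (1 - share) * maxRadius;
}

// Sketch of wiring the helper into a force simulation.
// Assumes each node looks like { count: number, r: number }.
function buildSimulation(d3, nodes, width, height) {
  const maxCount = nodes.reduce((m, n) => Math.max(m, n.count), 0);
  const maxRadius = Math.min(width, height) / 2;
  const radial = d3
    .forceRadial((n) => targetRadius(n.count, maxCount, maxRadius), width / 2, height / 2)
    .strength(0.1); // weak pull, so collisions still spread the nodes out

  return d3
    .forceSimulation(nodes)
    .force('collide', d3.forceCollide((n) => n.r))
    .force('radial', radial);
}
```

The pull is kept weak on purpose: the collision force still decides the final packing, and the radial force only biases frequently used emoji toward the middle.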

As far as the box being too small for all of the emoji, this is probably the most difficult. It's hard to get the exact size of each emoji in pixels, and even harder to figure out how big the box needs to be to fit them all. The heuristic right now is 50 * sqrt(# of emojis); it could probably be better.
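To make the heuristic concrete, here is a small sketch. `boxSide` restates the 50 * sqrt(n) rule from the comment above. `boxSideFromRadii` is a hypothetical alternative, not something the project does: it sizes the box from the summed bounding squares of the circles, divided by an assumed fill ratio (0.7 is a guess, not a measured value).

```javascript
// The current heuristic: side length grows with the square root of the
// emoji count, since it is the box's area that has to scale with the count.
function boxSide(emojiCount, pxPerEmoji = 50) {
  return pxPerEmoji * Math.sqrt(emojiCount);
}

// Hypothetical alternative: derive the side from the actual circle radii.
// Each circle is approximated by its bounding square (2r x 2r), and
// fillRatio is an assumed fraction of the box that ends up covered.
function boxSideFromRadii(radii, fillRatio = 0.7) {
  const totalArea = radii.reduce((sum, r) => sum + (2 * r) * (2 * r), 0);
  return Math.sqrt(totalArea / fillRatio);
}

console.log(boxSide(100)); // 500
console.log(boxSideFromRadii([10, 10, 10, 10], 1)); // 40
```

The second version reacts to a few very large emoji, which a count-only rule cannot see.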

And yeah, it's not the fastest, especially with that much data! :sweat_smile: I'll take a look at the text algorithms to make sure they aren't sub-optimal, but I'm not a JavaScript expert, so any suggestions on how to optimize the code are welcome!

BryceStevenWilley commented 4 years ago

Found an issue in the way I was counting the emoji; it should be a bit better now. Sometimes with multiple emoji in a row, grapheme-splitter still can't parse them correctly: Failed emoji:😘 (2) 56856, which is not the right code for that guy. It's confusing, but at least some of the codes are right now. Will look at this again later.
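For what it's worth, 56856 is 0xDE18, the low surrogate of 😘 (U+1F618) in UTF-16, so that log line may mean the string was read by code unit somewhere instead of by code point. The sketch below shows the difference and a code-point-based counter. It is not the project's code: `countEmoji` is a hypothetical helper, and it uses the built-in `\p{Extended_Pictographic}` regex property rather than grapheme-splitter.

```javascript
const kiss = '😘'; // U+1F618, stored as two UTF-16 code units

console.log(kiss.length);         // 2
console.log(kiss.charCodeAt(0));  // 55357 (0xD83D, high surrogate)
console.log(kiss.charCodeAt(1));  // 56856 (0xDE18, low surrogate)
console.log(kiss.codePointAt(0)); // 128536 (0x1F618, the real code point)

// Hypothetical counter: for...of walks a string by code point, so surrogate
// pairs stay together. Limitation: ZWJ sequences are counted per component,
// and variation selectors and skin tone modifiers are ignored.
function countEmoji(text) {
  const counts = new Map();
  for (const ch of text) {
    if (/\p{Extended_Pictographic}/u.test(ch)) {
      counts.set(ch, (counts.get(ch) || 0) + 1);
    }
  }
  return counts;
}

const counts = countEmoji('love you ❤️❤️ 😘😘😘');
console.log(counts.get('\u2764'));    // 2
console.log(counts.get('\u{1F618}')); // 3
```

Runs of the same emoji are counted individually here, since each code point is tested on its own instead of relying on a splitter to find the boundaries.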