Closed katielou1204 closed 4 years ago
Hi @katielou1204, cool title, it made my day 😃! I'm happy I could help and hope to see some research in the future with the package.
To your problem: you are right! I noticed this a while ago but was in a rush and so removed the NA author but didn't show how in the vignette/readme. I just added the commands there.
The problem is that info messages from WhatsApp like "Messages to this group are now secured with end-to-end encryption. Tap for more info" show up with author NA. Simply remove those immediately after reading in your history with:
library("rwhatsapp") # provides rwa_read()
library("dplyr")
chat <- rwa_read("/home/johannes/WhatsApp Chat.txt") %>%
  filter(!is.na(author)) # remove messages without author
So thanks for pointing this out.
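To see what the filter does without needing a chat export, here is a minimal sketch on a made-up data frame. The column names author and text match the ones used in this thread; the rows themselves, including the names Alice and Bob, are invented for illustration and are not real rwa_read() output.

```r
library("dplyr")

# Toy stand-in for the result of rwa_read(); all rows are made up
chat <- tibble(
  author = factor(c(NA, "Alice", "Bob", "Alice")),
  text = c(
    "Messages to this group are now secured with end-to-end encryption. Tap for more info",
    "Hi there",
    "Hello!",
    "How are you doing today?"
  )
)

# Messages per author: the info message ends up in its own NA row
count(chat, author)

# Drop rows without an author, then count again
chat_clean <- chat %>%
  filter(!is.na(author))
count(chat_clean, author)
```

After the filter, the NA row is gone from the counts, so it should no longer appear in a plot built from them.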
I like your idea to calculate the average number of words per message for each person! I'm not sure though if it makes sense as a function. I'll think about it. Here is how you can calculate it:
library("rwhatsapp")
library("dplyr")
library("stringi")
chat <- rwa_read("/home/johannes/WhatsApp Chat.txt") %>%
  filter(!is.na(author)) # remove messages without author
chat %>%
  mutate(num_words = stri_count_words(text),
         author = as.character(author)) %>%
  group_by(author) %>%
  summarise(mean_num_words = mean(num_words))
#> # A tibble: 4 x 2
#>   author          mean_num_words
#>   <chr>                    <dbl>
#> 1 Alexandra Ils             6.73
#> 2 Artur Kunst               5.50
#> 3 Erika Ils                 6.01
#> 4 Johannes Gruber           7.23
Created on 2020-01-25 by the reprex package (v0.3.0)
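If the concern is that someone sends many messages that are only one word long, the same pattern can be extended to report the message count and the share of one-word messages next to the mean. This is only a sketch on made-up data: the column names n_messages and share_one_word are chosen here for illustration and are not part of the package.

```r
library("dplyr")
library("stringi")

# Toy chat with made-up messages; real data would come from rwa_read()
chat <- tibble(
  author = c("Alice", "Alice", "Alice", "Bob"),
  text = c("Ok", "Yes", "See you tomorrow", "I will bring the slides along")
)

word_stats <- chat %>%
  mutate(num_words = stri_count_words(text)) %>%
  group_by(author) %>%
  summarise(
    n_messages = n(),                      # how many messages each person sent
    mean_num_words = mean(num_words),      # average words per message
    share_one_word = mean(num_words == 1)  # fraction of one-word messages
  )

word_stats
```

In this toy data Alice sends more messages than Bob, but most of hers are a single word, which is the kind of pattern a plain message count would hide.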
Hi Johannes,
Thanks for the reply and the help with the code - I’ve only just seen it.
I’ll be sure to reference you and the package in my PhD (in about 4 years) 😊
Best Katie
Great! Can this be closed then?
Yes :) Thank you
When I use my own txt file, the number of messages plot always shows an NA author. How did you get around this? It doesn't seem to happen with your data.
Also, I think a really good function would be one that gives the average number of words each person uses per message. After all, it might look as though I have sent a lot more messages, but they could each be just one word.
Just an idea.