JBGruber / rwhatsapp

An R package for working with WhatsApp data 💬

Firstly, thanks for this package - it has literally saved me and my future Phd #11

Closed by katielou1204 4 years ago

katielou1204 commented 4 years ago

When I use my own txt file, the number-of-messages plot always shows NA. How did you get around this? It doesn't seem to happen with your data.

Also, I think a really good function would be one that calculates the average number of words each person uses per message. After all, it might look as though I have sent a lot more messages, but they could all be single-word ones.

Just an idea.

JBGruber commented 4 years ago

Hi @katielou1204, cool title, it made my day 😃! I'm happy I could help and hope to see some research in the future with the package.

To your problem: you are right! I noticed this a while ago but was in a rush and so removed the NA author but didn't show how in the vignette/readme. I just added the commands there.

The problem is that info messages from WhatsApp like "Messages to this group are now secured with end-to-end encryption. Tap for more info" show up with author NA. Simply remove those immediately after reading in your history with:

library("rwhatsapp") # provides rwa_read()
library("dplyr")
chat <- rwa_read("/home/johannes/WhatsApp Chat.txt") %>%
  filter(!is.na(author)) # remove messages without author

So thanks for pointing this out.
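If you want to see what you are about to drop before filtering, something like the following might help. It is only a sketch: the `chat` tibble here is a hand-made stand-in for the output of `rwa_read()` (the real one has more columns), and the names `system_msgs` and `chat_clean` are just made up for illustration.

```r
library("dplyr")

# Toy stand-in for the tibble returned by rwa_read(); assumed columns only
chat <- tibble(
  author = factor(c(NA, "Katie", "Johannes", "Katie")),
  text = c(
    "Messages to this group are now secured with end-to-end encryption. Tap for more info",
    "Hi",
    "Hello, how are you?",
    "Fine thanks"
  )
)

# Rows without an author are the info messages generated by WhatsApp
system_msgs <- chat %>%
  filter(is.na(author))
nrow(system_msgs)

# Keep only messages that were written by a person
chat_clean <- chat %>%
  filter(!is.na(author))
count(chat_clean, author)
```

Checking `system_msgs` once is a cheap way to confirm that only info messages, and no real messages, end up with author NA in your own file.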

I like your idea of calculating the average number of words per person! I'm not sure, though, whether it makes sense as a function. I'll think about it. Here is how you can calculate it:


library("rwhatsapp")
library("dplyr")
library("stringi")
chat <- rwa_read("/home/johannes/WhatsApp Chat.txt") %>% 
  filter(!is.na(author)) # remove messages without author

chat %>%
  mutate(num_words = stri_count_words(text),
         author = as.character(author)) %>% 
  group_by(author) %>% 
  summarise(mean_num_words = mean(num_words))
#> # A tibble: 4 x 2
#>   author          mean_num_words
#>   <chr>                    <dbl>
#> 1 Alexandra Ils             6.73
#> 2 Artur Kunst               5.50
#> 3 Erika Ils                 6.01
#> 4 Johannes Gruber           7.23

Created on 2020-01-25 by the reprex package (v0.3.0)
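Since the concern was that many messages could be one-word replies, the mean alone may hide that. A possible extension, sketched on a small made-up `chat` tibble rather than a real export, adds the median and the share of one-word messages per author. The column names `n_messages`, `median_num_words` and `share_one_word` are my own choices, not something the package provides.

```r
library("dplyr")
library("stringi")

# Made-up example data; in practice use the filtered output of rwa_read()
chat <- tibble(
  author = c("Katie", "Katie", "Katie", "Johannes"),
  text = c("Ok", "Yes", "See you at the lab tomorrow", "Sounds good to me")
)

stats <- chat %>%
  mutate(num_words = stri_count_words(text)) %>%
  group_by(author) %>%
  summarise(
    n_messages = n(),                     # how many messages each person sent
    mean_num_words = mean(num_words),     # average words per message
    median_num_words = median(num_words), # less sensitive to a few long messages
    share_one_word = mean(num_words == 1) # fraction of one-word messages
  )
stats
```

Reading the message count next to the median and the one-word share makes it easier to tell whether someone sends many short messages or fewer long ones.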

katielou1204 commented 4 years ago

Hi Johannes,

Thanks for the reply and the help with the code - I’ve only just seen it.

I’ll be sure to reference you and the package in my PhD (in about 4 years) 😊

Best Katie

JBGruber commented 4 years ago

Great! Can this be closed then?

katielou1204 commented 4 years ago

Yes :) Thank you