JBGruber / rwhatsapp

An R package for working with WhatsApp data 💬

Unidentified Emojis #5

Closed mapale1 closed 4 years ago

mapale1 commented 5 years ago

Many of the emojis are not identified, for example <U+0001F913>, <U+0001F914> and <U+0001F644>.

JBGruber commented 5 years ago

Thanks for reporting. However, I can't reproduce your issue. Assuming the emojis below are the ones you're having problems with, they parse without a problem.

x <- c(
  "05/11/2019, 08:52 - Johannes Gruber: Is there a problem with this emoji?",
  "05/11/2019, 08:52 - Johannes Gruber: 🤓",
  "05/11/2019, 09:00 - Johannes Gruber: 🤔",
  "05/11/2019, 09:00 - Johannes Gruber: 🙄"
)

library(tidyr)
library(dplyr)

rwhatsapp::rwa_read(x) %>% 
  unnest(c(emoji, emoji_name)) %>% 
  left_join(rwhatsapp::emojis, by = "emoji")
#> # A tibble: 3 x 8
#>   time                author  text  source emoji emoji_name name  hex_runes
#>   <dttm>              <fct>   <chr> <chr>  <chr> <chr>      <chr> <chr>    
#> 1 2019-11-05 08:52:49 Johann… 🤓    text … 🤓    nerd face  nerd… 1F913    
#> 2 2019-11-05 09:00:49 Johann… 🤔    text … 🤔    thinking … thin… 1F914    
#> 3 2019-11-05 09:00:49 Johann… 🙄    text … 🙄    face with… face… 1F644

Created on 2019-11-05 by the reprex package (v0.3.0)

Maybe you can share an excerpt of the chat log that causes the problems? I'm matching emojis against a data.frame of known emojis, so in theory the approach is error-prone, as I couldn't find the exact set of Unicode characters WhatsApp is using.
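To illustrate the matching idea, here is a minimal sketch in base R. It converts the code points from the report into characters and checks them against a lookup table. The `known` data frame is a hypothetical stand-in, not the package's real table, and it is deliberately left incomplete to show what an unmatched emoji would look like.

```r
# Code points from the report, as they appear inside the <U+...> escapes
code_points <- c("0001F913", "0001F914", "0001F644")

# Convert each hexadecimal code point into the actual character
chars <- vapply(
  code_points,
  function(h) intToUtf8(strtoi(h, base = 16L)),
  character(1),
  USE.NAMES = FALSE
)

# Hypothetical lookup table of known emojis (assumption: incomplete on purpose)
known <- data.frame(
  emoji = c("\U0001F913", "\U0001F914"),
  name = c("nerd face", "thinking face"),
  stringsAsFactors = FALSE
)

# An emoji counts as identified only if it appears in the lookup table
matched <- chars %in% known$emoji
result <- data.frame(code_point = code_points, matched = matched)
print(result)
```

In this sketch the third code point comes back as unmatched, simply because it is missing from the stand-in table.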

shbkukuk commented 5 years ago

I have tried many times to use this project with my data, but I can't get it to work. How can I use this code block? Thanks and regards.

JBGruber commented 5 years ago

Please give a more detailed description of what you are trying to do and what isn't working for you. Ideally, provide a reproducible example of what exactly is causing an error for you (e.g., with this).

pintodossantos commented 4 years ago

Interestingly, I seem to have a similar problem, but the output differs between Rmarkdown, ggplot and the console.

Rmarkdown:

[Screenshot: Bildschirmfoto 2019-11-13 um 13 01 42]

Console:

[Screenshot: Bildschirmfoto 2019-11-13 um 13 02 36]

GGplot:

[Screenshot: Bildschirmfoto 2019-11-13 um 13 03 43]

JBGruber commented 4 years ago

Hi @pintodossantos. This is an issue with the font ggplot2 is using by default, not with rwhatsapp, which seems to do fine here. Are you using a Mac? I outlined a solution here.

pintodossantos commented 4 years ago

Thanks! I understand the ggplot part, but what about the Rmarkdown output?

JBGruber commented 4 years ago

Okay, now I get it. I don't really know a lot about how RStudio displays RMarkdown code chunks. I tried this on my machine and it works fine. On rstudio.cloud I get the same result as you. In both cases, the output looks fine when rendered to HTML. I guess this is another font or encoding issue. But the problem only affects how the emojis are displayed. The data itself is correct.
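One way to confirm that the data is correct regardless of how it is displayed is to look at the underlying code point instead of the rendered glyph. A minimal sketch in base R, assuming a UTF-8 session (the variable names are only for illustration):

```r
# The emoji is written as an escape so the script does not depend on the editor font
emoji <- "\U0001F913"

# utf8ToInt() returns the Unicode code point as an integer
code_point <- utf8ToInt(emoji)

# Format it in the usual U+XXXX notation
hex <- sprintf("U+%04X", code_point)
print(hex)
#> [1] "U+1F913"
```

If the code point matches the expected emoji, a broken glyph in the console or in a code chunk points to a font or rendering problem rather than to the data.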

What operating system do you use? And do you have the newest version of RStudio?

pintodossantos commented 4 years ago

> version
               _                           
platform       x86_64-apple-darwin15.6.0   
arch           x86_64                      
os             darwin15.6.0                
system         x86_64, darwin15.6.0        
status                                     
major          3                           
minor          6.1                         
year           2019                        
month          07                          
day            05                          
svn rev        76782                       
language       R                           
version.string R version 3.6.1 (2019-07-05)
nickname       Action of the Toes

and

> RStudio.Version()$version
[1] ‘1.2.5001’

JBGruber commented 4 years ago

I don't think this issue has anything to do with rwhatsapp. Rather, it is a problem with the printing method in RStudio on some systems.

If you explicitly call a different printing method, it should work for tables as well. For example, something like:

library(tidyr)
library(dplyr)

# x is the character vector of chat lines from the example above
emojis <- rwhatsapp::rwa_read(x) %>% 
  unnest(c(emoji, emoji_name)) %>% 
  left_join(rwhatsapp::emojis, by = "emoji")

emojis %>% 
  knitr::kable()

or:

emojis %>% 
  DT::datatable()