Thankyouwords - Githubissues

espinielli commented 5 years ago

Added the thank-you data set and updated the documentation/vignette accordingly. While running the vignette example, I noticed that some words (even from the love data set) do not render and are plotted as an empty box instead. I do not know what causes this (UTF-8?)...

I have added the language name to lang_stats because it could be useful for filtering the subset of languages you want to target, for example according to your audience...

espinielli commented 5 years ago

Here is a snapshot of rendering thankyou_words_small as in the first word cloud example in the vignette:

library(ggwordcloud)  # also loads ggplot2
ggplot(thankyou_words_small, aes(label = word)) +
  geom_text_wordcloud() +
  theme_minimal()

[Screenshot, 2018-12-01 at 4:23:30 pm]

espinielli commented 5 years ago

So the problem is with (at least) Chinese characters:

library(dplyr)  # provides %>% and filter()
# keep only the Chinese entries
w <- thankyou_words %>% filter(iso_639_3 == "zho")
ggplot(w, aes(label = word)) +
  geom_text_wordcloud() +
  theme_minimal()

[Screenshot, 2018-12-01 at 4:30:07 pm]

Do I need to set some options??

espinielli commented 5 years ago

I have found this article on RStudio Community, but no solution yet...

espinielli commented 5 years ago

It looks like a platform issue... If I try the default example in the vignette, I get the same issue:

library(ggwordcloud)
#> Loading required package: ggplot2
data("love_words_small")
set.seed(42)
ggplot(love_words_small, aes(label = word, size = speakers)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 24) +
  theme_minimal()

[Screenshot, 2018-12-01 at 4:50:37 pm]

Platform specific? (I am on OSX)

lepennec commented 5 years ago

This is indeed an encoding nightmare! On Linux and Windows, everything seems to work when executed from the RStudio console. I do have an issue if I use the pdf() device... Even funnier, if I build the pkgdown site with build_site() I get this issue, but if I rebuild the reference with build_reference() I do not.

Character encoding is quite a complex thing!!!
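
For the pdf() device in particular, a Cairo-based device might be worth trying, since it handles UTF-8 text through system fonts rather than the PDF font encodings used by pdf(). The following is only a sketch under assumptions: it uses base graphics instead of ggwordcloud, the Chinese string is a sample chosen for illustration, and it assumes R was built with cairo support. Whether the glyphs actually appear still depends on a font with Chinese coverage being installed, so this does not guarantee a fix.

```r
# Sketch: draw a sample Chinese string with a Cairo-based PDF device.
# Assumption: R was built with cairo support; otherwise nothing is written.
out <- tempfile(fileext = ".pdf")

if (capabilities("cairo")) {
  cairo_pdf(out, width = 4, height = 3)
  plot.new()
  # "\u8c22\u8c22" is an illustrative sample string, not taken from the data set
  text(0.5, 0.5, "\u8c22\u8c22", cex = 3)
  dev.off()
}

# TRUE if the device was available and the file was written
file.exists(out)
```

Opening the resulting file and comparing it with the output of pdf() would show whether the device, rather than the data, is responsible for the boxes.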

espinielli commented 5 years ago

I am puzzled too. On Mac/RStudio, if I run View(thankyou_words) I can see all the Chinese, Arabic, ... characters. I thought that was good enough proof that I had saved the words correctly. But as the link I cited shows, life is much more complicated when graphics are involved. Maybe generating SVG would help?
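
A check that does not rely on View() is to inspect the strings directly. Below is a minimal base R sketch; the `words` vector is a hypothetical stand-in for a column such as thankyou_words$word, and the code point range used to flag Chinese characters (U+4E00 to U+9FFF, the CJK Unified Ideographs block) is only a rough heuristic.

```r
# Hypothetical sample standing in for thankyou_words$word
words <- c("thank you", "merci", "\u8c22\u8c22", "\u0634\u0643\u0631\u0627")

# TRUE for every element that is well-formed UTF-8
valid <- validUTF8(words)

# Declared encoding mark of each string ("unknown" for pure ASCII)
marks <- Encoding(words)

# Flag words containing at least one CJK Unified Ideograph (U+4E00..U+9FFF)
has_cjk <- vapply(
  words,
  function(w) {
    cp <- utf8ToInt(w)  # Unicode code points of the word
    any(cp >= 0x4E00 & cp <= 0x9FFF)
  },
  logical(1),
  USE.NAMES = FALSE
)

data.frame(word = words, valid = valid, encoding = marks, has_cjk = has_cjk)
```

If every word is valid UTF-8 here, the stored data is probably fine and the boxes are more likely a font or graphics device problem.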

lepennec / ggwordcloud

Thankyouwords #4