alperyilmaz / dav-exercises

Exercise questions submitted by Data Analysis and Visualization with R course students at YTU
GNU General Public License v3.0

Tom Sawyer #91

Open serappfd opened 6 years ago

serappfd commented 6 years ago

Question

Find the 10 most frequently used words in Tom Sawyer, excluding stop words, and show them in a bar chart.


library(gutenbergr)
library(tidytext)
library(dplyr)
library(tidyverse)
library(widyr)
library(ggplot2)
# Download "The Adventures of Tom Sawyer" (Project Gutenberg ID 74)
tom_sawyer <- gutenberg_download(74)
tom_sawyer

tom_sawyer %>%
  unnest_tokens(word, text) %>%              # split the text into one word per row
  filter(!word %in% stop_words$word) %>%     # drop common stop words
  count(word, sort = TRUE) %>%               # word frequencies, most frequent first
  slice_max(n, n = 10) %>%                   # keep the 10 most frequent words
  mutate(word = reorder(word, n)) %>%        # order the bars by frequency
  ggplot(aes(word, n, fill = word)) +
  geom_col(show.legend = FALSE) +
  labs(x = NULL, y = "n") +
  coord_flip()
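
To check the counting step without downloading the book, the same idea can be tried on a few made-up sentences. This is only a sketch: the `toy` data frame and its sentences are hypothetical stand-ins for the downloaded text, and `anti_join()` is used here as an alternative way to drop stop words, not as the approach the question requires.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy text standing in for the downloaded book
toy <- tibble(text = c(
  "Tom painted the fence and Tom laughed",
  "Huck watched Tom paint the fence",
  "the fence was white"
))

top_words <- toy %>%
  unnest_tokens(word, text) %>%              # one lowercase word per row
  anti_join(stop_words, by = "word") %>%     # remove stop words via a join
  count(word, sort = TRUE) %>%               # frequency of each remaining word
  slice_max(n, n = 2, with_ties = FALSE)     # keep exactly the top 2 rows

top_words
```

With `with_ties = FALSE` the result always has exactly the requested number of rows; the default (`with_ties = TRUE`) can return more than 10 words when several share the same count.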

Additional information

Originality

Is this question

Difficulty Level

Tags

tidytext, dplyr, ggplot, tidyverse, widyr