Please fill in the form to submit an exercise question. State the question under the Question section, and be as specific as possible when describing the problem. In the hint chunk, you can provide a statement (which function to use, which columns to join, etc.) or the first 1-2 lines of the expected result. Please refer to the GitHub markdown table instructions if you need to include a table.
In the solution chunk, please provide the code that solves the problem. Your solution should be runnable on anyone's computer, so please don't include file locations from your own computer when importing data. The data should come from an R package or an online source.
Question
Using sentiment analysis methods, find the most frequently used words and the most important words in the project abstract below.
my_project <- c("ABSTRACT",
"Heavy metals are preferred as metals whose are heavier five times than water molecule and show toxemic effect event at low concentrations.",
"These metals consist of zinc, silver, lead, iron, chromium, copper, arsenic, cadmium and nickel metals.",
"The pollution in drinkable water caused by these heavy metals possesses a great threat to the environment, peoples, and other living organisms in recent years.",
"This type of pollution can be observed significantly in areas where industrialization is intense.",
"A lot of factories in various sectors such as food, pharmaceutical, chemistry, cosmetic and beverage, consumes a big amount of water during their productions and the used water, unless handled carefully, mixed with crude water in river, lakes, seas etc. in turn, causing pollution on a big scale of water.",
"Polluted waters cause various diseases like cancer Alzheimer, Parkinson and heart dysfunctions mainly.",
"Also, underground water is commonly used for agriculture.",
"Therefore, heavy metals can be taken into the body by consuming food watered with contaminated water, since these foods took these heavy metals along with water.")
my_project
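As a possible starting point for the word-frequency part, here is a minimal sketch. It assumes the dplyr and tidytext packages are installed. The two-sentence `sample_text` vector and the names `line`, `text` and `word_counts` are hypothetical stand-ins chosen for illustration; replace `sample_text` with `my_project` to run it on the abstract.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for my_project, kept short for illustration
sample_text <- c("Heavy metals pollute water.",
                 "Polluted water causes diseases.")

word_counts <- tibble(line = seq_along(sample_text), text = sample_text) %>%
  unnest_tokens(word, text) %>%           # one lowercase word per row
  anti_join(stop_words, by = "word") %>%  # drop common English stop words
  count(word, sort = TRUE)                # most frequent words first

word_counts
```

Removing stop words before counting is a design choice: without it, words such as "the" and "of" would probably dominate the top of the list.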
About images: If your question or solution contains an image, please attach necessary images by dragging them here or copy/pasting from clipboard. After the upload, a markdown style link to image will be generated for you.
Additional information
A specific result is not required; a brief analysis of the results will be enough.
Originality
Please mark the relevant option with an x (e.g. [x])
Is this question
[x] Original
[ ] Inspired
[ ] Paraphrased or copied
If you select Inspired or Paraphrased please provide the links in markdown format ( [link](http://example.com) ). Please provide all relevant links. You can refer to DataCamp course pages if you're inspired by them.
Difficulty Level
In your opinion, what is the difficulty level of the question? (Note: this can be modified by the instructor after submission.)
[ ] Easy / Beginner (using a single command or concept is enough to solve the question)
[x] Intermediate (combining multiple commands, concepts is needed to solve the question)
[ ] Difficult (combining multiple commands with non-default options and looking for additional information online might be needed)
Tags (optional)
Please provide a comma-separated list of dplyr verbs (e.g. summarize, left join) or concepts (e.g. text mining) that you think are relevant to the question
group_by, count, unnest_tokens, filter, sentiment analysis, bind_tf_idf, bigram
Before submitting
Please click Preview and check that your submission is rendered correctly
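# Assumes word_project holds word counts with columns source, word, n.
library(dplyr)
totalword_project <- word_project %>%
  mutate(total_word = sum(n))  # total number of words
totalword_project
progect_A <- left_join(word_project, totalword_project, by = c("source", "word", "n"))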
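The fragment above does not show how `word_project` is built or how the "most important" words are ranked. One possible way to do both is sketched below, assuming the dplyr and tidytext packages are installed. Treating each sentence as its own `source` is an assumption made here so that tf-idf has several documents to compare. The `docs` table, its three sentences and the name `project_tf_idf` are hypothetical.

```r
library(dplyr)
library(tidytext)

# Hypothetical input: one row per sentence, each treated as a document
docs <- tibble(
  source = c("s1", "s2", "s3"),
  text = c("heavy metals pollute water",
           "polluted water causes diseases",
           "underground water feeds agriculture")
)

# Per-source word counts, with the same columns the fragment above expects
word_project <- docs %>%
  unnest_tokens(word, text) %>%
  count(source, word, sort = TRUE)

# tf-idf gives a low weight to words that occur in every source
project_tf_idf <- word_project %>%
  bind_tf_idf(word, source, n) %>%
  arrange(desc(tf_idf))

project_tf_idf
```

In this toy input "water" occurs in every sentence, so its idf is log(3/3) = 0 and its tf-idf is 0, while words confined to a single sentence get a positive score. With only one `source`, every word would score 0, which is why the split into several documents matters.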