InseadDataAnalytics / INSEADAnalytics

Searching for positive and negative words within twitter sentiment analysis #37

Open mrlajoie opened 8 years ago

mrlajoie commented 8 years ago

To search for positive and negative words in a Twitter sentiment analysis, I found the following code (see the sentiment analysis forked on my GitHub, or this website: http://www.r-bloggers.com/twitter-sentiment-analysis-with-r/). However, both versions have a line containing only {, and when I try to run the code, the console responds with + and doesn't go any further. What should I do?

CODE:

score.sentiment = function(sentences, pos.words, neg.words, .progress='none')

{

require(plyr)

require(stringr)

# we got a vector of sentences. plyr will handle a list

# or a vector as an "l" for us

# we want a simple array ("a") of scores back, so we use

# "l" + "a" + "ply" = "laply":

scores = laply(sentences, function(sentence, pos.words, neg.words) {

# clean up sentences with R's regex-driven global substitute, gsub():

sentence = gsub('[[:punct:]]', '', sentence)

sentence = gsub('[[:cntrl:]]', '', sentence)

sentence = gsub('\\d+', '', sentence)

# and convert to lower case:

sentence = tolower(sentence)

# split into words. str_split is in the stringr package

word.list = str_split(sentence, '\\s+')

# sometimes a list() is one level of hierarchy too much

words = unlist(word.list)

# compare our words to the dictionaries of positive & negative terms

pos.matches = match(words, pos.words)

neg.matches = match(words, neg.words)

# match() returns the position of the matched term or NA

# we just want a TRUE/FALSE:

pos.matches = !is.na(pos.matches)

neg.matches = !is.na(neg.matches)

# and conveniently enough, TRUE/FALSE will be treated as 1/0 by sum():

score = sum(pos.matches) - sum(neg.matches)

return(score)

}, pos.words, neg.words, .progress=.progress )

scores.df = data.frame(score=scores, text=sentences)

return(scores.df)

}

tevgeniou commented 8 years ago

This seems to work fine. However, you may have an issue with the quotes (e.g. you have not closed a quote, or a quote is single instead of double, etc.).

sentences = c("this is a job", "this is a dog", "this is a cat")
pos.words = c("this", "is")
neg.words = c("a")
score.sentiment(sentences, pos.words, neg.words)

  score          text
1     1 this is a job
2     1 this is a dog
3     1 this is a cat

tevgeniou commented 8 years ago

code looks fine, too

mrlajoie commented 8 years ago

Whenever I enter it, it doesn't run; it just leads me to:

score.sentiment = function(sentences, pos.words, neg.words, .progress='none')

+ {
+ require(plyr)
+ require(stringr)

What is it that you are doing differently than me?

tevgeniou commented 8 years ago

All I did was copy all the code above (from score.sentiment = function... to return(scores.df) }) into a file, then source that file (to load the function "score.sentiment"), then create the sentences, pos.words and neg.words above, and then call the function using score.sentiment(sentences, pos.words, neg.words).
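The steps above can be sketched roughly as follows. This is only an illustration: the file name score_sentiment.R is hypothetical, and so that the sketch runs on its own it writes a simplified base-R stand-in for score.sentiment to that file (the real function from the first post, which needs plyr and stringr, would be pasted there instead).

```r
# Hypothetical file name; in practice this is the file where you pasted
# the full score.sentiment definition from the first post.
script.path = file.path(tempdir(), "score_sentiment.R")

# Assumption for this sketch only: a simplified base-R stand-in for
# score.sentiment, written to the file so the example is self-contained.
writeLines(c(
  "score.sentiment = function(sentences, pos.words, neg.words) {",
  "  scores = sapply(sentences, function(sentence) {",
  "    words = unlist(strsplit(tolower(sentence), '\\\\s+'))",
  "    sum(words %in% pos.words) - sum(words %in% neg.words)",
  "  }, USE.NAMES = FALSE)",
  "  data.frame(score = scores, text = sentences)",
  "}"
), script.path)

# Step 1: source the file to load the function into the session.
source(script.path)

# Step 2: create the inputs.
sentences = c("this is a job", "this is a dog", "this is a cat")
pos.words = c("this", "is")
neg.words = c("a")

# Step 3: call the function.
result = score.sentiment(sentences, pos.words, neg.words)
print(result)
```

Sourcing a file reads the whole definition at once, so the console never stops at a + continuation prompt halfway through the function.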

tevgeniou commented 8 years ago

Generally, check for a misplaced quote. I wonder why the code you copied became "grey" above. This happens when you use a backquote (try it: GitHub issues are like .Rmd, where backquotes indicate code, so it is good to use them when you include code in the text of an issue).
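One way to check pasted code for an unclosed quote or brace, sketched here under the assumption that the code is available as a string, is to ask R to parse it without running it. The helper name parses.ok is hypothetical; parse() and tryCatch() are base R.

```r
# Hypothetical helper: TRUE if the text parses as complete, valid R code,
# FALSE if parse() raises an error (e.g. unclosed quote or missing brace).
parses.ok = function(code.text) {
  tryCatch({
    parse(text = code.text)
    TRUE
  }, error = function(e) FALSE)
}

# A complete definition.
good = "f = function(x) {\n  paste('value:', x)\n}"

# The closing brace is missing; at the console this leaves you at the + prompt.
bad.brace = "f = function(x) {\n  paste('value:', x)\n"

# The quote after 'value: is never closed.
bad.quote = "f = function(x) {\n  paste('value:, x)\n}"

print(parses.ok(good))
print(parses.ok(bad.brace))
print(parses.ok(bad.quote))
```

For code saved in a file, parse(file = ...) performs the same check and the error message reports the line where parsing stopped.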

mrlajoie commented 8 years ago

I just copied the code too. I am unsure where the problem lies. Thanks for your help.

ayushivishwakarma30 commented 8 years ago

When I call this function with analysis = score.sentiment(s.text, pos.words, neg.words), I get this error: Error in match(words, exc.words) : argument "exc.words" is missing, with no default. I couldn't solve it. Please help.
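An error of this form usually means the function being called declares a parameter (here exc.words) that has no default and was not passed in; the version posted at the top of this thread has no such parameter, so a different version is presumably in use. A minimal sketch with a hypothetical stand-in function, count.matches, shows the behaviour; what exc.words is meant to contain is an assumption.

```r
# Hypothetical stand-in: takes a third word list, exc.words, with no default.
count.matches = function(words, pos.words, neg.words, exc.words) {
  sum(!is.na(match(words, pos.words))) -
    sum(!is.na(match(words, neg.words))) -
    sum(!is.na(match(words, exc.words)))
}

words = c("this", "is", "a", "dog")

# Calling with only three arguments fails once exc.words is first used.
msg = tryCatch(
  count.matches(words, c("this", "is"), c("a")),
  error = function(e) conditionMessage(e)
)
print(msg)

# Supplying the fourth argument (an empty list here) lets the call complete.
score = count.matches(words, c("this", "is"), c("a"), exc.words = character(0))
print(score)
```

So the call would need either a fourth argument or a function definition that does not use exc.words.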