clarencejychan / nephew-pipeline

Data pipeline for proof of concept /r/nba player brand analysis

Investigate methodology in order to create a model for sentiment analysis #5

Open clarencejychan opened 4 years ago

clarencejychan commented 4 years ago

We will most likely need to follow the VADER project, described in this paper: http://comp.social.gatech.edu/papers/icwsm14.vader.hutto.pdf

It's a good idea to read through it (or at the very least skim it) to understand what a sound methodology for sentiment analysis looks like.

AC: A rough document describing the steps we need to take in order to create a decent model for our purposes.

clarencejychan commented 4 years ago

This might be a good place to start the research, @PanTheMan. Please note your findings in the document and plan out what steps we need to take in order to improve our model.

PanTheMan commented 4 years ago

VADER reading:

An alternative to VADER is SenticNet.

Most of the paper is about comparing VADER against other known lexicons and sentiment score calculators, and also against human raters.

Seems like our best bet is to use VADER for now. It's simple, and since there is no actual model to train, it requires no dedicated CPU.

One improvement I found from looking online, though: take any comment, tokenize it into sentences, score each sentence separately, and average the scores. I would say VADER gets funky when it tries to look at multiple sentences at the same time.
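A rough sketch of that per-sentence averaging, in case it helps the planning doc. Assumptions on my part: the helper names (`split_sentences`, `average_sentence_score`, `vader_compound_scorer`) are made up for illustration, the regex splitter is a naive stand-in for a proper sentence tokenizer such as NLTK's `sent_tokenize`, and I average VADER's `compound` score from the `vaderSentiment` package, which may not be the score we end up wanting.

```python
import re
from statistics import mean
from typing import Callable, List


def split_sentences(comment: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", comment.strip())
    return [p for p in parts if p]


def average_sentence_score(comment: str, score_fn: Callable[[str], float]) -> float:
    """Score each sentence on its own and return the mean; 0.0 for empty input."""
    sentences = split_sentences(comment)
    if not sentences:
        return 0.0
    return mean(score_fn(s) for s in sentences)


def vader_compound_scorer() -> Callable[[str], float]:
    """Build a scorer that returns VADER's 'compound' value for one sentence.

    The import is kept inside the function so the helpers above work
    even where vaderSentiment is not installed.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return lambda sentence: analyzer.polarity_scores(sentence)["compound"]
```

Usage would look like `average_sentence_score(comment_text, vader_compound_scorer())`. Passing the scorer in as a function means we could swap in SenticNet or anything else later without touching the averaging logic.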