TromboneDavies / PolarOps

Do some basic EDA of polar/non-polar training data #42

Closed: divilian closed this issue 3 years ago

divilian commented 3 years ago

Alexis: create two bar plots, one for polarized and one for non-polarized, showing the frequency of the 20 or so most common non-stop-words in each group.

Veronica: create a faceted histogram and a faceted KDE of the lexical diversity of polarized vs. non-polarized threads.
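As a rough starting point, the numbers behind both plots could be computed along these lines. This is only a sketch under assumptions, not code from the PolarOps repo: the `STOP_WORDS` set, the `tokenize` helper, and the sample threads are all hypothetical, and "lexical diversity" is assumed here to mean the type-token ratio (unique tokens divided by total tokens), which the issue does not actually specify.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real run would likely use a fuller list from an NLP library.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
              "that", "this", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and pull out runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(threads, n=20):
    """Return the n most frequent non-stop-words across a list of thread texts."""
    counts = Counter()
    for text in threads:
        counts.update(tok for tok in tokenize(text) if tok not in STOP_WORDS)
    return counts.most_common(n)


def lexical_diversity(text):
    """Type-token ratio: unique tokens / total tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Made-up example threads, only to show the call pattern.
polarized = ["The senate vote is a disgrace and the senate knows it",
             "This vote is a disgrace"]
non_polarized = ["Here is the schedule for the vote in the senate"]

polarized_top = top_words(polarized)          # data for one bar plot
non_polarized_top = top_words(non_polarized)  # data for the other bar plot

# One diversity value per thread, per group: input for the histogram/KDE.
polarized_div = [lexical_diversity(t) for t in polarized]
non_polarized_div = [lexical_diversity(t) for t in non_polarized]
```

The `(word, count)` pairs can go straight into a bar plot per group, and the two lists of diversity values into a faceted histogram or KDE using whatever plotting library the project already uses.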

divilian commented 3 years ago

Nice job, ladies. It looks like lexical diversity isn't going to be a big win for us, and the most common words more or less line up with intuition. (This exercise also helped us find and eliminate the \n's!)