allenpavlovich / uchicago-lnm-project

Repository for the Linear and Nonlinear Models course at the University of Chicago. This project develops a sentiment analysis tool using the Sentiment140 dataset, involving data preprocessing, EDA, model development, Explainable AI (XAI) methods, and causal inference techniques.

EDA for Sentiment140 Dataset #1

Open allenpavlovich opened 1 month ago

allenpavlovich commented 1 month ago

Description:

Perform comprehensive exploratory data analysis (EDA) on the Sentiment140 dataset to gain insights into the data and identify patterns. The following analyses should be included:

Class Distribution:

Text Length Analysis:

Word Cloud:

Sentiment Distribution by Length:

Common Words by Sentiment:

N-grams Analysis:

Sentiment Over Time:

User Analysis:

Hashtag Analysis:

- Create a bar chart to display the most common hashtags used in the tweets.
- Identify popular topics or trends.
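One possible way to approach the hashtag counts is sketched below. It assumes the tweet texts are already loaded as a list of strings; the names `top_hashtags`, `plot_top_hashtags`, and `sample_tweets` are hypothetical and not part of the repository, and the sample tweets are made up for illustration rather than taken from Sentiment140. Hashtags are lowercased so that `#Python` and `#python` count as the same tag, which is a design choice and not a requirement of the issue.

```python
import re
from collections import Counter

# A hashtag is taken to be '#' followed by word characters (an assumption;
# the issue does not define the matching rule).
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags (lowercased) with their counts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts.most_common(n)


def plot_top_hashtags(pairs, path="top_hashtags.png"):
    """Save a horizontal bar chart of (hashtag, count) pairs to `path`."""
    # matplotlib is imported here so the counting code runs without it.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, writes to file only
    import matplotlib.pyplot as plt

    tags = [f"#{tag}" for tag, _ in pairs]
    freqs = [count for _, count in pairs]
    fig, ax = plt.subplots()
    # Reverse so the most common hashtag appears at the top of the chart.
    ax.barh(tags[::-1], freqs[::-1])
    ax.set_xlabel("Occurrences")
    ax.set_title("Most common hashtags")
    fig.tight_layout()
    fig.savefig(path)


# Made-up example tweets, for illustration only.
sample_tweets = [
    "Loving #Python today",
    "#python and #data",
    "no tags here",
    "#Data #python",
]
```

Calling `plot_top_hashtags(top_hashtags(sample_tweets))` would write the chart to `top_hashtags.png`; with the real dataset, the list of tweet texts would replace `sample_tweets`.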

Sentiment by User:

allenpavlovich commented 1 month ago

git fetch origin
git checkout experiment/eda-for-sentiment140-dataset