Closed by vcuspinera 3 years ago
I think this issue-based way of doing things is lovely, btw. Much better than what I did. Lol, soz.
As I said, check out the end of eda-Copy1; I did some analysis there.
@cuspime I will look into eda-Copy1; you can also update the original eda file with the time series analysis.
I just finished the changes to the heatmap, following your recommendation to normalize the moving average of tweets using the min-max values of each Twitter account, as mentioned in issue #4.
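For reference, the per-account min-max normalization of the moving average could look roughly like the sketch below. The data, the column names (`account`, `date`, `tweets`) and the 3-day window are assumptions made up for illustration; they are not taken from the project notebooks.

```python
import pandas as pd

# Hypothetical daily tweet counts for two accounts (illustrative data only).
counts = pd.DataFrame({
    "account": ["A"] * 5 + ["B"] * 5,
    "date": list(pd.date_range("2020-03-01", periods=5)) * 2,
    "tweets": [1, 3, 5, 7, 9, 10, 10, 40, 40, 10],
})

WINDOW = 3  # assumed moving-average window, in days

counts = counts.sort_values(["account", "date"])

# Moving average of tweets, computed separately for each account.
counts["moving_avg"] = counts.groupby("account")["tweets"].transform(
    lambda s: s.rolling(WINDOW, min_periods=1).mean()
)


def min_max(s):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    span = s.max() - s.min()
    return (s - s.min()) / span if span else s * 0.0


# Min-max normalization using each account's own minimum and maximum.
counts["moving_avg_norm"] = counts.groupby("account")["moving_avg"].transform(min_max)

# Wide table (accounts as rows, dates as columns) that a heatmap could plot.
heatmap_data = counts.pivot(index="account", columns="date", values="moving_avg_norm")
print(heatmap_data.round(2))
```

Scaling each account by its own range keeps very active accounts from washing out the quieter ones in the heatmap, at the cost of hiding absolute volume differences between accounts.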
I just added the polarity and subjectivity plots by Canadian Government Twitter account to the Sentiment Analysis Jupyter Notebook, as mentioned in issue #6.
I looked at eda-Copy1 and I really like the time series analysis; it really adds to the project, Leo @cuspime.
I realize that you added this work to the Exploratory Data Analysis (EDA), but since you are using the polarity and the final dataset tweets_db_sentiment.json, I think this work belongs in the main Sentiment Analysis of the repo.
So, I will add the analysis to the sentiment_analysis.ipynb file. We should also delete eda-Copy1.
Lovely! I'm happy you liked it. Maybe we can also do something similar with the number of tweets? I mean, I don't think it would be as central an analysis, but it might...
Cleaning ads: We can spot some ads by running
import pandas as pd
pd.set_option('display.max_colwidth', None)  # show the full tweet text
dup_cols = ['account', 'lang', 'sourceLabel', 'replyCount', 'retweetCount', 'likeCount', 'username', 'tweet', 'polarity', 'subjectivity']  # columns compared to detect duplicates
show_cols = [c for c in dup_cols if c != 'sourceLabel']  # columns displayed in the result
df_tot[df_tot[dup_cols].duplicated()][show_cols].sort_values(by=['retweetCount', 'likeCount', 'replyCount'], ascending=False)
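To make the idea easier to try out, here is a small self-contained sketch of the same duplicate check on made-up data. The rows and account names are hypothetical; only the use of `duplicated()` mirrors the snippet above.

```python
import pandas as pd

# Hypothetical tweets: rows 0 and 1 are identical in every column,
# row 3 has the same text but different engagement counts.
tweets = pd.DataFrame({
    "account": ["account_a", "account_a", "account_b", "account_a"],
    "tweet": ["Stay home", "Stay home", "Wash your hands", "Stay home"],
    "retweetCount": [50, 50, 10, 7],
    "likeCount": [200, 200, 30, 9],
})

# Default keep="first": only the repeated occurrences are flagged.
repeated = tweets[tweets.duplicated()]

# keep=False: every copy of a duplicated row is flagged, including the first.
all_copies = tweets[tweets.duplicated(keep=False)].sort_values(
    by=["retweetCount", "likeCount"], ascending=False
)

print(repeated)
print(all_copies)
```

Rows that match on text and on every engagement count are unlikely to be independent tweets, which is why exact duplicates are a reasonable first filter for ads. Using `keep=False` may be handy if we want to drop all copies rather than keep one.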
Let's check the hashtags on the 10th and on the 25th
The table below shows the ranking of Twitter trends in Canada on March 10th and 25th, 2020, at 12:00 and 18:00.
This information was retrieved from the getdaytrends webpage.
Ranking | Mar-10, 12:00 | Mar-10, 18:00 | Mar-25, 12:00 | Mar-25, 18:00 |
---|---|---|---|---|
1 | #ThankYouNamjoon | #MAR10Day | Prince Charles | CERB |
2 | #TuesdayMotivation | #BernieSurge | #Mixtape_OnTrack | Quarantine Act |
3 | #MAR10Day | #TuesdayThoughts | #Mixtape_바보라도알아 | Prince Charles |
4 | Chuck Norris | Warzone | #WednesdayMotivation | #IKnewIHadCabinFeverWhen |
5 | #oriandthewillofthewisps | Ollie | #FreeJamesWoods | #WednesdayMotivation |
6 | #CoreyFeldman | anne marie | #PrayTogether | kamal |
7 | italy | AR-14 | easter | Anime North |
8 | Hannah Ann | #newcitiesglobalgoals | Liberals | #startupchats |
9 | Gobert | #MattGaetzIsAQuarantinedTool | parliament | #boycottTimHortons |
10 | Sydow | Chuck Norris | Scott Reid | Bill C-13 |
11 | Happy Holi | Ricky | 1 in 5 Canadians | #ButterflyXIUDay |
12 | CPAC | Muskrat Falls | Conservatives | Royal Assent |
13 | Norm | Morgan Rielly | LCBO | Connor Bedard |
14 | Peter | Horizon Zero Dawn | Swizz | Stuart Gordon |
15 | Pearl Jam | Bob Rae | Tory Lanez | Canada Goose |
16 | mccaw | Ivy League | Dave Stieb | Our Father |
17 | Strange Brew | NFL 2K | Pierre | Bob and Doug McKenzie |
18 | Utah | Marleau | Ann Coulter | Alex Ottley |
19 | Gamestop | Charlie Sheen | my 2,400 | senat |
20 | Ibaka | 2nd Amendment | Dick Pound | Lesley |
21 | #HowNotToGetArrested | Éric Salvail | Olympics | Sikhs |
22 | #PearlJam | hedman | Bloc | Communes |
23 | #TheBachelor | Schlatter | Albert Uderzo | Romanov |
24 | #COVID2019 | wheel of fortune | timbo | Our Government |
25 | #InternationalWomensDay | Madden | #CoronavirusLockdown | Princess Diana |
⚠️ After some discussion we realized that the first announcement by the Prime Minister of Canada was made on March 11th, which makes sense. From this point on, we download tweets from February 1st until April 30th of 2020, change the announcement date to March 11th, and re-run every analysis we made before.
I assume that polarity and subjectivity values equal to zero come from tweets whose words are probably not in the TextBlob dictionary, or whose words are neither subjective nor of positive or negative polarity, so TextBlob assigns zero to both. Some examples of these tweets:
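To illustrate the assumption above, here is a toy lexicon-based scorer. It is a simplified stand-in, not TextBlob's actual implementation: the `LEXICON` entries and the `score` function are hypothetical, and the only point is that a text with no known sentiment words falls back to zero for both values.

```python
# Hypothetical mini-lexicon: word -> (polarity, subjectivity).
# These entries are invented for illustration.
LEXICON = {
    "great": (0.8, 0.75),
    "good": (0.7, 0.6),
    "terrible": (-1.0, 1.0),
}


def score(text):
    """Average the scores of known words; return (0.0, 0.0) if none are known."""
    hits = [LEXICON[word] for word in text.lower().split() if word in LEXICON]
    if not hits:
        # No word carries sentiment information, so both values default to zero.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


print(score("Updated guidance is available online"))
print(score("Great news for everyone"))
```

On the real data, a filter such as `df_tot[(df_tot['polarity'] == 0) & (df_tot['subjectivity'] == 0)]` should list the affected tweets, assuming the column names used in the ad-cleaning snippet earlier in this thread.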
We finished the pending tasks and will version the project to close it.
TODO
"@" in notebooks and scripts in the src folder, and re-run these files.