britth / GameChanger

A simple command line program that processes Twitter data to discover popular moments in multi-game sporting events. Never settle for a boring game!
MIT License

figuring out average tweet rate #2

Closed sunhwap closed 10 years ago

sunhwap commented 10 years ago

This issue concerns our third milestone: the user can see moments where the activity (tweet volume) is significantly above average. This requires us to figure out how to compute the average tweet rate.

sunhwap commented 10 years ago

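Since the largest time frame is one day, and we want to know the average tweet rate for every 30 seconds, we should take both into account when determining the average tweet rate. Let 30 sec = 1 interval. Then:

Average tweet rate = (total # of tweets / 1 day) × (1 day / 24 hours) × (1 hour / 60 minutes) × (1 minute / 60 sec) × (30 sec / 1 interval)

This works out to (total # of tweets) / 2880 tweets per interval, since one day contains 2,880 thirty-second intervals.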

sunhwap commented 10 years ago

We can define a function called tweet_average, and another function that checks whether the tweet rate for a particular school is above average.
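A minimal sketch of what those two functions might look like, assuming Python 3 and that tweets have already been counted per 30-second interval. Only the name tweet_average comes from this thread; is_above_average, the constants, and the factor parameter are hypothetical names chosen for illustration.

```python
INTERVAL_SECONDS = 30            # one interval, as defined above
SECONDS_PER_DAY = 24 * 60 * 60   # largest time frame: one day


def tweet_average(total_tweets, window_seconds=SECONDS_PER_DAY,
                  interval_seconds=INTERVAL_SECONDS):
    """Average number of tweets per interval over the whole window."""
    intervals = window_seconds / interval_seconds  # 2880 for one day
    return total_tweets / intervals


def is_above_average(interval_count, average, factor=1.0):
    """Return True if an interval's tweet count exceeds factor * average.

    factor is a hypothetical knob: 1.0 means "above average",
    larger values mean "significantly above average".
    """
    return interval_count > factor * average
```

For example, 28,800 tweets in one day gives tweet_average(28800) == 10.0 tweets per interval, so an interval with 25 tweets for a given school would count as above average even with factor=2.0.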

britth commented 10 years ago

I've seen a couple of research articles that might inform this task as well. They specifically deal with temporal summarization using Twitter posts, but they look at moments of high activity (i.e. 'spikes' or 'bursts' of activity) as a way to detect when something has happened. I imagine that detecting moments of above average activity for this task could be very similar. Take a look:

Nichols, J., Mahmud, J., & Drews, C. (2012). Summarizing sporting events using twitter. In Proceedings of the 2012 ACM international conference on Intelligent User Interfaces (pp. 189–198). New York, NY, USA: ACM. doi:10.1145/2166966.2166999 (http://tre.docdat.com/tw_files2/urls_41/40/d-39281/7z-docs/7.pdf)

Chakrabarti, D., & Punera, K. (2011). Event Summarization Using Tweets. In ICWSM. Retrieved from http://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/viewPDFInterstitial/2885/3263

These seemed particularly helpful, but there are a few others I've seen that might be useful as well; if I can figure out how to share a Zotero collection, I'll add those too!

sunhwap commented 10 years ago

Yes, they are useful. Thank you for finding them.

britth commented 10 years ago

@sunhwap In addition to your work on calculating the average, I've also been looking at the cumulative moving average as a way to track a running average tweet rate over the period. It seems like a really easy way to update the average as each new tweet count comes in, so we can note the 'game changing' moments. I took the formula from here: http://en.wikipedia.org/wiki/Moving_average#Cumulative_moving_average, and used some of the information from the Nichols et al. (2012) article to adapt it a bit.
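As a rough illustration of the idea (not the code in gamechanger.py), the cumulative moving average can be updated one interval at a time with CMA_n = CMA_(n-1) + (x_n - CMA_(n-1)) / n. The sketch below assumes Python 3 and a list of per-interval tweet counts; find_spikes and its factor threshold are hypothetical and are not taken from the Nichols et al. article.

```python
def cumulative_moving_average(counts):
    """Yield the cumulative moving average after each interval count."""
    cma = 0.0
    for n, count in enumerate(counts, start=1):
        # Incremental update: no need to keep a running sum of all counts.
        cma += (count - cma) / n
        yield cma


def find_spikes(counts, factor=2.0):
    """Return indices of intervals whose count exceeds
    factor * (CMA of all earlier intervals).

    factor is an assumed threshold, chosen only for illustration.
    """
    spikes = []
    cma = 0.0
    for i, count in enumerate(counts):
        # Compare against the average of earlier intervals only,
        # so a spike does not inflate its own baseline.
        if i > 0 and count > factor * cma:
            spikes.append(i)
        cma += (count - cma) / (i + 1)
    return spikes
```

With counts of [10, 12, 11, 40, 9], the average of the first three intervals is 11, so the fourth interval (index 3, with 40 tweets) is the only one flagged at factor=2.0.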

Currently working on this in the main gamechanger.py file, so I'll push that up soon. Eventually, we can see whether there's a good reason to keep both ways of calculating the average.

britth commented 10 years ago

Pretty much done via #40 and #41. I'll go ahead and close; reopen if needed.

sunhwap commented 10 years ago

Good job!