f4bD3v / humanitas

A price prediction toolset for developing countries
BSD 3-Clause "New" or "Revised" License

Download tweets #6

Closed mstefanro closed 10 years ago

duynguyen commented 10 years ago

I think this is a crucial, high-priority task. The only problem is getting the server and database ready, which depends on the prof and TA providing the infrastructure. I can take care of the implementation (config + scheduling + executing) once we have those in hand.

mstefanro commented 10 years ago

You should focus on other tasks for now; storing tweets in a database is not really of interest. For now, we don't seem to need any features that a flat file doesn't provide.

f4bD3v commented 10 years ago

We have to focus on two different categories of tasks here:
1) Getting and storing a large percentage of Indian Twitter users, then fetching all of their tweets that match our keywords, and continuing to query them for new tweets.
2) Filtering tweets out of the monthly tweet tarballs (up to 50GB each) at https://archive.org/details/twitterstream
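For the second task, something like the following could be a starting point. It is only a rough sketch under a few assumptions: that each tarball contains bz2-compressed files with one JSON tweet per line, that matching on the `text` field is good enough, and that results go to a flat file as suggested above. The keyword list and the function names (`matches`, `filter_lines`, `filter_tarball`) are made up for illustration and are not taken from the repo.

```python
import bz2
import json
import tarfile

# Hypothetical keywords (lowercase); the thread does not say which ones are used.
KEYWORDS = {"onion", "rice", "price"}


def matches(tweet, keywords=KEYWORDS):
    """Return True if the tweet text contains any keyword (case-insensitive)."""
    text = (tweet.get("text") or "").lower()
    return any(word in text for word in keywords)


def filter_lines(lines, keywords=KEYWORDS):
    """Yield parsed tweets from an iterable of JSON lines that match a keyword."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except ValueError:
            continue  # skip malformed lines instead of aborting the whole run
        if isinstance(tweet, dict) and matches(tweet, keywords):
            yield tweet


def filter_tarball(tar_path, out_path, keywords=KEYWORDS):
    """Walk a tarball member by member and append matching tweets to a flat file.

    Assumes members ending in ".bz2" hold one JSON object per line.
    Returns the number of tweets written.
    """
    kept = 0
    with tarfile.open(tar_path, "r") as tar, \
            open(out_path, "a", encoding="utf-8") as out:
        for member in tar:  # members are read one at a time, nothing is extracted to disk
            if not (member.isfile() and member.name.endswith(".bz2")):
                continue
            # Only one member is decompressed in memory at a time.
            data = bz2.decompress(tar.extractfile(member).read())
            lines = data.decode("utf-8", errors="replace").splitlines()
            for tweet in filter_lines(lines, keywords):
                out.write(json.dumps(tweet) + "\n")
                kept += 1
    return kept
```

Calling `filter_tarball("some-month.tar", "matched_tweets.jsonl")` (both file names are placeholders) appends one JSON object per line to the output file, so several monthly tarballs can be processed into the same flat file.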