@niconoe is it ok if I forward this to you to make it a Django command? I think we will schedule the scraper commands with cron, and when all of them are finished, this job should run.
Yep!
For more flexibility, and to be able to deal with cron misconfiguration issues, shouldn't this command allow:
Indeed. However, "everything" will be difficult, since for days before 2014 we don't know the number of journals scraped. (In fact, maybe we need a place to store this for new data too?)
@niconoe apparently the cutoff for the EPU index is not 0 but -0.15.
So the EPU index is the number of articles with an EPU score higher than -0.15, divided by the number of journals scraped.
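In other words, the calculation boils down to something like this (a minimal sketch; `epu_score` and the function names are assumptions, not the project's actual API):

```python
# Sketch of the daily EPU index calculation:
# EPU index = (# articles with EPU score > -0.15) / (# journals scraped that day)
EPU_CUTOFF = -0.15

def daily_epu_index(articles_for_day, journals_scraped_count):
    """articles_for_day: iterable of objects with an `epu_score` attribute (assumed name)."""
    relevant = sum(1 for article in articles_for_day if article.epu_score > EPU_CUTOFF)
    return relevant / journals_scraped_count
```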
So again, maybe we need a place where we can store the number of journals scraped. Some place where every scraper can write "I succeeded for this day". Maybe a table "journals scraped" with two columns "date" and "spider/journal name"?
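Something like the following Django model could serve as that table (the model and field names are only a suggestion, not what ended up being implemented):

```python
from django.db import models

class JournalScrapingLog(models.Model):
    """One row per (date, spider) for which scraping succeeded."""
    date = models.DateField()
    spider_name = models.CharField(max_length=100)

    class Meta:
        unique_together = ('date', 'spider_name')
```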
See #64
It's now implemented; use it as:
$ python manage.py calculate_daily_epu 2015-08-17
It reports what it does on stdout and stores its result in EpuIndexScore. Please review and test!
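For reference, a minimal sketch of how such a management command could be structured (the actual implementation may differ; the `EpuIndexScore` fields referenced in the comment are assumptions):

```python
from datetime import datetime

from django.core.management.base import BaseCommand, CommandError


class Command(BaseCommand):
    help = "Calculate and store the EPU index for a given day (YYYY-MM-DD)."

    def add_arguments(self, parser):
        parser.add_argument('date', type=str)

    def handle(self, *args, **options):
        try:
            day = datetime.strptime(options['date'], '%Y-%m-%d').date()
        except ValueError:
            raise CommandError('Date must be given as YYYY-MM-DD.')

        # ...count articles above the -0.15 cutoff and journals scraped for `day`,
        # then persist the result, e.g. (field names are assumptions):
        # EpuIndexScore.objects.create(date=day, epu=index)
        self.stdout.write('EPU index stored for %s' % day)
```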
Tested. Works perfectly.
Start a job after running the scrapers that calculates and persists yesterday's EPU index.