AU-BURGr / UnConf2017

Repository for Unconf Topics 2017

ausmacrodata R package for Australian Macroeconomic Data #1

Open robjhyndman opened 7 years ago

robjhyndman commented 7 years ago

Recently, I helped set up a new website for scraping Australian macroeconomic data from the ABS and RBA: ausmacrodata.org.

It would be great to have an R package to pull in data from it. Something like the dataseries package which pulls in data from dataseries.org (an analogous site for Swiss data). Another similar package is BETS for Brazilian economic time series.

So I propose we build the ausmacrodata R package. That should be easily achievable in a couple of days, including a vignette and a CRAN submission.

jonocarroll commented 7 years ago

+1

Cool. Does that site have an API, or can the one it uses be leveraged?

I proposed something slightly more broad last year (https://github.com/ropensci/auunconf/issues/16) and would be keen to work on something like this. data.gov.au has a working API and presumably some of these datasets can be extracted that way. A vetted list of datasets (as per ausmacrodata.org) could be a tractable goal.

By the looks of it, the ABS links go to their own ABS.Stat pages, which isn't the most helpful. They seem to have a framework set up though, so a project could wrap that up. There's an update schedule for the data (http://www.abs.gov.au/websitedbs/D3310114.nsf/home/absstat+Release+calendar), so we could try to keep things up to date.

robjhyndman commented 7 years ago

The ausmacrodata series all have a unique ID, and each has a csv file in a consistent format (first column date, second column value). So it is easy to write a small function to return a ts object given an ID. The site only provides time series, and only quarterly, monthly or annual ones, so it is a fairly trivial exercise.
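To give a rough idea of how small that function could be, here is a minimal sketch of the parsing step. It is written in Python purely for illustration; the actual package would be in R and return a ts object. The function names `parse_series` and `infer_frequency` are hypothetical, and the sketch assumes a header row and ISO-formatted dates (YYYY-MM-DD), neither of which is confirmed for the real csv files. Downloading the file by ID is left out because the URL scheme is not specified here.

```python
import csv
import io
from datetime import date


def parse_series(csv_text, has_header=True):
    """Parse two-column CSV text (date, value) into (date, float) pairs.

    Assumes ISO dates such as 2016-03-01; this format is an assumption.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if has_header:
        rows = rows[1:]
    observations = []
    for row in rows:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or malformed lines
        observations.append((date.fromisoformat(row[0].strip()), float(row[1])))
    return observations


def infer_frequency(observations):
    """Return observations per year (12, 4 or 1), judged from the
    number of months between the first two dates."""
    if len(observations) < 2:
        raise ValueError("need at least two observations")
    first, second = observations[0][0], observations[1][0]
    gap = (second.year - first.year) * 12 + (second.month - first.month)
    frequency = {1: 12, 3: 4, 12: 1}.get(gap)
    if frequency is None:
        raise ValueError(f"unsupported gap of {gap} months")
    return frequency


# Made-up sample data, only to exercise the functions.
sample = "date,value\n2016-03-01,1.5\n2016-06-01,1.7\n2016-09-01,1.6\n"
obs = parse_series(sample)
freq = infer_frequency(obs)
```

In R, the start date and frequency recovered this way would be passed to `ts()` along with the value column.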

The main work is done in the back end of ausmacrodata, which converts all the ABS/RBA data into a consistent format. There is a daily cron job which scans for updates, so ausmacrodata is always within a day of being up-to-date. As you probably know, the ABS and RBA use xls files (ugh) which are hopelessly inconsistent, sometimes changing between releases for the same series. So working with their data directly is pretty difficult -- hence ausmacrodata.

We could expand the scope somewhat by including (some of?) data.gov.au as well.

jonocarroll commented 7 years ago

Emphasis on the "some of" -- I've seen some trainwreck data sets on there. Hence the notion of selected, curated sets is great.