Closed baslat closed 2 years ago
It's ready for a proper review now. However, the readme re-rendered on my machine, so now all the download paths refer to my local machine. Do you have a github action to re-render the readme on merge or something similar?
Hey! I've been playing around with the API lately myself. I went a slightly different direction to you @baslat. I think our approaches may complement one another nicely. If you're interested, I can pop up a PR of my own for you (and @MattCowgill, great package by the way!) to take a look at.
Here's a sample bit of code using the interface I wrote so you can get the flavour:
# List available flows
abs_dataflows()
#> # A tibble: 504 x 4
#> id name desc version
#> <chr> <chr> <chr> <chr>
#> 1 ABORIGINAL_POP_PROJ Projected populat~ Contains estimat~ 1.0.0
#> 2 ABORIGINAL_POP_PROJ_REMOTE Projected populat~ Contains estimat~ 1.0.0
#> 3 ABS_ABORIGINAL_POPPROJ_INDREGION Projected populat~ Contains estimat~ 1.0.0
#> 4 ABS_ACLD_LFSTATUS Australian Census~ The Australian C~ 1.0.0
#> 5 ABS_ACLD_TENURE Australian Census~ The Australian C~ 1.0.0
#> 6 ABS_ACLD_UNPAIDASST Australian Census~ The Australian C~ 1.0.0
#> 7 ABS_ACLD_VOLWORK Australian Census~ The Australian C~ 1.0.0
#> 8 ABS_ANNUAL_ERP_ASGS ERP by SA2 and ab~ Estimated Reside~ 1.0.0
#> 9 ABS_ANNUAL_ERP_ASGS2016 ERP by SA2 and ab~ Estimated Reside~ 1.0.0
#> 10 ABS_ANNUAL_ERP_LGA2016 ERP by LGA (ASGS ~ Estimated Reside~ 1.0.0
#> # ... with 494 more rows
# Get full data set for a given flow by providing id:
x <- abs_data("RES_DWELL")
tibble::as_tibble(x)
#> # A tibble: 4,536 x 9
#> MEASURE REGION FREQ TIME_PERIOD OBS_VALUE UNIT_MEASURE UNIT_MULT
#> <dbl+lbl> <chr+lbl> <chr+l> <chr> <dbl> <chr+lbl> <dbl+lbl>
#> 1 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2003-Q3 17000 NUM [Number] 0 [Units]
#> 2 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2003-Q4 15007 NUM [Number] 0 [Units]
#> 3 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2004-Q1 14930 NUM [Number] 0 [Units]
#> 4 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2004-Q2 13054 NUM [Number] 0 [Units]
#> 5 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2004-Q3 13264 NUM [Number] 0 [Units]
#> 6 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2004-Q4 13349 NUM [Number] 0 [Units]
#> 7 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2005-Q1 13591 NUM [Number] 0 [Units]
#> 8 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2005-Q2 12026 NUM [Number] 0 [Units]
#> 9 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2005-Q3 12954 NUM [Number] 0 [Units]
#> 10 1 [Number o~ 3RQLD [Res~ Q [Qua~ 2005-Q4 12749 NUM [Number] 0 [Units]
#> # ... with 4,526 more rows, and 2 more variables: OBS_STATUS <chr+lbl>,
#> # OBS_COMMENT <lgl>
# Get filtered data using datakey:
y <- abs_data("ABS_C16_G49_SA", datakey = ".....0")
tibble::as_tibble(y)
#> # A tibble: 480 x 11
#> OCCP_C16 SEX_ABS QALLP_C16 STATE REGIONTYPE ASGS_2016 TIME_PERIOD
#> <chr+lbl> <dbl+lb> <chr+lbl> <dbl+l> <chr+lbl> <chr+lbl> <int>
#> 1 5 [Clerical ~ 3 [Pers~ TOT [Total] 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 2 5 [Clerical ~ 3 [Pers~ 22 [Graduate~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 3 8 [Labourers] 2 [Fema~ 40 [Advanced~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 4 TOT [Total] 3 [Pers~ 40 [Advanced~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 5 TOT [Total] 3 [Pers~ 0 [Level of ~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 6 2 [Professio~ 2 [Fema~ TOT [Total] 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 7 3 [Technicia~ 2 [Fema~ 50 [Certific~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 8 8 [Labourers] 2 [Fema~ 0 [Level of ~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 9 3 [Technicia~ 2 [Fema~ 10 [Postgrad~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> 10 4 [Community~ 1 [Male~ 0 [Level of ~ 0 [Aus~ AUS [Aust~ 0 [Austr~ 2016
#> # ... with 470 more rows, and 4 more variables: OBS_VALUE <int>,
#> # UNIT_MEASURE <chr+lbl>, OBS_STATUS <chr+lbl>, OBS_COMMENT <lgl>
# Get metadata (useful to figure out how to build a `datakey`)
z <- abs_datastructure("ABS_C16_G49_SA")
tibble::as_tibble(z)
#> # A tibble: 3,008 x 6
#> role var position desc code label
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 dimension OCCP_C16 1 Occupation 1 Managers
#> 2 dimension OCCP_C16 1 Occupation 2 Professionals
#> 3 dimension OCCP_C16 1 Occupation 3 Technicians and Trades Workers
#> 4 dimension OCCP_C16 1 Occupation 4 Community and Personal Service ~
#> 5 dimension OCCP_C16 1 Occupation 5 Clerical and Administrative Wor~
#> 6 dimension OCCP_C16 1 Occupation 6 Sales Workers
#> 7 dimension OCCP_C16 1 Occupation 7 Machinery Operators and Drivers
#> 8 dimension OCCP_C16 1 Occupation 8 Labourers
#> 9 dimension OCCP_C16 1 Occupation TOT Total
#> 10 dimension OCCP_C16 1 Occupation Z Inadequately described and Not ~
#> # ... with 2,998 more rows
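To make the link between the metadata and the `datakey` argument concrete, here is a minimal sketch of how a key like `".....0"` could be assembled. It assumes the key follows the usual SDMX REST convention: one slot per dimension in positional order, slots separated by `.`, an empty slot meaning "all values", and `+` joining several codes. `build_datakey()` is a hypothetical helper written for illustration, not part of readabs or of either PR, and the dimension order below is assumed to match the column order shown in the filtered output above.

```r
# Hypothetical helper (illustration only, not part of readabs):
# build an SDMX-style datakey from dimension names given in positional order
# and a named list of the codes to keep for some of those dimensions.
build_datakey <- function(dimensions, filters = list()) {
  # Fail early if a filter names a dimension that is not in the flow
  unknown <- setdiff(names(filters), dimensions)
  if (length(unknown) > 0) {
    stop("Unknown dimension(s): ", paste(unknown, collapse = ", "))
  }

  # One slot per dimension: "" means no filter, "a+b" means a OR b
  parts <- vapply(
    dimensions,
    function(d) {
      codes <- filters[[d]]
      if (is.null(codes)) "" else paste(codes, collapse = "+")
    },
    character(1)
  )

  paste(parts, collapse = ".")
}

# Assumed positional order of the dimensions of "ABS_C16_G49_SA"
dims <- c("OCCP_C16", "SEX_ABS", "QALLP_C16", "STATE", "REGIONTYPE", "ASGS_2016")

# Keep only ASGS_2016 == "0"
build_datakey(dims, list(ASGS_2016 = "0"))
#> [1] ".....0"

# Additionally restrict SEX_ABS to codes 1 and 2
build_datakey(dims, list(SEX_ABS = c("1", "2"), ASGS_2016 = "0"))
#> [1] ".1+2....0"
```

In practice the `dimensions` vector would come from the `var` and `position` columns returned by `abs_datastructure()`, rather than being typed by hand.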
Hi @kinto-b, thanks for sharing your code! I'm happy to work together on a combined solution if you like. If I understand your snippet correctly, it looks like you can get the list of available API datasets, which is cool!
Thank you both, this is great! Sorry I haven't yet commented on your PR @baslat, I've been sick the last few days. Will review ASAP. Combining forces with @kinto-b seems sensible!
Stellar, I'll pop through a PR for you to browse and then we can decide on the best way to combine approaches.
@baslat @kinto-b Sorry again for the delay - I haven't forgotten about this, various work + life things have just got in the way of a speedy review of this. I'll get to it ASAP. Thanks
No worries @MattCowgill. I had a look at @kinto-b's branch and I think it's probably a better candidate for merging, so I suggest we focus there.
Hi @baslat, if @kinto-b's PR is the way forward (I'll take your word on that...), should we close this PR?
Yes please, I think it's a neater approach.
No worries. Thanks for all your work on this, @baslat.
Hi Matt, here is a PR to add functionality to read the ABS API. I've put it as draft as I'm still messing about for some edge cases, but thought you might like to start taking a look.
Since Annabel's work a year or so ago, the ABS has changed their API, meaning I've basically rewritten everything. The main function is `read_abs_api()`. It calls a few internal functions, the workhorse being the unexported `tidy_api_data()`.
Reading the API requires overcoming two main challenges:
`tidy_api_data()`.
`chunk_query_url()`
and documented examples.
Happy to discuss!
I still need to:
read_abs_api()