MattCowgill / readabs

Download and tidy time series data from the Australian Bureau of Statistics in R
https://mattcowgill.github.io/readabs/

ABS TSD returning unexpected additional tables #174

Closed MattCowgill closed 2 years ago

MattCowgill commented 2 years ago
library(dplyr)
library(readabs)

When requesting table 1 from ABS 5368.0, tables 12a and 13a are also returned:

read_abs("5368.0", "1") %>%
  pull(table_title) %>%
  unique()
#> Finding URLs for tables corresponding to ABS catalogue 5368.0
#> Attempting to download files from catalogue 5368.0, International Trade in Goods and Services, Australia
#> Downloading https://www.abs.gov.au/statistics/economy/international-trade/international-trade-goods-and-services-australia/latest-release/536801.xls
#> Downloading https://www.abs.gov.au/statistics/economy/international-trade/international-trade-goods-and-services-australia/latest-release/5368012a.xls
#> Downloading https://www.abs.gov.au/statistics/economy/international-trade/international-trade-goods-and-services-australia/latest-release/5368013a.xls
#> Extracting data from downloaded spreadsheets
#> Tidying data from imported ABS spreadsheets
#> [1] "TABLE 1. GOODS AND SERVICES, Summary: Seasonally adjusted and trend estimates, Current prices"             
#> [2] "TABLE 12a. MERCHANDISE EXPORTS, Standard International Trade Classification (1 and 2 digit), FOB Value"    
#> [3] "TABLE 13a. MERCHANDISE IMPORTS, Standard International Trade Classification (1 and 2 digit), Customs Value"

This happens because the ABS Time Series Directory itself returns these tables when it is queried for table 1: https://ausstats.abs.gov.au/servlet/TSSearchServlet?catno=5368.0&ttitle=1
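To make the query easier to reproduce for other catalogues, here is a minimal sketch that builds the same Time Series Directory URL from a catalogue number and a table number. The helper name `tsd_url()` is hypothetical and not part of readabs; the endpoint and the `catno` and `ttitle` parameters are taken from the URL above, and no URL encoding is attempted.

```r
# Hypothetical helper (not a readabs function): compose the Time Series
# Directory search URL for a catalogue number and a table number.
tsd_url <- function(cat_no, table) {
  sprintf(
    "https://ausstats.abs.gov.au/servlet/TSSearchServlet?catno=%s&ttitle=%s",
    cat_no,
    table
  )
}

tsd_url("5368.0", "1")
#> [1] "https://ausstats.abs.gov.au/servlet/TSSearchServlet?catno=5368.0&ttitle=1"
```

Opening the resulting URL in a browser shows which tables the directory associates with the requested table number.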

The same problem does not arise in other series, e.g. Labour Force, where requesting table 1 returns only that table, not multiple tables (12, 16, etc.):

read_abs("6202.0", "1") %>%
  pull(table_title) %>%
  unique()
#> Finding URLs for tables corresponding to ABS catalogue 6202.0
#> Attempting to download files from catalogue 6202.0, Labour Force, Australia
#> Downloading https://www.abs.gov.au/statistics/labour/employment-and-unemployment/labour-force-australia/latest-release/6202001.xls
#> Extracting data from downloaded spreadsheets
#> Tidying data from imported ABS spreadsheets
#> [1] "Table 1. Labour force status by Sex, Australia - Trend, Seasonally adjusted and Original"

# The issue arises because some 5368.0 table titles contain the phrase
# "1 and 2 digit", so a search for table "1" also matches those titles

Created on 2021-11-05 by the reprex package (v2.0.1)
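The matching behaviour described above can be illustrated offline with the three titles from the 5368.0 output. This is only a sketch: it assumes the directory matches the requested number as a whole word anywhere in the title, which is consistent with the results shown but is not confirmed here. `title_is_table()` is a hypothetical helper, not the readabs fix, that compares the requested number against the leading "TABLE <n>." label instead.

```r
# Table titles copied from the 5368.0 output in this issue.
titles <- c(
  "TABLE 1. GOODS AND SERVICES, Summary: Seasonally adjusted and trend estimates, Current prices",
  "TABLE 12a. MERCHANDISE EXPORTS, Standard International Trade Classification (1 and 2 digit), FOB Value",
  "TABLE 13a. MERCHANDISE IMPORTS, Standard International Trade Classification (1 and 2 digit), Customs Value"
)

# Assumption: the directory matches "1" as a whole word anywhere in the title.
# All three titles match, because of "(1 and 2 digit)" in tables 12a and 13a.
word_match <- grepl("\\b1\\b", titles, perl = TRUE)
word_match
#> [1] TRUE TRUE TRUE

# Hypothetical helper: extract the label after the leading "TABLE" and compare
# it with the requested table. Titles without that prefix never match.
title_is_table <- function(title, table) {
  label <- sub(
    "^table\\s+([0-9]+[a-z]?)\\..*$", "\\1", title,
    ignore.case = TRUE, perl = TRUE
  )
  tolower(label) == tolower(table)
}

title_is_table(titles, "1")
#> [1]  TRUE FALSE FALSE
```

A filter like this could be applied to `table_title` after download, e.g. with `dplyr::filter()`, though the unwanted spreadsheets would still be downloaded first.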