ropensci-archive / bomrang

:warning: ARCHIVED :warning: Australian government Bureau of Meteorology (BOM) data client for R
Other
109 stars 26 forks source link

get_historical() station ids don't align with other outputs #127

Closed adamhsparks closed 3 years ago

adamhsparks commented 3 years ago

I've used sweep_for_stations() to identify weather stations for which I wanted historical data and get_historical() to download the stations. However, the type of data is not standard for the station id values. sweep_for_stations() returns a character value with BOM's leading 0s, get_historical() returns a numeric value that strips the leading 0s, which is inconsistent across the package functions and makes it difficult to join tables where you may have the ID as a character value and numeric elsewhere.

@jonocarroll, was this your intended functionality? I'd prefer that the functions that share returned values follow a consistent approach if at all possible and @HughParsonage's handling of the station IDs makes more sense to me initially since the leading "0" and "00" have meaning, but I'm open to discussion.

jonocarroll commented 3 years ago

That's most likely an oversight on my part - get_historical() should absolutely keep things in the package-standard format. From what I can see, the returned stationid should be unchanged from input as it's output here https://github.com/ropensci/bomrang/blob/master/R/get_historical_weather.R#L124 which is essentially

sweep_for_stations(latlon = latlon)[1, , drop = TRUE]$site

The only place I see it converted is in the comparison here https://github.com/ropensci/bomrang/blob/master/R/get_historical_weather.R#L147

I'm not at a computer I can test with at the minute, but which output has the numeric value? Is that in the station attribute or the ncc_list$site attribute? The latter comes from e.g. http://www.bom.gov.au/climate/data/lists_by_element/alphaAUS_136.txt which are integer without padding, and that's my best guess at what's going on.

adamhsparks commented 3 years ago

It's the station_number col from get_historical()

The band-aid solution for the moment is this for my work.

  mutate(station_number = case_when(
    nchar(station_number) == 4 ~ paste0("00", station_number),
    TRUE ~ paste0("0", station_number)
  ))
adamhsparks commented 3 years ago

Thanks for the quick fix. Looks good to me.