ropensci-archive / bomrang

:warning: ARCHIVED :warning: Australian government Bureau of Meteorology (BOM) data client for R

S3 print method for get_historical #91

Closed: jonocarroll closed this 5 years ago

jonocarroll commented 6 years ago

(it was a long flight)

Initial pass at S3 headers for get_historical().

library(bomrang)

## data retrieved via get_historical now has a header
## exposing the parameters of the returned data

## this remains as a data.frame, but is printed with 
## print.data.table

bom1 <- get_historical(stationid = "023000", type = "max")
#> Data saved as /tmp/RtmpXd0aZu/IDCJAC0010_023000_1800_Data.csv
bom1
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       ADELAIDE (WEST TERRACE / NGAYIRDAPIRA) [023000] 
#>   Location:      lat: -34.9257, lon: 138.5832
#>   Measurement / Origin:  Max / Historical
#>   Timespan:      1887-01-01 -- 2018-09-01 [93.5 years]
#>   ---------------------------------------------------------------  
#>        Product_code Station_number Year Month Day Max_temperature
#>     1:   IDCJAC0010          23000 1887     1   1              NA
#>     2:   IDCJAC0010          23000 1887     1   2              NA
#>     3:   IDCJAC0010          23000 1887     1   3              NA
#>     4:   IDCJAC0010          23000 1887     1   4              NA
#>     5:   IDCJAC0010          23000 1887     1   5              NA
#>    ---                                                           
#> 48107:   IDCJAC0010          23000 2018     9  17            20.1
#> 48108:   IDCJAC0010          23000 2018     9  18            16.6
#> 48109:   IDCJAC0010          23000 2018     9  19            14.8
#> 48110:   IDCJAC0010          23000 2018     9  20            15.9
#> 48111:   IDCJAC0010          23000 2018     9  21            19.7
#>        Accum_days_max Quality
#>     1:             NA        
#>     2:             NA        
#>     3:             NA        
#>     4:             NA        
#>     5:             NA        
#>    ---                       
#> 48107:              1       N
#> 48108:              1       N
#> 48109:              1       N
#> 48110:              1       N
#> 48111:              1       N

bom2 <- get_historical(latlon = c(-35.2809, 149.1300), type = "min")
#> Closest station: 070351 (CANBERRA AIRPORT)
#> Data saved as /tmp/RtmpXd0aZu/IDCJAC0011_070351_1800_Data.csv
bom2
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       CANBERRA AIRPORT [070351] 
#>   Location:      lat: -35.3088, lon: 149.2004
#>   Measurement / Origin:  Min / Historical
#>   Timespan:      2008-09-01 -- 2018-09-01 [10.1 years]
#>   ---------------------------------------------------------------  
#>       Product_code Station_number Year Month Day Min_temperature
#>    1:   IDCJAC0011          70351 2008     1   1              NA
#>    2:   IDCJAC0011          70351 2008     1   2              NA
#>    3:   IDCJAC0011          70351 2008     1   3              NA
#>    4:   IDCJAC0011          70351 2008     1   4              NA
#>    5:   IDCJAC0011          70351 2008     1   5              NA
#>   ---                                                           
#> 3914:   IDCJAC0011          70351 2018     9  18            -2.3
#> 3915:   IDCJAC0011          70351 2018     9  19             4.1
#> 3916:   IDCJAC0011          70351 2018     9  20            -2.9
#> 3917:   IDCJAC0011          70351 2018     9  21            -2.7
#> 3918:   IDCJAC0011          70351 2018     9  22             0.2
#>       Accum_days_min Quality
#>    1:             NA        
#>    2:             NA        
#>    3:             NA        
#>    4:             NA        
#>    5:             NA        
#>   ---                       
#> 3914:              1       N
#> 3915:              1       N
#> 3916:              1       N
#> 3917:              1       N
#> 3918:              1       N

bom3 <- get_historical(stationid = "023000", type = "solar")
#> Data saved as /tmp/RtmpXd0aZu/IDCJAC0016_023000_1800_Data.csv
bom3
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       ADELAIDE (WEST TERRACE / NGAYIRDAPIRA) [023000] 
#>   Location:      lat: -34.9257, lon: 138.5832
#>   Measurement / Origin:  Solar / Historical
#>   Timespan:      1990-01-01 -- 2018-09-01 [28.7 years]
#>   ---------------------------------------------------------------  
#>        Product_code Station_number Year Month Day Solar_exposure
#>     1:   IDCJAC0016          23000 1990     1   1           34.3
#>     2:   IDCJAC0016          23000 1990     1   2           30.6
#>     3:   IDCJAC0016          23000 1990     1   3             NA
#>     4:   IDCJAC0016          23000 1990     1   4           27.0
#>     5:   IDCJAC0016          23000 1990     1   5           31.1
#>    ---                                                          
#> 10488:   IDCJAC0016          23000 2018     9  18           13.2
#> 10489:   IDCJAC0016          23000 2018     9  19           17.3
#> 10490:   IDCJAC0016          23000 2018     9  20           12.4
#> 10491:   IDCJAC0016          23000 2018     9  21           17.1
#> 10492:   IDCJAC0016          23000 2018     9  22           20.9

## Passing through (some) dplyr functions does not destroy 
## this quality

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

filter(bom1, Month == 10)
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       ADELAIDE (WEST TERRACE / NGAYIRDAPIRA) [023000] 
#>   Location:      lat: -34.9257, lon: 138.5832
#>   Measurement / Origin:  Max / Historical
#>   Timespan:      1887-01-01 -- 2018-09-01 [93.5 years]
#>   ---------------------------------------------------------------  
#>       Product_code Station_number Year Month Day Max_temperature
#>    1:   IDCJAC0010          23000 1887    10   1            17.6
#>    2:   IDCJAC0010          23000 1887    10   2            14.6
#>    3:   IDCJAC0010          23000 1887    10   3            16.7
#>    4:   IDCJAC0010          23000 1887    10   4            18.0
#>    5:   IDCJAC0010          23000 1887    10   5            15.1
#>   ---                                                           
#> 4057:   IDCJAC0010          23000 2017    10  27            33.7
#> 4058:   IDCJAC0010          23000 2017    10  28            21.7
#> 4059:   IDCJAC0010          23000 2017    10  29            28.1
#> 4060:   IDCJAC0010          23000 2017    10  30            16.8
#> 4061:   IDCJAC0010          23000 2017    10  31            19.5
#>       Accum_days_max Quality
#>    1:              1       Y
#>    2:              1       Y
#>    3:              1       Y
#>    4:              1       Y
#>    5:              1       Y
#>   ---                       
#> 4057:              1       Y
#> 4058:              1       Y
#> 4059:              1       Y
#> 4060:              1       Y
#> 4061:              1       Y

select(bom2, Year:Day)
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       CANBERRA AIRPORT [070351] 
#>   Location:      lat: -35.3088, lon: 149.2004
#>   Measurement / Origin:  Min / Historical
#>   Timespan:      2008-09-01 -- 2018-09-01 [10.1 years]
#>   ---------------------------------------------------------------  
#>       Year Month Day
#>    1: 2008     1   1
#>    2: 2008     1   2
#>    3: 2008     1   3
#>    4: 2008     1   4
#>    5: 2008     1   5
#>   ---               
#> 3914: 2018     9  18
#> 3915: 2018     9  19
#> 3916: 2018     9  20
#> 3917: 2018     9  21
#> 3918: 2018     9  22

mutate(bom3, Date = as.Date(paste(Year, Month, Day, sep = "-")))
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       ADELAIDE (WEST TERRACE / NGAYIRDAPIRA) [023000] 
#>   Location:      lat: -34.9257, lon: 138.5832
#>   Measurement / Origin:  Solar / Historical
#>   Timespan:      1990-01-01 -- 2018-09-01 [28.7 years]
#>   ---------------------------------------------------------------  
#>        Product_code Station_number Year Month Day Solar_exposure
#>     1:   IDCJAC0016          23000 1990     1   1           34.3
#>     2:   IDCJAC0016          23000 1990     1   2           30.6
#>     3:   IDCJAC0016          23000 1990     1   3             NA
#>     4:   IDCJAC0016          23000 1990     1   4           27.0
#>     5:   IDCJAC0016          23000 1990     1   5           31.1
#>    ---                                                          
#> 10488:   IDCJAC0016          23000 2018     9  18           13.2
#> 10489:   IDCJAC0016          23000 2018     9  19           17.3
#> 10490:   IDCJAC0016          23000 2018     9  20           12.4
#> 10491:   IDCJAC0016          23000 2018     9  21           17.1
#> 10492:   IDCJAC0016          23000 2018     9  22           20.9
#>              Date
#>     1: 1990-01-01
#>     2: 1990-01-02
#>     3: 1990-01-03
#>     4: 1990-01-04
#>     5: 1990-01-05
#>    ---           
#> 10488: 2018-09-18
#> 10489: 2018-09-19
#> 10490: 2018-09-20
#> 10491: 2018-09-21
#> 10492: 2018-09-22

arrange(bom1, desc(Max_temperature))
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       ADELAIDE (WEST TERRACE / NGAYIRDAPIRA) [023000] 
#>   Location:      lat: -34.9257, lon: 138.5832
#>   Measurement / Origin:  Max / Historical
#>   Timespan:      1887-01-01 -- 2018-09-01 [93.5 years]
#>   ---------------------------------------------------------------  
#>        Product_code Station_number Year Month Day Max_temperature
#>     1:   IDCJAC0010          23000 1939     1  12            46.1
#>     2:   IDCJAC0010          23000 1939     1  10            45.9
#>     3:   IDCJAC0010          23000 1904    12  31            44.2
#>     4:   IDCJAC0010          23000 1939     1  13            44.2
#>     5:   IDCJAC0010          23000 1939     1   9            44.0
#>    ---                                                           
#> 48107:   IDCJAC0010          23000 2017     6  15              NA
#> 48108:   IDCJAC0010          23000 2018     3  13              NA
#> 48109:   IDCJAC0010          23000 2018     5  11              NA
#> 48110:   IDCJAC0010          23000 2018     6   7              NA
#> 48111:   IDCJAC0010          23000 2018     7   2              NA
#>        Accum_days_max Quality
#>     1:              1       Y
#>     2:              1       Y
#>     3:              1       Y
#>     4:              1       Y
#>     5:              1       Y
#>    ---                       
#> 48107:             NA        
#> 48108:             NA        
#> 48109:             NA        
#> 48110:             NA        
#> 48111:             NA

## and grouping operations are explicitly supported/groups shown

group_by(bom2, Year)
#>   ---- Australian Bureau of Meterorology (BOM) Data Resource ----
#>   (Original Request Parameters)
#>   Station:       CANBERRA AIRPORT [070351] 
#>   Location:      lat: -35.3088, lon: 149.2004
#>   Measurement / Origin:  Min / Historical
#>   Timespan:      2008-09-01 -- 2018-09-01 [10.1 years]
#>   Groups:        Year [11]
#>   ---------------------------------------------------------------  
#>       Product_code Station_number Year Month Day Min_temperature
#>    1:   IDCJAC0011          70351 2008     1   1              NA
#>    2:   IDCJAC0011          70351 2008     1   2              NA
#>    3:   IDCJAC0011          70351 2008     1   3              NA
#>    4:   IDCJAC0011          70351 2008     1   4              NA
#>    5:   IDCJAC0011          70351 2008     1   5              NA
#>   ---                                                           
#> 3914:   IDCJAC0011          70351 2018     9  18            -2.3
#> 3915:   IDCJAC0011          70351 2018     9  19             4.1
#> 3916:   IDCJAC0011          70351 2018     9  20            -2.9
#> 3917:   IDCJAC0011          70351 2018     9  21            -2.7
#> 3918:   IDCJAC0011          70351 2018     9  22             0.2
#>       Accum_days_min Quality
#>    1:             NA        
#>    2:             NA        
#>    3:             NA        
#>    4:             NA        
#>    5:             NA        
#>   ---                       
#> 3914:              1       N
#> 3915:              1       N
#> 3916:              1       N
#> 3917:              1       N
#> 3918:              1       N

Created on 2018-09-23 by the reprex package (v0.2.0).

It looks a bit better in a terminal/RStudio, rather than GitHub's fixed-width window.

[Image: bomrang_group_by]

[Image: bomrang_terminal]

Thoughts? I've left it flexible enough that the bomrang_tbl class could be attached to results from other queries and processed similarly.
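As a rough illustration of how a classed data frame with a printed header can work, here is a minimal sketch in base R. It is not the code from this PR: the constructor name `new_bomrang_tbl()`, the attribute names and the header layout are all assumptions, and it falls back to the ordinary data.frame print method rather than print.data.table.

```r
# Hypothetical constructor: wraps any data.frame and stores the request
# metadata as attributes (names are illustrative, not bomrang's API).
new_bomrang_tbl <- function(x, station, type, origin) {
  stopifnot(is.data.frame(x))
  structure(
    x,
    class = union("bomrang_tbl", class(x)),
    station = station,
    type = type,
    origin = origin
  )
}

# S3 print method: emit the header, then defer to the next print method
# (print.data.frame in this sketch).
print.bomrang_tbl <- function(x, ...) {
  cat("  ---- BOM Data Resource ----\n")
  cat("  Station:              ", attr(x, "station"), "\n")
  cat("  Measurement / Origin: ", attr(x, "type"), "/", attr(x, "origin"), "\n")
  cat("  ---------------------------\n")
  NextMethod()
  invisible(x)
}

obs <- data.frame(Year = 2018, Month = 9, Day = 17:18,
                  Max_temperature = c(20.1, 16.6))
bom <- new_bomrang_tbl(obs, station = "023000", type = "Max",
                       origin = "Historical")
print(bom)
```

Because the object still inherits from data.frame, any query result could be passed through the same constructor and would pick up the same print behaviour.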

jonocarroll commented 6 years ago

One option would be to also have a header = TRUE option (default on).
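A sketch of what such a switch might look like, assuming a `bomrang_tbl` class that keeps the station in a `station` attribute (both are assumptions made for this example, not the PR's code):

```r
# Hypothetical print method with a `header` switch (default on).
print.bomrang_tbl <- function(x, header = TRUE, ...) {
  if (isTRUE(header)) {
    cat("  Station: ", attr(x, "station"), "\n", sep = "")
  }
  # Call the data.frame method directly rather than NextMethod(), so the
  # `header` argument is not forwarded to it.
  print.data.frame(x, ...)
  invisible(x)
}

x <- structure(
  data.frame(Year = c(2018, 2018), Max_temperature = c(20.1, 16.6)),
  class = c("bomrang_tbl", "data.frame"),
  station = "023000"
)

print(x)                  # header followed by the data
print(x, header = FALSE)  # data only
```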

jonocarroll commented 6 years ago

(now with even more correct spelling)

adamhsparks commented 6 years ago

Nice! I'm polishing the nasapower package right now, will get to this PR soon.

adamhsparks commented 6 years ago

Seeing these errors when checking:

Undocumented code objects:
  ‘arrange.bomrang_tbl’ ‘filter.bomrang_tbl’ ‘group_by.bomrang_tbl’
  ‘mutate.bomrang_tbl’ ‘rename.bomrang_tbl’ ‘select.bomrang_tbl’
  ‘slice.bomrang_tbl’
All user-level objects in a package should have documentation entries.

@jonocarroll, can you address them?

jonocarroll commented 6 years ago

Yeah - I was planning to inherit the dplyr documentation as these are merely attribute-preserving wrappers.
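The attribute-preserving idea can be sketched without dplyr. In the example below, the helper `restore_bomrang_attrs()` and the attribute names are hypothetical, and base row subsetting stands in for a dplyr verb; the commented-out part shows how a method with inherited documentation might look in the package, assuming roxygen2 is used.

```r
# Hypothetical helper: copy bomrang-specific attributes from `template`
# onto `result` and restore the class. Attribute names are illustrative.
restore_bomrang_attrs <- function(result, template,
                                  keep = c("station", "type", "origin")) {
  for (nm in keep) {
    attr(result, nm) <- attr(template, nm)
  }
  class(result) <- union("bomrang_tbl", class(result))
  result
}

bom <- structure(
  data.frame(Month = c(9, 10, 10), Max_temperature = c(20.1, 17.6, 14.6)),
  class = c("bomrang_tbl", "data.frame"),
  station = "023000", type = "Max", origin = "Historical"
)

# Base row subsetting stands in for a dplyr verb here.
october <- restore_bomrang_attrs(bom[bom$Month == 10, , drop = FALSE], bom)
attr(october, "station")

# In the package, a dplyr method could be a thin wrapper along these lines
# (not run here, since it needs dplyr and roxygen2):
#
#   #' @inherit dplyr::filter
#   #' @export
#   filter.bomrang_tbl <- function(.data, ...) {
#     out <- NextMethod()
#     restore_bomrang_attrs(out, .data)
#   }
```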

If you're comfortable with my implementation I'll start adding checks/tests.

(Sent by email in reply to Adam H. Sparks's comment of Wed, 26 Sep. 2018, 9:01 pm.)

adamhsparks commented 6 years ago

This relates to my last comment about data.table vs. data.frame. Looking at this again, is it worth just using the bomrang_tbl class across the package for consistency?

Maybe we should employ the same print methods for all of the returns in bomrang? Looking at the XML files for the forecasts there are metadata in them that we're currently ignoring, e.g.,

<identifier>IDN11001</identifier>
<issue-time-utc>2018-10-04T18:10:00Z</issue-time-utc>
<issue-time-local tz="EST">2018-10-05T04:10:00+10:00</issue-time-local>
<sent-time>2018-10-04T18:10:08Z</sent-time>
<expiry-time>2018-10-05T18:10:00Z</expiry-time>
<validity-bgn-time-local tz="EST">2018-10-05T00:00:00+10:00</validity-bgn-time-local>
<validity-end-time-local tz="EDT">2018-10-07T23:59:59+11:00</validity-end-time-local>
<next-routine-issue-time-utc>2018-10-05T06:05:00Z</next-routine-issue-time-utc>
<next-routine-issue-time-local tz="EST">2018-10-05T16:05:00+10:00</next-routine-issue-time-local>

Could be nice to include this information in headers and have a uniform print method for all items returned by bomrang.
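One way such metadata might be pulled out for a header is sketched below using the xml2 package. The `<forecast-metadata>` wrapper element is hypothetical and only gives the fragment a single root; the real forecast files have their own structure, so the XPath expressions would need adjusting.

```r
library(xml2)

# A cut-down copy of the fields quoted above, wrapped in a hypothetical
# root element so that it parses as a standalone document.
xml_txt <- paste0(
  "<forecast-metadata>",
  "<identifier>IDN11001</identifier>",
  "<issue-time-utc>2018-10-04T18:10:00Z</issue-time-utc>",
  "<expiry-time>2018-10-05T18:10:00Z</expiry-time>",
  "</forecast-metadata>"
)
doc <- read_xml(xml_txt)

# Extract each field as text; the result is a named character vector.
fields <- c("identifier", "issue-time-utc", "expiry-time")
meta <- vapply(
  fields,
  function(f) xml_text(xml_find_first(doc, paste0(".//", f))),
  character(1)
)
meta
```

A vector like `meta` could then be stored as an attribute on the returned data and shown by a common print method.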

adamhsparks commented 6 years ago

@jonocarroll, I'm good with this if you want to polish it and get it ready for me to accept.

I think we'll use a custom print function for all the data frames in bomrang so that it's uniform across functions and data returned. Some might benefit from the metadata header, others won't but the data.table print method is nice.

I've made some other changes to the package, fixing documentation, making sure it complies with CRAN policies and handling data downloads more gracefully. I will work on https://github.com/ropensci/bomrang/issues/88 shortly to get an image in when an error occurs so we can get a new release out with these fixes.

jonocarroll commented 6 years ago

You're in luck - I have another long flight coming up next week 🙃

My plan was to use this function as a test of how the classed print might look then roll it out to the other objects in the package. I'll clean up what I have and continue developing as best as I can. I did get stuck re-exporting the dplyr verbs... well, all except filter are fine being classed. I'll keep at it.

I think I've got tests in place and a few other things I haven't yet pushed because I broke filter, but I'll push these too. If you don't mind waiting a bit, this could be really very tidy.

adamhsparks commented 6 years ago

I don't mind waiting a bit. It'll take me a while to get the image stuff squared away.

jonocarroll commented 5 years ago

Right, I think this works okay now. I can't seem to test it from the airport lounge so apologies if it's actually broken. Take it for a spin and see what you think. I can then see what I need to do to generalise it to another response (please suggest which to try).

adamhsparks commented 5 years ago

It functioned nicely when I tested it yesterday.

Regarding implementing this with other BOM data: I was looking at the XML files for the ag bulletins, e.g. ftp://ftp2.bom.gov.au/anon/gen/fwo/IDN65176.xml. In the header printed in the console, we could include some of the information from the beginning of the XML file, which suggests looking at the information at the bottom of this page: http://www.bom.gov.au/cgi-bin/wrap_fwo.pl?IDN65176.html.

Otherwise, just using a standard print method for all data returned, even if no additional information is passed along in the header would be good.

jonocarroll commented 5 years ago

I've expanded the class to work with get_current_weather also, so you can see what it involves. So far not much apart from having the station info defined. Let me know what you think of this and I'll keep iterating.

[Image: screenshot from 2019-01-14 23-11-07]

jonocarroll commented 5 years ago

The failing tests are unrelated. I brought this branch up to date with develop and all my functions pass.

adamhsparks commented 5 years ago

I've updated the internal databases and fixed tests accordingly for the updated data.

However, line 303 of get_current_weather(), count = format(diff(range(x$local_date_time_full)), digits = 3), errors because x is not found, so I can't retrieve current weather: Error in diff(range(x$local_date_time_full)) : object 'x' not found.
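For context, here is a small sketch of what that expression computes once the object it refers to exists. The `obs_times` vector is made up, and it assumes the local_date_time_full column holds date-times (POSIXct).

```r
# Hypothetical observation times standing in for local_date_time_full.
obs_times <- as.POSIXct(
  c("2019-01-14 00:00:00", "2019-01-15 06:00:00", "2019-01-16 12:00:00"),
  tz = "UTC"
)

# range() gives the earliest and latest time, diff() their difference
# as a difftime, and format() turns that into a string for the header.
span <- diff(range(obs_times))
count <- format(span, digits = 3)
count
```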

I do like what I'm seeing though! Fantastic work!

jonocarroll commented 5 years ago

My bad - had a test value in (thought I tested but apparently not). I also repaired some of the other tests for get_current_weather which would have failed too. I think it's working for those two now.

jonocarroll commented 5 years ago

FYI I've bypassed the 'raw' output for get_current_weather which may not be ideal. Do you still want that option? I'm also using data.table to output by default, so do you want to keep that option?

adamhsparks commented 5 years ago

Yeah, not ideal to bypass the raw parameter. I think we need to make sure all the columns are returned as the proper class in all functions.
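One possible way to enforce column classes consistently is a shared specification applied to every result. The sketch below is only an illustration: `col_spec`, `coerce_columns()` and the chosen classes are assumptions, not bomrang code.

```r
# Hypothetical column specification: column name -> coercion function.
col_spec <- list(
  Station_number = as.character,
  Year = as.integer,
  Max_temperature = as.numeric
)

# Apply each coercion to the matching column; columns that are not in the
# specification are left untouched.
coerce_columns <- function(x, spec) {
  for (nm in intersect(names(spec), names(x))) {
    x[[nm]] <- spec[[nm]](x[[nm]])
  }
  x
}

raw <- data.frame(
  Station_number = 23000, Year = "2018", Max_temperature = "20.1",
  stringsAsFactors = FALSE
)
typed <- coerce_columns(raw, col_spec)
vapply(typed, function(col) class(col)[1], character(1))
```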

We can just keep data.table outputs for everything, I think. I/we just need to check and update documentation where appropriate.