eco4cast / neon4cast

A helper R package for the neon4cast challenge

draft noaa_stage1/2/3 functions #10

Closed cboettig closed 1 year ago

cboettig commented 1 year ago

First draft of helper functions (& documentation) to access stage 1/2/3 products. Currently only accesses parquet-based versions.

The functions do not collect() by default, but they try to offer some helpful messaging, since creating a connection can be slow. I've also tried to add some advice about using dplyr to filter and summarise data before calling collect() to import it. I'm not sure whether that advice will be helpful, or whether it is too cryptic for beginners or too obvious for experienced users.
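As a rough illustration of the filter-then-collect advice, here is a minimal sketch. It assumes an in-memory Arrow table as a stand-in for the remote connection (so it runs without network access); the column names are borrowed from the example output in this thread, and the values are made up.

```r
library(dplyr)
library(arrow)

# Hypothetical stand-in for the object returned by noaa_stage1();
# the real object points at remote parquet files, this one lives in memory.
toy <- arrow::arrow_table(data.frame(
  site_id   = c("LIRO", "LIRO", "LIRO", "BARC"),
  variable  = c("TMP", "TMP", "RH", "TMP"),
  ensemble  = c(1L, 2L, 1L, 1L),
  predicted = c(0.5, 1.5, 76.7, 20.0)
))

# Filter and summarise first, so only the small result is pulled into R.
tmp_mean <- toy |>
  filter(variable == "TMP") |>
  group_by(site_id) |>
  summarise(mean_predicted = mean(predicted)) |>
  collect() |>
  arrange(site_id)  # sort after collect(); aggregation order is not guaranteed
```

Here only the two summary rows are materialised as a tibble, rather than the full table.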

I have more docs for stage 1 than for the other two; help fleshing out the docs there would be great. I'll try to add unit tests.

Stage 1 and stage 2 default to filtering cycle = "00", since I think users can get especially confused by getting multiple cycles in the same data frame. Other than that one case, I've tried to leave filtering to the documentation and examples, since I think it's generally most powerful if users have the remote object and the option to run any dplyr filter/summarise steps directly themselves.

All functions should handle the logistics of setting and restoring the user's env vars gracefully.
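For readers unfamiliar with the set-and-restore pattern, here is a minimal base-R sketch of one way it can be done. The helper name with_temp_env() and the variable DEMO_REGION are hypothetical, chosen for illustration only; this is not necessarily how the functions in this PR do it.

```r
# Hypothetical helper: set env vars, evaluate `code`, then restore the
# previous state even if `code` throws an error.
with_temp_env <- function(vars, code) {
  # Remember current values; NA marks variables that were not set at all.
  old <- Sys.getenv(names(vars), unset = NA, names = TRUE)
  on.exit({
    to_unset <- names(old)[is.na(old)]
    to_reset <- old[!is.na(old)]
    if (length(to_unset) > 0) Sys.unsetenv(to_unset)
    if (length(to_reset) > 0) do.call(Sys.setenv, as.list(to_reset))
  }, add = TRUE)
  do.call(Sys.setenv, as.list(vars))
  force(code)  # lazily evaluated, so it runs with the temporary values
}

Sys.setenv(DEMO_REGION = "original")
inside <- with_temp_env(c(DEMO_REGION = "temporary"), Sys.getenv("DEMO_REGION"))
after  <- Sys.getenv("DEMO_REGION")
```

Using on.exit() means the restore step runs whether the wrapped code succeeds or fails, so the user's session is left as it was found.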

Here's an example run with messaging:

library(neon4cast)
weather <- noaa_stage1()
#> establishing connection to stage1 at data.ecoforecast.org ...
#> connected! Use dplyr functions to filter and summarise.
#> Then, use collect() to read result into R
# 5.7M rows of data:
weather |>
  dplyr::filter(start_date == "2022-04-01") |>
  dplyr::collect()
#> # A tibble: 5,786,316 × 13
#>    site_id predicted variable height        horizon ensemble start_time         
#>    <chr>       <dbl> <chr>    <chr>           <dbl>    <int> <dttm>             
#>  1 LIRO    95463.    PRES     surface             0        1 2022-04-01 18:00:00
#>  2 LIRO        0.625 TMP      2 m above gr…       0        1 2022-04-01 18:00:00
#>  3 LIRO       76.7   RH       2 m above gr…       0        1 2022-04-01 18:00:00

rqthomas commented 1 year ago

I updated the descriptions of stage 2 and stage 3. We now need to update the neon4cast-doc documentation.