ropensci / tidyhydat

An R package to import Water Survey of Canada hydrometric data and make it tidy
https://docs.ropensci.org/tidyhydat
Apache License 2.0

Recent data not in database #195

Closed alex-koiter closed 1 year ago

alex-koiter commented 1 year ago

Describe the bug: I updated the database (published on 2023-01-15), but 2021 data for the stations I am interested in is not available, even though it is available online and can be downloaded manually.

To Reproduce. Steps to reproduce the behavior:

max(hy_daily_flows(station_number = "05OF023")$Date)

https://wateroffice.ec.gc.ca/report/historical_e.html?stn=05OF023

Expected behavior: The downloaded database should include 2021 data.

Desktop (please complete the following information):
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS
tidyhydat_0.5.9

boshek commented 1 year ago

:wave: @alex-koiter

Have a look here: https://github.com/ropensci/tidyhydat/issues/193#issuecomment-1430749266

I am almost finished with this and ready to merge. You should be able to get at least some of the data you want. Plus you can be a beta tester for me :)

alex-koiter commented 1 year ago

I am not sure how https://github.com/ropensci/tidyhydat/issues/193#issuecomment-1430749266 relates to my issue. I did update 'tidyhydat', but it made no difference. I wonder if the issue is that the 2021 flow data for station "05OF023" is not in the database. Perhaps this is a Water Survey of Canada issue and not a 'tidyhydat' issue.
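One way to check that suspicion is to compare the release date of the local HYDAT copy with the last daily flow date it holds for the station. The following is only a sketch: it assumes tidyhydat is installed and a HYDAT database has already been fetched with download_hydat(), and `latest_hydat_flow()` is a hypothetical helper name, not a function in the package.

```r
# Hypothetical helper (not part of tidyhydat): put the release date of the
# local HYDAT copy next to the last daily flow date stored for one station.
latest_hydat_flow <- function(station_number) {
  # hy_version() returns a one-row tibble with Version and Date columns
  version <- tidyhydat::hy_version()
  # Validated daily flows held in the local HYDAT database
  flows <- tidyhydat::hy_daily_flows(station_number = station_number)
  data.frame(
    station_number = station_number,
    hydat_released = as.Date(version$Date),
    last_flow_date = max(flows$Date, na.rm = TRUE)
  )
}

# Usage (requires a local HYDAT database, so it is left commented out):
# latest_hydat_flow("05OF023")
```

If the last flow date falls well before the release date, the missing year was probably never in that HYDAT release, rather than being dropped by the package.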

boshek commented 1 year ago

My hope was that the web service would reach back further than it does. For example:

library(tidyhydat)

hy_stations("05OF023")
#>   Queried from version of HYDAT released on 2022-10-24
#>    Observations:                      1
#>    Jurisdictions: MB
#>    Station(s) returned:               1
#>    Stations requested but not returned: 
#>     All stations returned.
#> # A tibble: 1 × 15
#>   STATION_NUMBER STATI…¹ PROV_…² REGIO…³ HYD_S…⁴ SED_S…⁵ LATIT…⁶ LONGI…⁷ DRAIN…⁸
#>   <chr>          <chr>   <chr>     <dbl> <chr>   <chr>     <dbl>   <dbl>   <dbl>
#> 1 05OF023        SOUTH … MB            4 ACTIVE  <NA>       49.4   -98.3    34.5
#> # … with 6 more variables: DRAINAGE_AREA_EFFECT <dbl>, RHBN <lgl>,
#> #   REAL_TIME <lgl>, CONTRIBUTOR_ID <int>, OPERATOR_ID <int>, DATUM_ID <int>,
#> #   and abbreviated variable names ¹​STATION_NAME, ²​PROV_TERR_STATE_LOC,
#> #   ³​REGIONAL_OFFICE_ID, ⁴​HYD_STATUS, ⁵​SED_STATUS, ⁶​LATITUDE, ⁷​LONGITUDE,
#> #   ⁸​DRAINAGE_AREA_GROSS

print(param_id, n = nrow(param_id))
#> # A tibble: 42 × 7
#>    Parameter Code  Unit  Name_En                         Name_Fr Descr…¹ Descr…²
#>        <dbl> <chr> <chr> <chr>                           <chr>   <chr>   <chr>  
#>  1        46 HG    m     Water level (primary sensor)    Niveau… Height… Hauteu…
#>  2        16 HG2   m     Water level (secondary sensor,… Niveau… Height… Hauteu…
#>  3        11 HG22  m     Water level (secondary sensor)  Niveau… Height… Hauteu…
#>  4        52 HG3   m     Water level (tertiary sensor, … Niveau… Height… Hauteu…
#>  5        13 HG33  m     Water level (tertiary sensor)   Niveau… Height… Hauteu…
#>  6         3 HGD   m     Water level (daily mean)        Niveau… Provis… niveau…
#>  7        39 HGH   m     Water level (hourly mean)       Niveau… Provis… Niveau…
#>  8        14 HL    m     Elevation, natural lake         Élévat… Elevat… Élévat…
#>  9        42 HR    m     Elevation, lake or reservoir r… Élévat… Elevat… Élévat…
#> 10        17 PA    kPa   Atmospheric pressure            Pressi… Pressu… Pressi…
#> 11        18 PC    mm    Accumulated precipitation       Précip… Precip… Précip…
#> 12        19 PP    mm    Incremental precipitation       Précip… Precip… Précip…
#> 13        47 QR    m3/s  Discharge (primary sensor deri… Debit … Discha… Débit …
#> 14         7 QR2   m3/s  Discharge (secondary sensor de… Debit … Discha… Debit,…
#> 15        10 QR3   m3/s  Discharge (tertiary sensor der… Debit … Discha… Debit,…
#> 16         6 QRD   m3/s  Discharge (daily mean)          Débit … Provis… Débit …
#> 17        40 QRH   m3/s  Discharge (hourly mean)         Débit … Provis… Débit …
#> 18         8 QRS   m3/s  Discharge (sensor)              Debit … Discha… Débit …
#> 19        50 SD    cm    Snow depth                      Épaiss… Snow, … Neige,…
#> 20        51 SF    cm    Snow depth, new snowfall        Épaiss… Snow, … Neige,…
#> 21         1 TA    °C    Air temperature                 Tempér… Temper… Tempér…
#> 22         5 TW    °C    Water temperature               Tempér… Temper… Tempér…
#> 23        41 TW2   °C    Secondary water temperature     Tempér… Temper… Tempér…
#> 24        34 UD    deg   Wind direction                  Direct… Wind, … Vent, …
#> 25        35 US    m/s   Wind speed                      Vitess… Wind, … Vent, …
#> 26         2 VB    V     Battery voltage                 Tensio… Batter… Tensio…
#> 27        20 WB    RFU   Blue-green algae                Algues… Water,… Eau, a…
#> 28        21 WC    S     Conductance                     Conduc… Water,… Eau, c…
#> 29        26 WLA   mg/l  Total dissolved solids          Matièr… Water,… Eau, m…
#> 30        43 WNB   mg/l  Dissolved nitrate               Nitrat… Water,… Eau, n…
#> 31        22 WO    mg/l  Dissolved oxygen                Oxygèn… Water,… Eau, o…
#> 32        24 WP    <NA>  pH                              pH      Water,… Eau, pH
#> 33        25 WT    NTU   Turbidity                       Turbid… Water,… Eau, t…
#> 34         9 WV    m/s   Water velocity                  Vitess… Water,… Eau, v…
#> 35        37 WVX   m/s   Water velocity, x               Vitess… Water,… Eau, v…
#> 36        38 WVY   m/s   Water velocity, y               Vitess… Water,… Eau, v…
#> 37        23 WX    %     Oxygen saturation               Satura… Water,… Eau, o…
#> 38        49 WY    ug/L  Chlorophyll                     Chloro… Water,… Eau, c…
#> 39        28 XR    %     Relative humidity               Humidi… Humidi… Humidi…
#> 40        36 YE    m     Cell end                        Fin de… Distan… Distan…
#> 41         4 YI    °C    Internal equipment temperature  Tempér… Intern… Tempér…
#> 42        12 YP    psi   Tank pressure                   Pressi… Tank p… Pressi…
#> # … with abbreviated variable names ¹​Description_En, ²​Description_Fr

dat <- realtime_ws(
  "05OF023",
  parameters = 46,
  start_date = as.Date("2021-01-01"),
  end_date = Sys.Date()
)
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> All station successfully retrieved
#> All parameters successfully retrieved
range(dat$Date)
#> [1] "2022-03-01 06:05:00 UTC" "2023-03-09 17:25:00 UTC"

So the web service only goes back to 2022-03-01 for this station and does not cover 2021. As for HYDAT, the 2021 data is missing because ECCC has not validated it yet. Anything that isn't in HYDAT is not validated and is considered provisional. They have made progress in making some of that data accessible, but not everything. AFAIK that data isn't available programmatically right now.
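To make the resulting gap explicit, here is a rough sketch in base R. The helper `uncovered_period()` is hypothetical (not part of tidyhydat); it takes the last validated date and the first provisional timestamp and reports the whole days that neither source covers. The real-time start comes from the range(dat$Date) output above. The HYDAT end date of 2020-12-31 is an assumption for illustration only, since the thread does not show it.

```r
# Hypothetical helper: whole days between the end of the validated record
# and the start of the provisional record that neither source covers.
uncovered_period <- function(last_validated, first_provisional) {
  gap_start <- as.Date(last_validated) + 1
  gap_end <- as.Date(first_provisional) - 1
  if (gap_end < gap_start) {
    return(NULL)  # the two records meet or overlap, so there is no gap
  }
  data.frame(
    gap_start = gap_start,
    gap_end = gap_end,
    days = as.integer(gap_end - gap_start) + 1L
  )
}

gap <- uncovered_period(
  # Assumed last validated date in HYDAT (not shown in the thread)
  last_validated = as.Date("2020-12-31"),
  # Earliest timestamp returned by realtime_ws() in the output above
  first_provisional = as.POSIXct("2022-03-01 06:05:00", tz = "UTC")
)
print(gap)
```

With those inputs the uncovered period runs from 2021-01-01 to 2022-02-28. In practice the first argument would come from max(hy_daily_flows(...)$Date) and the second from min(dat$Date).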

alex-koiter commented 1 year ago

That makes sense. Thanks!