Recent data not in database #195

alex-koiter closed 1 year ago

alex-koiter commented 1 year ago

Describe the bug: I updated the database (published on 2023-01-15), but 2021 data for the stations I am interested in is not available, even though it is available online and can be downloaded manually.

To Reproduce Steps to reproduce the behavior: max(hy_daily_flows(station_number = "05OF023")$Date)

Expected behavior Downloaded database should include 2021 data

Desktop (please complete the following information):
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS
tidyhydat_0.5.9

boshek commented 1 year ago

Have a look here:

I am almost finished with this and ready to merge. You should be able to get at least some of the data you want. Plus you can be a beta tester for me :)

alex-koiter commented 1 year ago

I am not sure how this relates to my issue. I did update 'tidyhydat' but it made no difference. I wonder if the issue is that the 2021 flow data for station "05OF023" is not in the database. Perhaps this is a Water Survey of Canada issue and not a 'tidyhydat' issue.

boshek commented 1 year ago

My hope was that the web service would stretch back further than it does. For example:


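library(tidyhydat)

# Request web service data from the start of 2021 up to today
dat <- realtime_ws(
  station_number = "05OF023",  # assumed: the station from the original report
  parameters = 46,
  start_date = as.Date("2021-01-01"),
  end_date = Sys.Date()
)
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#>   dat <- vroom(...)
#>   problems(dat)
#> All station successfully retrieved
#> All parameters successfully retrieved

# Earliest and latest timestamps actually returned
range(dat$Date)
#> [1] "2022-03-01 06:05:00 UTC" "2023-03-09 17:25:00 UTC"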

So that means ECCC has not validated the data yet. Anything that isn't in HYDAT is not validated and is considered provisional. They have made gains in making some of that data accessible, but not everything. AFAIK that data isn't available programmatically right now.
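A rough way to check where the validated record stops, as a minimal sketch: the helpers `last_date()` and `ends_before_year()` are hypothetical names made up for illustration, and the toy data frame only stands in for what `hy_daily_flows()` returns (its values are invented). The commented lines at the end assume a local HYDAT download.

```r
# Last date present in a data frame that has a `Date` column
last_date <- function(df) {
  max(as.Date(df$Date), na.rm = TRUE)
}

# TRUE when the record ends before the given year begins
ends_before_year <- function(df, year) {
  last_date(df) < as.Date(paste0(year, "-01-01"))
}

# Toy data standing in for hy_daily_flows() output (values are made up)
toy <- data.frame(
  Date = seq(as.Date("2020-12-30"), as.Date("2020-12-31"), by = "day"),
  Value = c(1.2, 1.3)
)
ends_before_year(toy, 2021)

# With a local HYDAT download, the same check on the real record would be:
# library(tidyhydat)
# hy_version()  # which HYDAT release is installed
# ends_before_year(hy_daily_flows(station_number = "05OF023"), 2021)
```

If the check comes back `TRUE` for a station on a current HYDAT release, the missing year is most likely still provisional rather than a problem with the download.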

alex-koiter commented 1 year ago

That makes sense. Thanks!