bcgov / bcgov-r-geo-workshop

Some lessons & resources supporting an R geospatial workshop & hackathon

Fix character NA's in tidyhydat::hy_stations() #3

Closed boshek closed 4 years ago

boshek commented 4 years ago

R denotes missing values with the special value NA. However, NA can also appear as the literal character string "NA". In that case, R does not recognize it as a missing value. This is a problem if you want to find all missing values using the standard approach in R:

library(tidyhydat)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

hy_stations() %>% 
  filter(is.na(SED_STATUS))
#>   Queried from version of HYDAT released on 2019-07-17
#>    Observations:                      0
#>    Jurisdictions: 
#>    Station(s) returned:               0
#>    Stations requested but not returned: 
#>     All stations returned.
#> # A tibble: 0 x 15
#> # ... with 15 variables: STATION_NUMBER <chr>, STATION_NAME <chr>,
#> #   PROV_TERR_STATE_LOC <chr>, REGIONAL_OFFICE_ID <dbl>, HYD_STATUS <chr>,
#> #   SED_STATUS <chr>, LATITUDE <dbl>, LONGITUDE <dbl>,
#> #   DRAINAGE_AREA_GROSS <dbl>, DRAINAGE_AREA_EFFECT <dbl>, RHBN <lgl>,
#> #   REAL_TIME <lgl>, CONTRIBUTOR_ID <int>, OPERATOR_ID <int>,
#> #   DATUM_ID <int>
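
The empty result above follows from how is.na() treats strings. Here is a minimal base R sketch of the difference, using a made-up vector as a stand-in for a column like SED_STATUS (the values are invented for illustration, not taken from HYDAT):

```r
# Hypothetical values; only the third element is a true missing value
status <- c("ACTIVE", "NA", NA_character_, "DISCONTINUED")

# is.na() flags only the true NA, not the string "NA"
true_missing <- is.na(status)

# The string "NA" has to be matched explicitly; %in% avoids the NA
# that `==` would return when compared against a true missing value
string_missing <- status %in% "NA"

print(true_missing)
print(string_missing)
```

So a filter on is.na() silently skips every row where the value is the string "NA".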

To fix this issue, one would need to go into the source code for hy_stations() and convert the "NA" strings to true missing values. The issue in the tidyhydat repo is here: https://github.com/ropensci/tidyhydat/issues/125
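
Until that is fixed upstream, a workaround in user code might look like the sketch below. It is not the change made in tidyhydat itself. It assumes dplyr is installed and uses a small invented data frame (the `stations` object and its values are hypothetical) in place of the real hy_stations() output, so it runs without the HYDAT database:

```r
library(dplyr)

# Hypothetical stand-in for hy_stations() output; values are invented
stations <- data.frame(
  STATION_NUMBER = c("X001", "X002", "X003"),
  SED_STATUS = c("NA", "ACTIVE", "NA"),
  stringsAsFactors = FALSE
)

# Convert the literal string "NA" to a true missing value
fixed <- stations %>%
  mutate(SED_STATUS = na_if(SED_STATUS, "NA"))

# The standard approach now finds the missing rows
missing_rows <- fixed %>%
  filter(is.na(SED_STATUS))

print(missing_rows)
```

The same mutate() step could in principle be applied to the result of hy_stations(), for each character column that is affected.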