njtierney / ozroaddeaths

Access data from Australian Road Deaths Database
https://njtierney.github.io/ozroaddeaths/

Test if the data frame classes are consistent #32

Open njtierney opened 1 week ago

njtierney commented 1 week ago

E.g.,

classes <- \(x) purrr::map_chr(x, class)

classes(mtcars)
#>       mpg       cyl      disp        hp      drat        wt      qsec        vs 
#> "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" 
#>        am      gear      carb 
#> "numeric" "numeric" "numeric"

# expect_snapshot(classes(...))

Created on 2024-10-21 with reprex v2.1.1

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 Patched (2024-07-08 r86915)
#>  os       macOS Sonoma 14.5
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Melbourne
#>  date     2024-10-21
#>  pandoc   3.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  cli           3.6.3      2024-06-21 [1] CRAN (R 4.4.0)
#>  digest        0.6.36     2024-06-23 [1] CRAN (R 4.4.0)
#>  evaluate      0.24.0     2024-06-10 [1] CRAN (R 4.4.0)
#>  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
#>  fs            1.6.4.9000 2024-06-26 [1] Github (r-lib/fs@714990b)
#>  glue          1.7.0      2024-01-09 [1] CRAN (R 4.4.0)
#>  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.4.0)
#>  knitr         1.48       2024-07-07 [1] CRAN (R 4.4.0)
#>  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
#>  purrr         1.0.2      2023-08-10 [1] CRAN (R 4.4.0)
#>  reprex        2.1.1      2024-07-06 [1] CRAN (R 4.4.0)
#>  rlang         1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
#>  rmarkdown     2.27       2024-05-17 [1] CRAN (R 4.4.0)
#>  rstudioapi    0.16.0     2024-03-24 [1] CRAN (R 4.4.0)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.4.0)
#>  vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
#>  withr         3.0.1      2024-07-31 [1] CRAN (R 4.4.0)
#>  xfun          0.46       2024-07-18 [1] CRAN (R 4.4.0)
#>  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.4.0)
#>
#>  [1] /Users/nick/Library/R/arm64/4.4/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

johann-wagner commented 2 days ago

Unfortunately, the time and date_time columns have the classes hms/difftime and POSIXct/POSIXt, so class() returns two class names for each of them. purrr::map_chr() expects exactly one string per column, which results in the following error.

library(ozroaddeaths)

purrr::map_chr(oz_road_fatal_crash(), class)
#> Error in `purrr::map_chr()`:
#> ℹ In index: 6.
#> ℹ With name: time.
#> Caused by error:
#> ! Result must be length 1, not 2.

Created on 2024-10-27 with reprex v2.1.1
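
The same behaviour can be reproduced without the package data. The sketch below is only an illustration under stated assumptions: it uses base R, with vapply() standing in for purrr::map_chr(), and the data frame `toy` is hypothetical rather than taken from ozroaddeaths.

```r
# Base R only: vapply() stands in for purrr::map_chr() here (an assumption
# made so the sketch runs without extra packages).
now <- Sys.time()

# A date-time carries two class names, not one.
class(now)
length(class(now))

# Hypothetical toy data frame, not the package data.
toy <- data.frame(id = 1:2, stamp = rep(now, 2))

# vapply() insists on exactly one string per column, so the date-time
# column makes it fail, much like map_chr() fails on `time`.
result <- tryCatch(
  vapply(toy, class, character(1)),
  error = function(e) conditionMessage(e)
)
result
```

The error is captured as a message string in `result` rather than stopping the script.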

Thankfully, I've written a workaround that accounts for these cases:

library(ozroaddeaths)

classes <- function(data_frame) {
  purrr::map_chr(data_frame, function(x) {
    class_output <- class(x)
    if (length(class_output) > 1) {
      paste(class_output, collapse = "/")
    } else {
      class_output
    }
  })
}

classes(oz_road_fatal_crash())
#>          crash_id      n_fatalities             month              year 
#>         "numeric"         "numeric"         "numeric"         "numeric" 
#>           weekday              time             state        crash_type 
#>       "character"    "hms/difftime"       "character"       "character" 
#>               bus heavy_rigid_truck articulated_truck       speed_limit 
#>       "character"       "character"       "character"         "numeric" 
#>              date         date_time 
#>            "Date"  "POSIXct/POSIXt"

Created on 2024-10-27 with reprex v2.1.1
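
As a side note, paste() with collapse returns a length-1 input unchanged, so the length check is optional. A minimal base R sketch of the same idea follows; `classes_collapsed` and `toy` are hypothetical names used only for illustration, and vapply() is used in place of purrr::map_chr() to avoid extra dependencies.

```r
# Hypothetical helper: collapse every column's class vector into one string.
# paste(..., collapse = "/") leaves single-class columns as they are.
classes_collapsed <- function(data_frame) {
  vapply(
    data_frame,
    function(x) paste(class(x), collapse = "/"),
    character(1)
  )
}

# Hypothetical toy data frame, not the package data.
toy <- data.frame(
  n = c(1.5, 2),
  when = as.POSIXct(c("2024-10-27 09:00:00", "2024-10-27 10:00:00"), tz = "UTC"),
  day = as.Date(c("2024-10-27", "2024-10-28"))
)

# Named character vector with one entry per column.
classes_collapsed(toy)
```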

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14 ucrt)
#>  os       Windows 10 x64 (build 19045)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_Australia.utf8
#>  ctype    English_Australia.utf8
#>  tz       Australia/Sydney
#>  date     2024-10-27
#>  pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  bit            4.5.0      2024-09-20 [1] CRAN (R 4.4.1)
#>  bit64          4.5.2      2024-09-22 [1] CRAN (R 4.4.1)
#>  cli            3.6.3      2024-06-21 [1] CRAN (R 4.4.1)
#>  crayon         1.5.3      2024-06-20 [1] CRAN (R 4.4.1)
#>  curl           5.2.3      2024-09-20 [1] CRAN (R 4.4.1)
#>  digest         0.6.37     2024-08-19 [1] CRAN (R 4.4.1)
#>  dplyr          1.1.4      2023-11-17 [1] CRAN (R 4.4.1)
#>  evaluate       1.0.1      2024-10-10 [1] CRAN (R 4.4.1)
#>  fansi          1.0.6      2023-12-08 [1] CRAN (R 4.4.1)
#>  fastmap        1.2.0      2024-05-15 [1] CRAN (R 4.4.1)
#>  fs             1.6.4      2024-04-25 [1] CRAN (R 4.4.1)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.4.1)
#>  glue           1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
#>  hms            1.1.3      2023-03-21 [1] CRAN (R 4.4.1)
#>  htmltools      0.5.8.1    2024-04-04 [1] CRAN (R 4.4.1)
#>  janitor        2.2.0      2023-02-02 [1] CRAN (R 4.4.1)
#>  knitr          1.48       2024-07-07 [1] CRAN (R 4.4.1)
#>  lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.4.1)
#>  lubridate      1.9.3      2023-09-27 [1] CRAN (R 4.4.1)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.4.1)
#>  ozroaddeaths * 0.0.3.9000 2024-10-26 [1] local
#>  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.4.1)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.4.1)
#>  purrr          1.0.2      2023-08-10 [1] CRAN (R 4.4.1)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.4.1)
#>  readr          2.1.5      2024-01-10 [1] CRAN (R 4.4.1)
#>  reprex         2.1.1      2024-07-06 [1] CRAN (R 4.4.1)
#>  rlang          1.1.4      2024-06-04 [1] CRAN (R 4.4.1)
#>  rmarkdown      2.28       2024-08-17 [1] CRAN (R 4.4.1)
#>  rstudioapi     0.17.1     2024-10-22 [1] CRAN (R 4.4.1)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.4.1)
#>  snakecase      0.11.1     2023-08-27 [1] CRAN (R 4.4.1)
#>  stringi        1.8.4      2024-05-06 [1] CRAN (R 4.4.0)
#>  stringr        1.5.1      2023-11-14 [1] CRAN (R 4.4.1)
#>  tibble         3.2.1      2023-03-20 [1] CRAN (R 4.4.1)
#>  tidyselect     1.2.1      2024-03-11 [1] CRAN (R 4.4.1)
#>  timechange     0.3.0      2024-01-18 [1] CRAN (R 4.4.1)
#>  tzdb           0.4.0      2023-05-12 [1] CRAN (R 4.4.1)
#>  utf8           1.2.4      2023-10-22 [1] CRAN (R 4.4.1)
#>  vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.4.1)
#>  vroom          1.6.5      2023-12-05 [1] CRAN (R 4.4.1)
#>  withr          3.0.1      2024-07-31 [1] CRAN (R 4.4.1)
#>  xfun           0.48       2024-10-03 [1] CRAN (R 4.4.1)
#>  yaml           2.3.10     2024-07-26 [1] CRAN (R 4.4.1)
#>
#>  [1] C:/Users/Johan/AppData/Local/R/win-library/4.4
#>  [2] C:/Program Files/R/R-4.4.1/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

njtierney commented 1 day ago

Thanks for this, @johann-wagner!

That's a good workaround. I did something similar for visdat (https://github.com/ropensci/visdat/blob/23687c85712460eef23ea629870d76e9df7e490e/R/internals.R#L14). In this case we don't actually need the result to be a character vector, so we can get away with using a list, which is what I had here: https://github.com/njtierney/ozroaddeaths/pull/44
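
For illustration, the list-based idea might look roughly like the sketch below. This is not the code from the linked pull request: `classes_list` and `toy` are hypothetical names, and base R's lapply() is assumed in place of any purrr call.

```r
# Hypothetical helper: keep each column's full class vector in a list,
# so multi-class columns need no special handling.
classes_list <- function(data_frame) {
  lapply(data_frame, class)
}

# Hypothetical toy data frame, not the package data.
toy <- data.frame(
  n = c(1.5, 2),
  when = as.POSIXct(c("2024-10-27 09:00:00", "2024-10-27 10:00:00"), tz = "UTC")
)

# One list element per column; `when` keeps both of its class names.
str(classes_list(toy))

# Two data frames with the same layout give identical class lists,
# which is the property a consistency test would check.
same_classes <- identical(classes_list(toy), classes_list(toy[1, ]))
same_classes
```

A list keeps the class names separate, so nothing is lost to string collapsing, and the result still prints cleanly for a snapshot-style comparison.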

njtierney commented 1 day ago

But I see that your tests are actually passing and mine are failing because I added some code and forgot to add an extra curly brace. I think I forgot to sync up the branches, so I'll close my PR and accept yours :)