njtierney / naniar

Tidy data structures, summaries, and visualisations for missing data
http://naniar.njtierney.com/

`shadow_long` throws an error when gathering variables. #314

Closed siavash-babaei closed 1 year ago

siavash-babaei commented 1 year ago

`shadow_long()` throws an error when gathering variables: `pivot_longer()` cannot combine columns of different types.

# Library for dealing with missing values 
library(naniar)

# Load `oceanbuoys` data
data("oceanbuoys")

# Impute the mean value and track the imputations
ocean_imp_mean <- oceanbuoys |>
  naniar::nabular(only_miss = TRUE) |>
  naniar::impute_mean_all() |>
  naniar::add_label_shadow()

# Gather the imputed data: Throws an error
ocean_imp_mean |>
  naniar::shadow_long(humidity, air_temp_c)
#> Error in `tidyr::pivot_longer()`:
#> ! Can't combine `year` <double> and `any_missing` <character>.
#> Backtrace:
#>      ▆
#>   1. ├─naniar::shadow_long(ocean_imp_mean, humidity, air_temp_c)
#>   2. │ ├─tidyr::pivot_longer(...)
#>   3. │ └─tidyr:::pivot_longer.data.frame(...)
#>   4. │   └─tidyr::pivot_longer_spec(...)
#>   5. │     └─vctrs::vec_ptype_common(...)
#>   6. └─vctrs (local) `<fn>`()
#>   7.   └─vctrs::vec_default_ptype2(...)
#>   8.     ├─base::withRestarts(...)
#>   9.     │ └─base (local) withOneRestart(expr, restarts[[1L]])
#>  10.     │   └─base (local) doWithOneRestart(return(expr), restart)
#>  11.     └─vctrs::stop_incompatible_type(...)
#>  12.       └─vctrs:::stop_incompatible(...)
#>  13.         └─vctrs:::stop_vctrs(...)
#>  14.           └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))

# Gather the imputed data: Works like a charm 
ocean_imp_mean |>
  tidyr::pivot_longer(
    cols      = c("humidity", "air_temp_c"),
    names_to  = "variable",
    values_to = "value",
  ) |>
  tidyr::pivot_longer(
    cols      = c("humidity_NA", "air_temp_c_NA"),
    names_to  = "variable_NA",
    values_to = "value_NA",
  )
#> # A tibble: 2,944 × 12
#>     year latitude longit…¹ sea_t…² wind_ew wind_ns sea_t…³ any_m…⁴ varia…⁵ value
#>    <dbl>    <dbl>    <dbl>   <dbl>   <dbl>   <dbl> <fct>   <chr>   <chr>   <dbl>
#>  1  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… humidi…  79.6
#>  2  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… humidi…  79.6
#>  3  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… air_te…  27.1
#>  4  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… air_te…  27.1
#>  5  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… humidi…  75.8
#>  6  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… humidi…  75.8
#>  7  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… air_te…  27.0
#>  8  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… air_te…  27.0
#>  9  1997        0     -110    27.6   -5.10    4.5  !NA     Not Mi… humidi…  76.5
#> 10  1997        0     -110    27.6   -5.10    4.5  !NA     Not Mi… humidi…  76.5
#> # … with 2,934 more rows, 2 more variables: variable_NA <chr>, value_NA <fct>,
#> #   and abbreviated variable names ¹​longitude, ²​sea_temp_c, ³​sea_temp_c_NA,
#> #   ⁴​any_missing, ⁵​variable

njtierney commented 1 year ago

Thanks for posting this. This is a new error introduced in the latest release, and I can confirm that I get the same error:

# Library for dealing with missing values 
library(naniar)

# Load `oceanbuoys` data
data("oceanbuoys")

# Impute the mean value and track the imputations
ocean_imp_mean <- oceanbuoys |>
  naniar::nabular(only_miss = TRUE) |>
  naniar::impute_mean_all() |>
  naniar::add_label_shadow()

# Gather the imputed data: Throws an error
ocean_imp_mean |>
  naniar::shadow_long(humidity, air_temp_c)
#> Error in `tidyr::pivot_longer()`:
#> ! Can't combine `year` <double> and `any_missing` <character>.

#> Backtrace:
#>      ▆
#>   1. ├─naniar::shadow_long(ocean_imp_mean, humidity, air_temp_c)
#>   2. │ ├─tidyr::pivot_longer(...)
#>   3. │ └─tidyr:::pivot_longer.data.frame(...)
#>   4. │   └─tidyr::pivot_longer_spec(...)
#>   5. │     └─vctrs::vec_ptype_common(...)
#>   6. └─vctrs (local) `<fn>`()
#>   7.   └─vctrs::vec_default_ptype2(...)
#>   8.     ├─base::withRestarts(...)
#>   9.     │ └─base (local) withOneRestart(expr, restarts[[1L]])
#>  10.     │   └─base (local) doWithOneRestart(return(expr), restart)
#>  11.     └─vctrs::stop_incompatible_type(...)
#>  12.       └─vctrs:::stop_incompatible(...)
#>  13.         └─vctrs:::stop_vctrs(...)
#>  14.           └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))

Created on 2023-03-30 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31)
#>  os       macOS Ventura 13.2
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Hobart
#>  date     2023-03-30
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.0   2023-01-09 [1] CRAN (R 4.2.0)
#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.2.0)
#>  digest        0.6.31  2022-12-11 [1] CRAN (R 4.2.0)
#>  dplyr         1.1.0   2023-01-29 [1] CRAN (R 4.2.1)
#>  evaluate      0.20    2023-01-17 [1] CRAN (R 4.2.0)
#>  fansi         1.0.4   2023-01-22 [1] CRAN (R 4.2.0)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
#>  fs            1.6.1   2023-02-06 [1] CRAN (R 4.2.0)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2       3.4.1   2023-02-10 [1] CRAN (R 4.2.0)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  gtable        0.3.1   2022-09-01 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.4   2022-12-07 [1] CRAN (R 4.2.0)
#>  knitr         1.42    2023-01-25 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.2.0)
#>  naniar      * 1.0.0   2023-02-02 [1] CRAN (R 4.2.0)
#>  pillar        1.8.1   2022-08-19 [1] CRAN (R 4.2.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.2.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.2.0)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.2.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.20    2023-01-19 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.2.0)
#>  scales        1.2.1   2022-08-20 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  styler        1.9.0   2023-01-15 [1] CRAN (R 4.2.0)
#>  tibble        3.1.8   2022-07-22 [1] CRAN (R 4.2.0)
#>  tidyr         1.3.0   2023-01-24 [1] CRAN (R 4.2.0)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.2.0)
#>  utf8          1.2.3   2023-01-31 [1] CRAN (R 4.2.0)
#>  vctrs         0.5.2   2023-01-23 [1] CRAN (R 4.2.0)
#>  visdat        0.6.0   2023-02-02 [1] local
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun          0.37    2023-01-31 [1] CRAN (R 4.2.0)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.2.0)
#>
#>  [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

Thanks again for posting this; it will be fixed in an upcoming release.
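
For anyone else hitting this error: the message comes from tidyr (via vctrs) refusing to stack columns of different types into a single `value` column. Below is a minimal sketch that reproduces the same kind of failure without naniar. The toy tibble and the names `df`, `err`, and `long` are made up for illustration. It assumes tidyr 1.2.0 or later, where `values_transform` accepts a single function, and it is not a statement of how naniar itself resolves the problem internally.

```r
library(tidyr)

# Hypothetical toy data: one numeric and one character column, mirroring
# `year` <double> and `any_missing` <character> from the error message.
df <- tibble::tibble(year = 1997, any_missing = "Not Missing")

# Pivoting both columns into one `value` column fails, because a double
# and a character have no common type.
err <- tryCatch(
  pivot_longer(df, cols = everything()),
  error = function(e) e
)
inherits(err, "error")

# Coercing every value to character before combining lets the pivot succeed.
long <- pivot_longer(
  df,
  cols = everything(),
  values_transform = as.character
)
long
```

Coercing to character is the lossless choice here, since any column can be represented as text; converting back to numeric afterwards only makes sense when every gathered column was numeric to begin with.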

njtierney commented 1 year ago

Thank you again for this, @siavash-babaei!

This now works. By default, `shadow_long()` converts the `value` column to character, as that is the safest way to make sure the call always succeeds. Alternatively, you can supply your own coercion function through `fn_value_transform` to transform the `value` column. Here's an example:

library(naniar)
aq_shadow <- nabular(airquality)

shadow_long(aq_shadow)
#> # A tibble: 918 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <chr> <chr>       <fct>   
#>  1 Ozone    41    Ozone_NA    !NA     
#>  2 Solar.R  190   Solar.R_NA  !NA     
#>  3 Wind     7.4   Wind_NA     !NA     
#>  4 Temp     67    Temp_NA     !NA     
#>  5 Month    5     Month_NA    !NA     
#>  6 Day      1     Day_NA      !NA     
#>  7 Ozone    36    Ozone_NA    !NA     
#>  8 Solar.R  118   Solar.R_NA  !NA     
#>  9 Wind     8     Wind_NA     !NA     
#> 10 Temp     72    Temp_NA     !NA     
#> # ℹ 908 more rows

# then filter only on Ozone and Solar.R
shadow_long(aq_shadow, Ozone, Solar.R)
#> # A tibble: 306 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <chr> <chr>       <fct>   
#>  1 Ozone    41    Ozone_NA    !NA     
#>  2 Solar.R  190   Solar.R_NA  !NA     
#>  3 Ozone    36    Ozone_NA    !NA     
#>  4 Solar.R  118   Solar.R_NA  !NA     
#>  5 Ozone    12    Ozone_NA    !NA     
#>  6 Solar.R  149   Solar.R_NA  !NA     
#>  7 Ozone    18    Ozone_NA    !NA     
#>  8 Solar.R  313   Solar.R_NA  !NA     
#>  9 Ozone    <NA>  Ozone_NA    NA      
#> 10 Solar.R  <NA>  Solar.R_NA  NA      
#> # ℹ 296 more rows

# ensure `value` is numeric
shadow_long(aq_shadow, fn_value_transform = as.numeric)
#> # A tibble: 918 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <dbl> <chr>       <fct>   
#>  1 Ozone     41   Ozone_NA    !NA     
#>  2 Solar.R  190   Solar.R_NA  !NA     
#>  3 Wind       7.4 Wind_NA     !NA     
#>  4 Temp      67   Temp_NA     !NA     
#>  5 Month      5   Month_NA    !NA     
#>  6 Day        1   Day_NA      !NA     
#>  7 Ozone     36   Ozone_NA    !NA     
#>  8 Solar.R  118   Solar.R_NA  !NA     
#>  9 Wind       8   Wind_NA     !NA     
#> 10 Temp      72   Temp_NA     !NA     
#> # ℹ 908 more rows
shadow_long(aq_shadow, Ozone, Solar.R, fn_value_transform = as.numeric)
#> # A tibble: 306 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <dbl> <chr>       <fct>   
#>  1 Ozone       41 Ozone_NA    !NA     
#>  2 Solar.R    190 Solar.R_NA  !NA     
#>  3 Ozone       36 Ozone_NA    !NA     
#>  4 Solar.R    118 Solar.R_NA  !NA     
#>  5 Ozone       12 Ozone_NA    !NA     
#>  6 Solar.R    149 Solar.R_NA  !NA     
#>  7 Ozone       18 Ozone_NA    !NA     
#>  8 Solar.R    313 Solar.R_NA  !NA     
#>  9 Ozone       NA Ozone_NA    NA      
#> 10 Solar.R     NA Solar.R_NA  NA      
#> # ℹ 296 more rows

Created on 2023-05-01 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21)
#>  os       macOS Ventura 13.2
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Los_Angeles
#>  date     2023-05-01
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  cli           3.6.1      2023-03-23 [1] CRAN (R 4.3.0)
#>  colorspace    2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  digest        0.6.31     2022-12-11 [1] CRAN (R 4.3.0)
#>  dplyr         1.1.2      2023-04-20 [1] CRAN (R 4.3.0)
#>  evaluate      0.20       2023-01-17 [1] CRAN (R 4.3.0)
#>  fansi         1.0.4      2023-01-22 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.2      2023-04-25 [1] CRAN (R 4.3.0)
#>  generics      0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2       3.4.2      2023-04-03 [1] CRAN (R 4.3.0)
#>  glue          1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  gtable        0.3.3      2023-03-21 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.5      2023-03-23 [1] CRAN (R 4.3.0)
#>  knitr         1.42       2023-01-25 [1] CRAN (R 4.3.0)
#>  lifecycle     1.0.3      2022-10-07 [1] CRAN (R 4.3.0)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
#>  naniar      * 1.0.0.9000 2023-05-01 [1] local
#>  pillar        1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  purrr         1.0.1      2023-01-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.25.0     2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils       2.12.2     2022-11-11 [1] CRAN (R 4.3.0)
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  reprex        2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang         1.1.0      2023-03-14 [1] CRAN (R 4.3.0)
#>  rmarkdown     2.21       2023-03-26 [1] CRAN (R 4.3.0)
#>  rstudioapi    0.14       2022-08-22 [1] CRAN (R 4.3.0)
#>  scales        1.2.1      2022-08-20 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.9.1      2023-03-04 [1] CRAN (R 4.3.0)
#>  tibble        3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyr         1.3.0      2023-01-24 [1] CRAN (R 4.3.0)
#>  tidyselect    1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
#>  utf8          1.2.3      2023-01-31 [1] CRAN (R 4.3.0)
#>  vctrs         0.6.2      2023-04-19 [1] CRAN (R 4.3.0)
#>  visdat        0.6.0      2023-02-02 [1] CRAN (R 4.3.0)
#>  withr         2.5.0      2022-03-03 [1] CRAN (R 4.3.0)
#>  xfun          0.39       2023-04-20 [1] CRAN (R 4.3.0)
#>  yaml          2.3.7      2023-01-23 [1] CRAN (R 4.3.0)
#>
#>  [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```