lwheinsberg / dbGaPCheckup

Easy checks for data integrity and proper formatting of the dbGaP subject phenotype data set and data dictionary.
https://lwheinsberg.github.io/dbGaPCheckup/index.html

Apply `na_if()` to numeric columns #1

Closed. DavisVaughan closed this 1 year ago

DavisVaughan commented 1 year ago

This PR makes your package compatible with the next version of dplyr:

`na_if()` now uses vctrs. It previously allowed you, by accident, to apply it to an entire data frame at once, but that was never the intention: the function is meant to be applied to a single column at a time. Applying it to a data frame now errors, so I've updated your usage to apply it only to your numeric columns.

We plan to submit dplyr 1.1.0 on January 27th.

This should be compatible with both dev and CRAN dplyr. It would help us out if you could go ahead and send a patch version of your package to CRAN ahead of time! Thanks!
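The per-column usage described above can be sketched roughly as follows. This is an illustrative example rather than the actual patch: the toy data frame, its column names, and the `-9` missing-value code are all assumptions made for the sake of the demonstration.

```r
library(dplyr)

# Toy data; the column names and the -9 missing-value code are hypothetical.
df <- tibble(
  id  = c("a", "b", "c"),
  age = c(34, -9, 51),
  bmi = c(-9, 22.5, 27.1)
)

# Single column: replace -9 with NA in one vector.
age_na <- na_if(df$age, -9)

# Several columns: apply na_if() column by column, numeric columns only.
# The character column `id` is left untouched.
df_na <- df %>%
  mutate(across(where(is.numeric), ~ na_if(.x, -9)))
```

Restricting the selection with `where(is.numeric)` keeps the numeric code from ever being compared against a non-numeric column.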

lwheinsberg commented 1 year ago

Thank you so much for alerting us to this issue and for taking the time to put together the suggested patch. Our intention is for `na_if()` to be applied to all columns, not just the numeric ones as suggested. Therefore, to stay compatible with the dplyr update, we have fixed this issue throughout using:

dataset_na <- DS.data
for (value in na.omit(non.NA.missing.codes)) {
  dataset_na <- dataset_na %>%
    mutate(across(everything(), ~ na_if(.x, value)))
}

Our changes have been submitted to CRAN.
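For readers who want to try the loop above in isolation, here is a minimal, self-contained sketch. The contents of `DS.data` and `non.NA.missing.codes` below are hypothetical stand-ins, not values from the package. The toy data keeps every column numeric so that each code has the same type as the columns it is compared against; mixed-type data may need the codes converted per column, which this sketch does not attempt.

```r
library(dplyr)

# Hypothetical stand-ins for the objects named in the snippet above.
DS.data <- tibble(
  height = c(170, -9999, 165),
  weight = c(-4444, 70, -9999)
)
non.NA.missing.codes <- c(-9999, NA, -4444)

dataset_na <- DS.data
# na.omit() drops the NA entry, so only real missing-value codes are looped over.
for (value in na.omit(non.NA.missing.codes)) {
  # Each pass replaces one code with NA in every column.
  dataset_na <- dataset_na %>%
    mutate(across(everything(), ~ na_if(.x, value)))
}
```

After the loop, every cell that held one of the listed codes is `NA`, and all other values are unchanged.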