AlisonLanski / IPEDSuploadables

Producing uploadable txt files for IPEDS reporting, one submission at a time
https://alisonlanski.github.io/IPEDSuploadables/

tidyverse update breaks OM part D #84

Closed AlisonLanski closed 2 years ago

AlisonLanski commented 2 years ago

Reported by Calumet College of St. Joseph:

I received errors when running the Outcome Measures report with the sample data. Running produce_om_report() with part = "all" or part = "d" produces the following errors:

produce_om_report(om_students, part = "ALL", format = "uploadable")

Error in dplyr::mutate():
! Problem while computing `..1 = dplyr::across(dplyr::everything(), ~tidyr::replace_na(.x, 0))`.
Caused by error in across():
! Problem while computing column UNITID.
Caused by error in stop_vctrs():
! Can't convert replace to match type of data.
Run rlang::last_error() to see where the error occurred.

Here is the output of rlang::last_error()

<error/dplyr:::mutate_error>

Error in dplyr::mutate():
! Problem while computing `..1 = dplyr::across(dplyr::everything(), ~tidyr::replace_na(.x, 0))`.
Caused by error in across():
! Problem while computing column UNITID.
Caused by error in stop_vctrs():
! Can't convert replace to match type of data.

Backtrace:
  1. IPEDSuploadables::produce_om_report(...)
 13. tidyr:::replace_na.default(UNITID, 0)
 14. vctrs::vec_assign(data, missing, replace, x_arg = "data", value_arg = "replace")
 15. vctrs <fn>()
 16. vctrs::vec_default_cast(...)
 17. vctrs::stop_incompatible_cast(...)
 18. vctrs::stop_incompatible_type(...)
 19. vctrs:::stop_incompatible(...)
 20. vctrs:::stop_vctrs(...)

Run rlang::last_trace() to see the full context.

The output of rlang::last_trace()

rlang::last_trace()

<error/dplyr:::mutate_error>

Error in dplyr::mutate():
! Problem while computing `..1 = dplyr::across(dplyr::everything(), ~tidyr::replace_na(.x, 0))`.
Caused by error in across():
! Problem while computing column UNITID.
Caused by error in stop_vctrs():
! Can't convert replace to match type of data.

Backtrace:
  1. ├─IPEDSuploadables::produce_om_report(...)
  2. │ ├─IPEDSuploadables::write_report(...)
  3. │ └─IPEDSuploadables::make_om_part_D(df = students)
  4. │ └─... %>% ...
  5. ├─dplyr::transmute(...)
  6. ├─dplyr::mutate(...)
  7. ├─dplyr:::mutate.data.frame(...)
  8. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
  9. │ ├─base::withCallingHandlers(...)
 10. │ ├─base::withCallingHandlers(...)
 11. │ └─mask$eval_all_mutate(quo)
 12. ├─tidyr::replace_na(UNITID, 0)
 13. └─tidyr:::replace_na.default(UNITID, 0)
 14. └─vctrs::vec_assign(data, missing, replace, x_arg = "data", value_arg = "replace")
 15. └─vctrs <fn>()
 16. └─vctrs::vec_default_cast(...)
 17. └─vctrs::stop_incompatible_cast(...)
 18. └─vctrs::stop_incompatible_type(...)
 19. └─vctrs:::stop_incompatible(...)
 20. └─vctrs:::stop_vctrs(...)
 21. └─rlang::abort(message, class = c(class, "vctrs_error"), ...)

I have a new install of R (4.1.3 "One Push-Up") and RStudio (2022.02.1+461 "Prairie Trillium"), along with version 2.3.5 of your R package, on my MacBook Pro. Could a newer version of one of the libraries have introduced a breaking change? I am a relative novice in R, preferring to do most of my work in python/pandas instead.


Follow-up: I did a Google search on the issue earlier today and found this link: https://stackoverflow.com/questions/71227130/replace-na-of-numeric-columns-with-both-numeric-and-character-values-in-r

It looks like the problem could stem from a change to replace_na in tidyr version 1.2.0.

AlisonLanski commented 2 years ago

The new version of tidyr changed the behavior of replace_na: the replacement value now has to match the type of the column it fills, so covering several column types at once requires a list (different values for different column types). We can fix this by keeping the replace_na call as-is but running it only on the numeric columns instead of on all columns (which include character columns).
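A minimal sketch of the failure and of the numeric-only fix described above. The data frame `toy` and its column names are made up for illustration and are not the package's actual part D data; the sketch assumes tidyr 1.2.0 or later and dplyr 1.0.0 or later are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in data: one character column and two numeric columns,
# each containing a missing value
toy <- data.frame(
  NOTE = c("a", NA),
  COUNT_A = c(5, NA),
  COUNT_B = c(NA, 2)
)

# Filling every column with 0 fails under tidyr >= 1.2.0, because the
# double 0 cannot be cast to the character type of NOTE
failed <- tryCatch(
  {
    toy %>% mutate(across(everything(), ~ replace_na(.x, 0)))
    FALSE
  },
  error = function(e) TRUE
)

# Restricting the same call to numeric columns avoids the type mismatch
fixed <- toy %>%
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))

# Alternative: the data frame method takes a named list, one value per column
by_list <- toy %>%
  replace_na(list(COUNT_A = 0, COUNT_B = 0))
```

In both `fixed` and `by_list`, the numeric columns have their missing values replaced by 0 while the character column is left untouched.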

AlisonLanski commented 2 years ago

Checked the codebase -- no other uses of replace_na that need to be fixed

AlisonLanski commented 2 years ago

Picking the columns with across(where(is.numeric)) requires another import from tidyselect. So instead we'll just designate the columns by position, and set the column positions immediately before the call. This will need updating if the columns feeding in ever change.
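A rough sketch of the position-based approach, not the package's actual code. The data frame `toy`, its columns, and the variable `numeric_positions` are hypothetical; the real part D columns and positions may differ.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in data: character columns first, numeric columns last
toy <- data.frame(
  UNITID = c("999999", "999999"),
  PART = c("D", "D"),
  COUNT_A = c(5, NA),
  COUNT_B = c(NA, 2)
)

# Assumed positions of the numeric columns, set immediately before use;
# these must be updated by hand if the incoming columns ever change
numeric_positions <- 3:4

# Replace missing values with 0 only in the columns at those positions
fixed <- toy %>%
  mutate(across(all_of(numeric_positions), ~ replace_na(.x, 0)))
```

The trade-off is the one noted above: selecting by position avoids the extra import, but the positions are silently wrong if a column is added, removed, or reordered upstream.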