Closed kaitejohnson closed 1 month ago
@kaitejohnson, your reprex runs and provides the expected error on my computer as well.
I'm not sure what is causing the unexpected error, so haven't figured out how to make a reprex. Below is an example error I get, which goes away when the site and lab columns are added.
I tried your reprex with a data frame instead of a tibble as an input, but that was not the problem. I tried it with additional junk columns with NA values, but that also was not the problem. I tried adding the column names from my data set to your reprex, with NA values, but that also did not generate the error. I also tried adding a second row to the reprex, to see if it was a dimensionality thing, but that ran as expected also. So I'm not sure what in the larger data set is triggering it.
Thanks for sharing this! This is super helpful @akeyel
I suspect that the error is coming from the way that we are pasting together the error messaging, though I am very confused as to why it would differ from one dataset to another...
We use the checkmate package to check that required column names (passed to the check_names()
function as a vector of strings) are in the dataset. This either returns TRUE, or an error message with {}
that indicates the columns that are missing. Which is a really helpful error message, but the brackets are problematic!
When passing that to cli::cli_abort()
it uses glue
under the hood, and glue
treats brackets as special characters (it expects them to be replaced by variables). So we wrote this utility function, autoescape_brackets()
that replaces the {
with {{
which I believe should allow it to be rendered properly...
To see if this is the problem, I am going to try removing the checkmate error message in this PR. Would you mind reinstalling the package on this branch and then testing your dataset? Or you could send over a fake dataset and I can test myself! Not sure why the reprex isn't reproducing it!
Ok after seeing @dylanhmorris's suggestion in #98 I think the fix is probably just to get rid of the brackets.
I rewrote the autoescape_brackets()
function to replace brackets with an empty string. When I take the ww_data
and remove the site column:
ww_data <- wwinference::ww_data
ww_data_test <- ww_data |>dplyr::select(-site)
assert_req_ww_cols_present(ww_data_test, conc_col_name = "log_genome_copies_per_ml", lod_col_name = "log_lod")
Error: ! Required columns are missing from the wastewater data.
Names must include the elements 'log_genome_copies_per_ml','log_lod','date','site','lab','site_pop', but is missing elements 'site'
This is aimed at addressing #98. It does the following:
check_req_ww_cols_present
toassert_req_ww_cols_present
for consistency withcheckmate()
which uses assert to throw an error if condition isn't met (whereas check will return TRUE if met and a message (but not an errror) if not met).site
from the testing data. This returns the desired error message:Error: ! Required columns are missing from the wastewater data. Names must include the elements {'log_genome_copies_per_ml','lod','date','site','lab','site_pop'}, but is missing elements site}
Unfortunately, I wasn't able to recreate the problem originally described by @akeyel
Here's a reprex of what currently happens if I exclude site:
@akeyel Would you be able to share a reprex to generate the error you were seeing when using this function?